" Concentration of Chemicals of Concern

in the Soil Matrix for the Marcus Hook Refinery"

Term Project

CE394K GIS in Water Resources

By: Aiza Jose

aizah@mail.utexas.edu


                 (1)  Non-interpolative methods
                       1.1 Cell Method
                       1.2 Zone of Influence Method
                       1.3 The Thiessen Polygon
                 (2)  Interpolative methods 
                       2.1 Inverse Distance Weight Method
                       2.2 Spline Method


OBJECTIVE

To analyze the application of different tools in ArcView and Arc/Info to the contamination assessment at the Marcus Hook Refinery. The graphical representation of the levels of different Chemicals of Concern (COC) is analyzed under several approaches and is presented here as a complement to the work being carried out by a group of researchers at the University of Texas at Austin on the corrective action at the facility. To accomplish this objective, the areal average concentration of COC in the soil matrix is calculated and represented graphically in ArcView. Special emphasis is placed on the appropriateness of the different analytical tools.
 


BACKGROUND

Currently, a group of researchers from the University of Texas at Austin, directed by Dr. David Maidment and Dr. Robert Gilbert, is collaborating on a project for the remediation of contamination at the Marcus Hook Refinery. The group is supporting the corrective action project managers in developing a model that allows a risk-based decision-making approach to be used for the contaminated site.

The Marcus Hook Refinery is located in Marcus Hook, Pennsylvania, adjacent to the Delaware River. It was operated for many years by British Petroleum, but was sold a couple of years ago to Tosco. At the time of the sale, British Petroleum assumed full responsibility for the remediation of contamination at the site, which includes unregulated spills that occurred at the beginning of this century. Currently, the remediation of the site is being analyzed jointly by the BP Environmental Remediation office in Atlanta, GA, representatives of RUST environmental services in Pennsylvania, and researchers from the University of Texas at Austin.
 
An extensive database has already been developed for the assessment of the facility. The data relevant to this project include concentration levels of different chemicals in groundwater and in soil. Analyses have been performed on the groundwater contaminants, but the data for the chemicals sorbed to the soil have not yet been evaluated. Therefore, this term project focuses on contamination in the soil matrix.
 


METHODOLOGY AND MOTIVATION

To analyze the methods, the database available from British Petroleum is used. One compound was selected among those for which data are available: in conversations with the researchers involved in the project, naphthalene was identified as a chemical of concern, and it is the compound analyzed here. First, the methods for representing concentrations as surfaces are reviewed in order to choose those most likely to suit this study. Next, ArcView and Arc/Info are used to analyze naphthalene concentrations in the soil matrix using the British Petroleum data. Finally, conclusions are drawn from the analysis. This project seeks to provide a suitable tool and set of parameters for the analysis of other chemicals at the refinery.
 

Some relevant characteristics of naphthalene are presented here to explain why the presence of this contaminant is a concern from the standpoint of regulation and of environmental and human health protection.

Naphthalene is a polycyclic aromatic hydrocarbon with the formula C10H8. It and its substituted derivatives are commonly found in crude oil and oil products. It is considered a listed hazardous waste by EPA (U165 - Toxic Waste), and therefore it must be treated as such.
Naphthalene is not considered a carcinogen; however, it attacks different organs of the human body, including the nervous system. Exposure pathways are inhalation, ingestion, absorption, and eye and skin contact. Effects of the different exposures include harm to the kidneys and blood system; irritation of the skin, respiratory tract and eyes; and, on ingestion, nausea, vomiting, headaches, dizziness and gastrointestinal irritation.
 
The exposure limit values for naphthalene are:

Exposure limits                      Value in ppm (dissolved)    Value in ug/Kg (sorbed in soil)
THRESHOLD LIMIT VALUE (TLV/TWA)      10                          26,000
SHORT-TERM EXPOSURE LIMIT (STEL)     15                          39,000
PERMISSIBLE EXPOSURE LIMIT (PEL)     10                          26,000

Table 1. Exposure limits of naphthalene

 


SURFACE REPRESENTATION OF CONCENTRATIONS OF COC

When soil is sampled, the data are always collected at discrete points. It is impossible to sample every inch of soil at a location in order to estimate the contamination at the site. In a contamination assessment, however, it is desirable to know the continuous distribution of the contamination in the soil, rather than just the contamination at the points where the samples were taken. One approach to obtaining a continuous representation is therefore to delineate areas on the plane within which the concentration of the chemical is taken to be the same. The result is a set of areas, each with a chemical concentration different from those adjacent to it.

Sampling yields a set of points for which an analysis was made. The graphical representation of this analysis in ArcView is a point coverage. To estimate the continuous presence of the chemical, we need to transform the point coverage into areas, or into lines bounding areas. That is, a map layer is produced, in either vector or raster mode, that can be visualized, analyzed and modeled as a semi-continuous or continuous piecewise surface.
 

Transformation of points to surfaces

There are several ways to do this in GIS; they are summarized in Table 2.

    a) Contouring.
        Consists of joining points of equal value with isolines. To draw contour lines, we first
        need to interpolate the sample points to a grid. The result is not an area in itself, since
        no polygon is actually defined by this procedure. However, the spaces between contour lines
        can be interpreted as areas.
    b) Thiessen Polygons.
        Consists of generating polygons from irregularly distributed points. They are also
        known as Voronoi Polygons. This method results in a true representation of
        areas on the plane.

 

FROM / TO     Grid             Lines         Areas
Points        Interpolation    Contouring    Thiessen Polygons

Table 2. GIS transformations for the surface analysis.

 

Criteria for the selection of the method of transformation

The suitability of one method of conversion or another depends on two things: the type of point and the type of measurement.

    a) Type of point:
            Natural spatial objects.- For example, sinkholes in karst topography or lineament
            intersections. The appropriate transformation for this kind of spatial object is a Density
            Map.
            Samples from a continuous or discontinuous field.- These are artificially created points at
            which an analysis is performed. The appropriate transformation is a Digital
            Model of the Surface.

    b) Type of measurement:
            Categorical attribute.- For example, a number arbitrarily assigned to a rock type. For these
            cases the use of a Discontinuous Field is recommended.
            Sample attributes measured on ordinal, interval or ratio scales.- These are all the
            geophysical and geochemical properties that are numerically measured in an analysis. For
            this kind of measurement a Continuous or Semi-continuous Field is recommended.
 
For our study, the appropriate representation is a Digital Model of the Surface, because our samples were taken at artificially created points and we are measuring a geochemical property of the soil matrix. The analysis is therefore based on this kind of modeling. The other models are explained briefly.
 

A. Density Maps

As stated above, density maps are suitable for those points which represent the measurement of natural features. The goal is to create a digital model of a surface, but the characteristic measured during sampling is not the value of a thematic point attribute, but is the number of points per unit area. It is very similar to a frequency graph but in this case the distribution of the data is represented over a geographical plane.
 

B. Digital Model for Surface representation

There are two ways to model point samples: non-interpolative models and interpolative models. Both kinds of model, and different methods to develop them, are presented here.

(1) Non-interpolative methods
These methods are used when we can assume that one or more attributes of the point can be assigned to a polygon. The objective is to create polygons on the plane in which the samples were taken. Several non-interpolative methods can be used to accomplish this; the Cell Method, the Zone of Influence Method and the Thiessen Polygon are described here.
 

1.1 CELL METHOD

In the simplest case, each point is associated with a cell of a grid. Cells that contain no points are given null attributes, and the attributes of cells that contain more than one point are determined by aggregating attributes according to some rule, such as taking the mean, median, mode, maximum or minimum, depending on the value being analyzed. Each grid cell then forms an area carrying the attribute value of the point or points assigned to it. This concept is illustrated in Figure 1.

Figure 1. Non-interpolative Method 1
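The cell method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name cell_method, the cell size and the sample values are hypothetical and do not come from the Marcus Hook database, and the mean is used as the default aggregation rule.

```python
from collections import defaultdict
from statistics import mean


def cell_method(samples, cell_size, aggregate=mean):
    """Assign each sample to a square grid cell and aggregate values per cell.

    samples: iterable of (x, y, value) tuples.
    cell_size: edge length of a cell, in map units.
    aggregate: rule applied when a cell holds several samples (mean, max, ...).
    Returns {(col, row): value}. Cells without samples are simply absent,
    which plays the role of the null attribute.
    """
    cells = defaultdict(list)
    for x, y, value in samples:
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append(value)
    return {key: aggregate(values) for key, values in cells.items()}


# Hypothetical samples (x, y, concentration in ug/kg), for illustration only.
samples = [(5, 5, 100.0), (8, 2, 300.0), (15, 5, 50.0)]
grid = cell_method(samples, cell_size=10)
grid_max = cell_method(samples, cell_size=10, aggregate=max)
```

Passing a different aggregate function (median, max, min) changes the rule used for cells that contain more than one point, which is the choice the method leaves to the analyst.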

 

1.2  ZONE OF INFLUENCE

When the zone of influence of the property being measured can be estimated, a good areal approximation is achieved by drawing a circle around each sample with a diameter equal to the length of the zone of influence. A criterion must be chosen for handling circles that overlap. Figure 2 illustrates this method.

Fig. 2 Non-interpolative method 2
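As a minimal sketch of the zone of influence method, the following Python function returns the value assigned to a location from the samples whose circles cover it. The function name, the sample values and the overlap rule are assumptions made for illustration: here overlapping circles are resolved by taking the maximum, a conservative choice, but any other criterion can be passed in.

```python
import math


def zone_of_influence_value(x, y, samples, zone_length, combine=max):
    """Value at (x, y) from samples whose zone of influence covers the point.

    samples: iterable of (x, y, value) tuples.
    zone_length: diameter of the circle drawn around each sample.
    combine: rule for overlapping circles (assumed here; max by default).
    Returns None when no circle covers the point (null attribute).
    """
    radius = zone_length / 2.0
    hits = [value for sx, sy, value in samples
            if math.hypot(sx - x, sy - y) <= radius]
    return combine(hits) if hits else None


# Hypothetical samples (x, y, concentration in ug/kg), for illustration only.
samples = [(0.0, 0.0, 10.0), (3.0, 0.0, 30.0)]
inside_one = zone_of_influence_value(-1.0, 0.0, samples, zone_length=4.0)
inside_both = zone_of_influence_value(1.5, 0.0, samples, zone_length=4.0)
outside = zone_of_influence_value(10.0, 10.0, samples, zone_length=4.0)
```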

 

1.3 THE THIESSEN POLYGON

This method overcomes the problem of polygons having either no points or more than one point assigned. The concentration at every location is estimated by assuming that it equals the concentration at the nearest sampling location. Doing this for all locations at the site yields a continuous surface distribution of the chemical of concern, referred to here as an areal concentration. The Thiessen method does not produce regions with null attributes, because the plane is divided into n parts, where n is the number of samples. The size of the polygons varies inversely with the point or sample density.

The Thiessen Polygon is one of the methods used to calculate surface concentrations of COC in this project. It is therefore explained in some detail, so that it can be fully understood and applied to other modeling problems. First, let us consider how the polygons look after the Thiessen method is applied; later we will look at the mathematical relation for a deeper understanding.
 

    1.3.1  Graphical Representation of the Thiessen Polygon

As an example, consider a hypothetical site at which sampling is carried out. We have different sampling points (P1, P2, ..., P8) and the corresponding concentrations of the chemical of concern found at these points (C1, C2, ..., C8). This is the discrete distribution of the contamination at the site, and it is represented in Figure 3.
 

Fig. 3 Sampling points for chemical of Concern's concentration

 

Once the region nearest to each sampling point has been determined, an areal distribution is obtained. In our example, the site is divided into areas (A1, A2, ..., A8), with chemical concentrations equal to C1, C2, ..., C8, respectively. Figure 4 shows what the Thiessen polygons look like.
 

Fig. 4 Areal Concentration for Chemical of Concern.
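The rule behind the Thiessen polygons, that every location takes the concentration of its nearest sample, can be illustrated without building the polygons themselves. The sketch below evaluates that rule at the cell centers of a small grid. It is not the Arc/Info algorithm, which constructs the polygon boundaries directly; the function names, grid and sample values are hypothetical.

```python
import math


def nearest_sample_value(x, y, samples):
    """Return the concentration of the sample closest to (x, y)."""
    _, _, value = min(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    return value


def thiessen_raster(samples, xmin, ymin, ncols, nrows, cell):
    """Apply the nearest-sample rule at the center of every grid cell."""
    return [
        [nearest_sample_value(xmin + (c + 0.5) * cell,
                              ymin + (r + 0.5) * cell, samples)
         for c in range(ncols)]
        for r in range(nrows)
    ]


# Two hypothetical samples (x, y, concentration in ug/kg).
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
raster = thiessen_raster(samples, xmin=0.0, ymin=0.0, ncols=4, nrows=1, cell=2.5)
```

Every cell receives a value, which reflects the property noted above that the Thiessen method leaves no region with a null attribute.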

 

    1.3.2 Mathematical interpretation of the Thiessen Polygon.

The boundaries of the polygons are formed by tracing the perpendicular bisectors of the lines joining adjacent points in what can be called the sample network. A polygon is formed where the bisectors from different samples intersect, forming the nodes of the polygon. Consider a given territory from which five samples were taken. The first step consists of forming a network joining the five points. Next, a perpendicular line is traced bisecting each arc of the network. The bisectors intersect each other and form the polygons inside the area of analysis. The procedure is shown schematically in Figure 5.

Figure 5. (a) Network joining sample points. (b) Calculation of areas by tracing the perpendicular bisectors of the network's arcs.

 
With this method, an average concentration for the whole area of analysis can be found by giving a "weight" to each surface concentration. The appropriate weight is the area of each polygon. If there are n sample points indexed by j (P1, P2, P3, P4 in our example), Aj is the area of the polygon around point j (A1, A2, A3, A4), and Cj is the concentration recorded at sample point j, the areal average concentration (C) of the COC in the area of analysis is:

$$ C = \frac{\sum_{j=1}^{n} C_j A_j}{\sum_{j=1}^{n} A_j} $$

This approach is more useful in analyzing other parameters rather than concentration. The typical data that is sometimes analyzed using this approach is precipitation data, where a precipitation average can be found for a certain region.
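The areal average is an area-weighted mean: the sum of each concentration Cj times its polygon area Aj, divided by the total area. A minimal Python sketch follows; the function name and the numbers in the example are hypothetical and serve only to show the arithmetic.

```python
def areal_average(concentrations, areas):
    """Area-weighted mean: sum(Cj * Aj) / sum(Aj)."""
    if len(concentrations) != len(areas):
        raise ValueError("one area is needed per concentration")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(c * a for c, a in zip(concentrations, areas)) / total_area


# Hypothetical polygons: 10 ug/kg over 1 area unit and 20 ug/kg over 3 units.
average = areal_average([10.0, 20.0], [1.0, 3.0])
```

In this example the larger polygon dominates the result, (10*1 + 20*3) / 4 = 17.5, which is why large polygons in sparsely sampled regions can have a strong effect on the average.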

      1.3.3 Weakness of Thiessen Polygon

The biggest disadvantage of Thiessen Polygons is that polygon size varies inversely with point density. A region can therefore be assigned the attributes of a point that lies far beyond any reasonable zone of influence, simply because no other point is closer. One way to circumvent this is to combine the Thiessen method with the zone of influence method, tracing circles with a diameter equal to the estimated zone of influence.

Another weakness is that every time a new sample is taken at another location, a new network has to be traced and a new set of Thiessen Polygons has to be calculated.

 

(2) Interpolative methods.

The interpolation process involves estimating the value of the modeled variable at a succession of point locations, usually on a square lattice. This is the process of gridding, and the gridded values are then treated as the pixels of a raster image. Contour lines are threaded through the pixels or cells with identical values, and the data are represented in a way similar to a vector structure, in which the polygons are delimited by the contours. An alternative interpolative method is the representation of point data by triangular irregular networks (TIN).

Several interpolative methods have been proposed, among them Triangulation, Distance Weighting, Kriging and Spline methods. The interpolative methods examined in this project are the Inverse Distance Weighting Method and the Spline Method, which are available in ArcView and have been used in previous exercises throughout the course. Special emphasis is placed on the mathematical understanding of these methods and their parameters, in order to use them efficiently in ArcView.
 

2.1 INVERSE DISTANCE WEIGHTING METHOD (IDW)

This method calculates the grid values throughout the area of analysis as a weighted moving average of the points within some zone of influence. The weights are inversely proportional to the square of the distance from the center of the zone of influence. The zone of influence for this method is a circle.
 

    2.1.1 Graphical Representation of IDW

To understand how this method works, we have to define certain parameters:

    (1) Estimation point: is the point for which a value is going to be interpolated from the samples
          taken. The estimation point is always located in the center of the zone of influence and it is
          designated by the subscript "o".
    (2) Sample points: are the actual measurements that we have for the study. We have a total of "i"
          samples in the analysis.
    (3) Surface height: is the value to be assigned to the estimation point and it is designated by Co.
    (4) Neighbors: they are the sample points which fall within the zone of influence. There will be
          a total of "j" neighbors.

The relation between these parameters is illustrated in Figure 6.

Fig. 6 Parameters for IDW

Each sample point has a value (chemical concentration) found during sampling, designated Ci, where i indexes the samples taken for the analysis, and a location on the plane at a certain distance from the estimation point (dio). The value to be calculated is that of the estimation point (Co); it is determined from the values of the samples within its zone of influence (the neighbors, Cj) and the distances of these points from the estimation point (djo).
 
 

    2.1.2 Mathematical Representation of IDW

The mathematical interpretation of IDW is that the value assigned to an estimation point is determined by the values of its neighbors, each weighted by the inverse of the distance between the estimation point and that neighbor, raised to the second power. The relation is expressed in the formula below. The summation in the denominator simply normalizes the function, so that Co is a weighted average.

$$ C_o = \frac{\sum_{j} W_j C_j}{\sum_{j} W_j} $$

 Where Wj = 1/(djo)^2

And    Co : Value calculated for the estimation point
          djo: Distance from the estimation point to neighbor j.
          Wj: Weight given to each sample, equal to the inverse of the square of the
                distance between the estimation point and neighbor j.
           The subscript o refers to the estimation point.
           The subscript j refers to the sampling points falling within the zone of influence (neighbors)
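The IDW relation, Co equal to the sum of Wj*Cj divided by the sum of Wj with Wj = 1/(djo)^power, can be sketched in Python as follows. This is an illustrative implementation, not the ArcView one: the function name and arguments are hypothetical, the neighbors can be limited either by count or by radius, and an estimation point that coincides with a sample is assumed to take that sample's value, since its weight would otherwise be infinite.

```python
import math


def idw(x0, y0, samples, power=2.0, n_neighbors=None, radius=None):
    """Inverse distance weighted estimate Co at the point (x0, y0).

    samples: iterable of (x, y, concentration) tuples.
    power: exponent applied to the distance (2 gives Wj = 1 / djo**2).
    n_neighbors: keep only this many nearest samples, if given.
    radius: keep only samples within this distance, if given.
    Returns None when no neighbor falls inside the zone of influence.
    """
    neighbors = sorted(
        ((math.hypot(x - x0, y - y0), c) for x, y, c in samples),
        key=lambda pair: pair[0],
    )
    if radius is not None:
        neighbors = [p for p in neighbors if p[0] <= radius]
    if n_neighbors is not None:
        neighbors = neighbors[:n_neighbors]
    if not neighbors:
        return None
    if neighbors[0][0] == 0.0:
        # Assumption: an exact hit on a sample returns the sample value.
        return neighbors[0][1]
    weights = [1.0 / d ** power for d, _ in neighbors]
    weighted_sum = sum(w * c for w, (_, c) in zip(weights, neighbors))
    return weighted_sum / sum(weights)


# Hypothetical samples (x, y, concentration in ug/kg).
samples = [(1.0, 0.0, 10.0), (-2.0, 0.0, 40.0)]
estimate = idw(0.0, 0.0, samples)                   # both neighbors, power 2
nearest_only = idw(0.0, 0.0, samples, n_neighbors=1)
```

With the two samples above, the weights are 1 and 0.25, so the estimate is (10*1 + 40*0.25) / 1.25 = 16 ug/kg, closer to the nearer sample than a plain mean of 25 would be.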

 
    2.1.3 Definition of parameters for IDW in Arc View.

All parameters are defined when applying this function in Arc View. Looking at the dialog box appearing when choosing to use this function in Arc View (Figure 7) we can know how to use it properly.

Figure 7. IDW Parameters

    (1) Z Value Field: the value to be analyzed. In our case this is the
         concentration of the chemical. Scroll down this box to find the name of the
         variable.
    (2) Next, you can choose to specify either a number of neighbors or a fixed radius.
         This gives you some flexibility in defining the zone of influence: you can either give it
         a specific size or decide how many sample points to include in it. When
         radius is chosen, the Number of Neighbors box in the dialog changes to Radius, where
         you can type the value of the radius in map units.
    (3) Power: ArcView lets you raise the inverse distance to a power that
         you choose. The greater the power, the faster the weighting function
         decays; in other words, the less effect distant neighbors have on the value of Co.
    (4) Barriers: ArcView lets you define a boundary for the analysis. The
         boundary could be any other polygon in your coverage.
 

    2.1.4 Strengths and weakness of the IDW method.

The strengths and weaknesses of IDW both stem from the parameters involved in defining the model. These parameters give a great deal of flexibility, but that flexibility becomes a weakness when the behavior of the variable being measured is not well understood and the parameters therefore cannot be assigned properly. When it is uncertain which parameters to use, the best course is to try different values and judge which result is most likely to reflect what is actually occurring. However, a deep understanding of the variable being measured and of the factors affecting it is always desirable, to reduce the uncertainty in the analysis.

2.2 SPLINE METHOD
 
The Spline Method uses a polynomial expression for an analytic surface that passes through all the sample points. Its objective is to interpolate a value for each cell of the raster model so as to produce a surface whose total curvature is minimized. In other words, it seeks to minimize the mean squared sum of second derivatives. The general idea is to fit the surface through nearby sample points in such a way that changes in slope are kept as small as possible. The overall result of the interpolation is a surface that, rather than bending sharply, flexes in a wide curve. This method was originally developed for interpolating wing deflections in aircraft design.
 

   2.2.1 Mathematical Representation of the Spline Method

For N data points, N+3 simultaneous equations are set up. This system establishes any linear trend in the data and builds the minimum-curvature surface (minimizing the second derivatives), which is added to the trend. The following setup represents the simultaneous equations for four data points used to generate a spline surface.

Here C(Pi - Pj) is a function of the distance between points i and j in the data set, for all i and all j in the set. This function is usually expressed as d^2 * log(d), where d is the distance in the x, y plane between points i and j. The unknown values ai and bj are obtained by solving the system of equations. These are the coefficients of the surface for an arbitrary location, and the sum of the ai is equal to zero. For any interpolation point X(x,y) the interpolated value is:

F(x,y) = b0 + b1x + b2y + a1C(P1-X) + a2C(P2-X) + a3C(P3-X) + a4C(P4-X)
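To make the N+3 system concrete, the sketch below builds and solves it in plain Python for a thin-plate spline with C(d) = d^2 * log(d). Several points are assumptions rather than statements from the text: the first N equations require the surface to pass through each sample; the three extra equations are taken to be the usual side conditions, namely that the sums of ai, ai*xi and ai*yi are all zero (the text states only the first of these); and C(0) is set to 0. The function names and the four sample points are hypothetical, and this is not the ArcView implementation, which also offers the Regularized and Tension variants.

```python
import math


def basis(d):
    """C(d) = d^2 * log(d), with C(0) = 0 taken as the limiting value."""
    return 0.0 if d == 0.0 else d * d * math.log(d)


def solve(matrix, rhs):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_spline(points, values):
    """Return (a, b): N weights ai and the trend coefficients b0, b1, b2."""
    n = len(points)
    system = [[0.0] * (n + 3) for _ in range(n + 3)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            system[i][j] = basis(math.hypot(xi - xj, yi - yj))
        system[i][n:n + 3] = [1.0, xi, yi]      # linear trend terms
        # Assumed side conditions: sum(ai) = sum(ai*xi) = sum(ai*yi) = 0.
        system[n][i], system[n + 1][i], system[n + 2][i] = 1.0, xi, yi
    coeffs = solve(system, list(values) + [0.0, 0.0, 0.0])
    return coeffs[:n], coeffs[n:]


def evaluate(x, y, points, a, b):
    """F(x, y) = b0 + b1*x + b2*y + sum of ai * C(Pi - X)."""
    trend = b[0] + b[1] * x + b[2] * y
    return trend + sum(ai * basis(math.hypot(x - px, y - py))
                       for ai, (px, py) in zip(a, points))


# Four hypothetical samples at the corners of a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 5.0]
a, b = fit_spline(points, values)
center = evaluate(0.5, 0.5, points, a, b)
```

The fitted surface reproduces each sample value exactly, which is the first condition of the method; the smoothness condition is what determines the values in between.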

 

    2.2.2 Parameters for the Spline Method in ArcView

The Spline Method has three parameters (Figure 8): Number of Points, Type of Surface and Weight. The Number of Points parameter is used for computational purposes. The entire space of the output grid is divided into blocks or regions of equal size. The number of regions is determined by dividing the total number of points (the number of samples) by the value specified for this parameter.

The Type of surface interpolation by the spline method can be Regularized or Tension. Generally speaking, the Regularized type produces smoother surfaces than the Tension type. The Regularized type incorporates the third derivative into the minimization criterion, ensuring a smooth surface together with smooth first-derivative surfaces. For the Regularized type, the Weight parameter specifies the weight attached to the third-derivative terms during minimization. Values between 0 and 0.5 for Weight are suitable for the Regularized type; the higher the number, the smoother the surface.
The Tension type of interpolation modifies the minimization criterion so that first-derivative terms are incorporated. The Weight parameter for the Tension type specifies the weight attached to the first-derivative terms during minimization. A weight of 0 results in thin-plate spline interpolation. Large values of Weight in the Tension type produce coarser surfaces, but surfaces that conform closely to the sample points. Typical values of Weight for the Tension type are 0, 1, 5 and 10.

Figure 8. Spline Method. Types and Parameters

    2.2.3 Strengths and weaknesses of the Spline method.

This method can be applied to most data sets, but difficulties arise if the range of the values being interpolated is large. This can be explained by considering the two basic conditions the method must satisfy. Remember that the interpolated surface must pass through all the points in the data set. So, if there is great variation among those points, the second condition (a smooth, minimum-curvature surface) is almost impossible to satisfy.

The strength of the method is that it offers greater flexibility than the other interpolation methods discussed in this paper. However, a deep understanding of the interpolation method is required to take advantage of it.
 
 



 

PROCEDURE

 
A. Available data

B. Requirements

To run this procedure, you need the following software: ArcView version 3.0, Microsoft Excel 97, Microsoft Access 97, 32-bit ODBC, Import71, and Arc/Info on Unix. The files required for this exercise can be obtained from the LRC server at lrc/class/maidment/griswr/riskmap, or via anonymous ftp from ftp.crwr.utexas.edu/pub/gisclass/riskmap. You need the following files:

The file bound.e00 is in export format and needs to be imported with the Import71 utility on the PCs or with the Arc import command on the Unix machines.

C. Selection of the data

In conversations with the researchers involved in the British Petroleum remediation project, it was determined that for soil analysis BTEX can be defined as COC, because they are found in high concentrations in the area of analysis. This investigation analyzes naphthalene, with the objective of determining which interpolation tool is suitable for the Marcus Hook Refinery data.
 

D. Creating Queries in Access and exporting to Excel.

The data were selected from the extensive British Petroleum database bp-mhr.mdb by means of queries in Access. The general procedure for making queries in Access was described in detail in Exercise 7: Mapping Environmental Data Stored in Microsoft Access. The criteria used for the specific case of this project are described here:

Now you have the data required to analyze naphthalene concentrations in the soil matrix at the Marcus Hook Refinery. However, the values are still expressed in different units, which is why we exported them to Excel. There we will make the corrections and then reimport the table into the Access database.
 

E. Making units correction in Excel.

F. Importing the new table to Access and creating the final query.

Now that the units of measurement are consistent, we need to import this new table into Access. To do this:

What remains is to create the query that generates the table to be read by the Script.

 G. Using new data in ArcView to create a Point Coverage

We are going to use the Script file named cocvalue.avl to create a Point Coverage locating the places where the concentration measurements were made. The Script file was developed by Andrew Romanek. The Script "reads" the easting and northing locations of our samples from the query created in the Access database to create a point coverage. This coverage contains a table with the field VALUE, which represents the concentration of naphthalene in ug/kg in the soil matrix and is the field that will be analyzed.

H. Making Surface Representation of COC Concentrations.

As mentioned in the theory on transformations from points to areas, a better evaluation and risk assessment of the site can be made if, instead of points, we have a graphical representation of the areal spread of the contaminant. For this project I have chosen the Thiessen Method as the non-interpolative method, and the Inverse Distance Weighting and Spline methods as the interpolative methods. Comparisons are made to choose the best parameters for each method, and the procedures are explained next.
 
Non-interpolative - The Thiessen Method
This part of the procedure has to be run in Arc/Info. If you are using a PC you will have to transfer the coverage naphthalene created in step G to a UNIX machine where Arc/Info can be used. You can do this using ftp (make sure you transfer the three files naphthalene.shp, naphthalene.shx and naphthalene.dbf).
 

                    Arc: shapearc naphthalene nap

           We are converting the shapefile naphthalene created during step G to a coverage
           called nap that can be read in Arc/Info.
 

                    Arc: thiessen nap nappoly
                    Loading points from coverage nap...
                    Triangulating...
                    Creating Thiessen Structure...
                    Constructing arc/polygon Topology...
                    Generating polygon attributes...

        We have just created the Thiessen Polygon for our analysis
 

                     Arc: export cover nappoly nappoly

           We just created  nappoly.e00 which can be easily transferred from the UNIX
            machine to the PC by ftp. Do this in order to proceed with the exercise.
 

 

 Figure 9. Import 71 Utility for transferring *.e00 files

Interpolative method - Inverse Distance Weighting Method.

 Figure 10. Creating a color palette for surfaces in the Legend Editor

Spline Method

 

I. Statistical analysis.

From ArcView we can analyze the statistical behavior of naphthalene contamination according to the sample points analyzed during this procedure.

 Figure 11. Statistical analysis for concentrations of naphthalene.

 

This is the end of the procedure. In the next section, the graphical representations created while following it are displayed and analyzed.


RESULTS AND CONCLUSIONS

Thiessen Method

In figure 12 we can see the display of the polygons formed during the Thiessen Method.
 

Figure 12. Thiessen Polygons for the concentration of Naphthalene

The darker the color, the greater the concentration and, therefore, the greater the concern. We can see "hot spots" distributed near the center of the facility. This area corresponds to storage tanks where relatively high concentrations of naphthalene were found during sampling. The Thiessen Method gives a good representation, but where the data are scattered it produces polygons covering large areas, which are poor representations of the spread of the contamination. We can see small polygons in the north-east of the facility which are probably good representations. The data were quite scattered in the north-west, however, giving big polygons that may not be as representative as the small ones. Even so, it is a quite useful method for estimating the areas of concern at the facility.

To understand the following analysis, it is important to understand what the colors in the maps mean. Therefore a single legend was used for all the maps; it is presented in Figure 13.

 Figure 13. Color palette used for the analysis of the IDW and Spline Method.

Note that all the values are expressed in ug/Kg, and keep in mind that the naphthalene concentrations considered of concern are those above 26,000 ug/Kg. Also note that white does not necessarily mean zero concentration; it can be any concentration below 5,000 ug/Kg.

Inverse Distance Weighting Method.

First, let us examine the effect of using different values for the Power parameter in the IDW method (Figure 14).

 Figure 14. IDW results varying the power parameter

 

We can see that as we increase the power parameter, we obtain greater concentrations in the area. In fact, if the power is increased to 4, the whole area is covered in red, that is, in quite high concentrations. Why is that? Remember that the greater the power, the faster the weighting function decays; in other words, the less effect distant neighbors have on the value of Co. The point of maximum concentration in the maps (the red area in the two at the left) therefore receives higher concentrations, because only the nearby points have a significant influence, and these nearby points have quite high concentrations. To be conservative, the author suggests using a power of 2 for this analysis.
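The effect of the power parameter can be checked numerically with a small Python sketch. The three samples below are hypothetical, not taken from the refinery data: one high-concentration sample lies close to the estimation point and two low-concentration samples lie farther away. The function name is also an assumption made for illustration.

```python
import math


def idw_estimate(x0, y0, samples, power):
    """IDW estimate at (x0, y0) using all samples and the given power."""
    weighted = [(1.0 / math.hypot(x - x0, y - y0) ** power, c)
                for x, y, c in samples]
    return sum(w * c for w, c in weighted) / sum(w for w, _ in weighted)


# Hypothetical samples (x, y, concentration in ug/kg): a nearby hot spot
# and two distant samples with low concentrations.
samples = [(1.0, 0.0, 50000.0), (4.0, 0.0, 2000.0), (0.0, 5.0, 1000.0)]
estimates = {p: idw_estimate(0.0, 0.0, samples, p) for p in (1, 2, 4)}
```

As the power grows, the estimate moves toward the value of the nearest sample, because the weights of the distant samples decay faster. Near a hot spot this raises the interpolated concentration, which is consistent with the behavior observed in Figure 14.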

Now, let's examine the effect of using a different Number of Neighbors for the IDW Method (Figure 15).

 Figure 15. IDW results varying the number of neighbors parameter

We can see that as we decrease the number of neighbors we get higher concentrations. The reason is much the same as for varying the power: as the number of neighbors decreases, fewer sample points have any influence on each cell of the interpolated grid. Therefore we again find high concentrations where the hot spots were measured (around the red area). Our study has 97 data points, but the large spread of these points was taken into account when choosing the number of neighbors. Around any given area there are about 15 relatively near neighbors; all the rest are far away. Therefore, in the author's opinion, between 12 and 20 neighbors should be considered for the analysis.
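The effect of the number of neighbors can be sketched in the same way. The function below is an illustrative inverse-distance estimator limited to the nearest samples; it is not the Arc View implementation, and the sample coordinates and concentrations are hypothetical.

```python
import math

def idw(x, y, samples, power=2.0, neighbors=12):
    """Estimate the value at (x, y) from the `neighbors` nearest samples."""
    ranked = sorted(
        (math.hypot(sx - x, sy - y), value) for sx, sy, value in samples
    )[:neighbors]
    # A cell that coincides with a sample takes that sample's value.
    if ranked[0][0] == 0.0:
        return ranked[0][1]
    weights = [d ** -power for d, _ in ranked]
    return sum(w * v for w, (_, v) in zip(weights, ranked)) / sum(weights)

# Hypothetical samples: a hot cluster near the origin, low values far away.
samples = [
    (10.0, 0.0, 300000.0),
    (0.0, 15.0, 150000.0),
    (-20.0, 0.0, 90000.0),
    (500.0, 0.0, 2000.0),
    (0.0, 600.0, 1000.0),
    (-700.0, 0.0, 500.0),
]

# Only the three hot samples contribute to the first estimate.
few = idw(0.0, 0.0, samples, power=2.0, neighbors=3)

# The distant low samples also contribute and pull the second estimate down.
many = idw(0.0, 0.0, samples, power=2.0, neighbors=6)
```

With a power of 2 the distant samples carry very little weight, so the two estimates differ only slightly here; the difference becomes larger with a lower power or with less distant low samples.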

We can see that all the figures give fairly consistent results. We find "hot spots" around the same areas found during the Thiessen analysis, but with the IDW method the areas have shapes that are more likely to occur in reality. Therefore, the IDW Method is considered a suitable tool for the analysis of the Marcus Hook Refinery.

Spline Method.

In Figure 16 we can compare the influence of different values of the Number of Points parameter for the Regularized type.

 Figure 16. Spline-Regularized results varying the number of points parameter

We can observe that changing the number of points parameter does not affect the results of our analysis much. We get large red areas (hot spots); in fact, if we kept increasing the number of points, the whole facility would be painted red. This approach identifies larger areas of concern, and at greater concentrations. We will examine this further below to see what it means.

Now, let's observe the influence of the weight in the Regularized Spline Method in Figure 17.

 Figure 17. Spline-Regularized results varying the weight parameter

According to theory, values of Weight between 0 and 0.5 should be applied for the Regularized Method. Using different weight values, we see no significant difference among the results. We again see large red spots; as said above, this will be discussed once we finish analyzing the parameters of the Spline Method.
 

Next, let's analyze the Tension method. First, let's see how varying the number of points affects the calculated surface (Figure 18).

 Figure 18. Spline-Tension results varying the number of points parameter

We can observe again the red spots and almost no change when varying the number of points in the analysis.

Finally, let's examine the change in the weight for the Tension Spline Method, shown in Figure 19.

 Figure 19. Spline-Tension results varying the weight parameter

We observe considerable variability when the weight is modified in the Tension Spline Method. According to theory, higher values of weight give coarser surfaces. Therefore, weight 1 gives a coarse surface compared with those of 0 and 0.5.

In general, the Spline Method shows a clear repeating pattern, with hot spots of contamination located in the middle part of the Marcus Hook Refinery facility. The Spline interpolation produces high concentrations between the measured subsets of data because of the technique used to fit the surface. There are two subsets of data with concentrations Ci, separated by a relatively large area without samples; in order to make a smooth surface, the method ends up assigning to the unsampled spot high concentrations, close to those of the subsets (see Figure 20). The concentration in the spot surely decreases toward its center, that is, farther from the subsets, as displayed in figure X. We can see that with the contour lines.

Therefore, in the author's opinion, a representative interpolation of a surface by the Spline Method requires sampling that is more or less spatially even. The error may also have been made by the author when eliminating the sampling points that had concentration values below detection limits. If those points were included, the Spline Method would probably give a better result. A reasonable and conservative approach would be to assign to the points below detection limits the concentration of that detection limit.
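The way a smooth interpolant can produce high values across an unsampled gap can be illustrated in one dimension. The sketch below fits a natural cubic spline, which is a simpler technique than the Regularized and Tension splines used in this study and is substituted here only for illustration. The two clusters of samples and their concentrations are hypothetical.

```python
def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    # Tridiagonal system for the second derivatives m; m[0] = m[n-1] = 0.
    a = [0.0] * n  # sub-diagonal
    b = [1.0] * n  # diagonal
    c = [0.0] * n  # super-diagonal
    d = [0.0] * n  # right-hand side
    for i in range(1, n - 1):
        a[i] = h[i - 1]
        b[i] = 2.0 * (h[i - 1] + h[i])
        c[i] = h[i]
        d[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        factor = a[i] / b[i - 1]
        b[i] -= factor * c[i - 1]
        d[i] -= factor * d[i - 1]
    m = [0.0] * n
    m[n - 1] = d[n - 1] / b[n - 1]
    for i in range(n - 2, -1, -1):
        m[i] = (d[i] - c[i] * m[i + 1]) / b[i]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i + 1]] that contains x.
        i = 0
        while i < n - 2 and x > xs[i + 1]:
            i += 1
        t0 = xs[i + 1] - x
        t1 = x - xs[i]
        return (
            m[i] * t0 ** 3 / (6.0 * h[i])
            + m[i + 1] * t1 ** 3 / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * t0
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * t1
        )

    return evaluate

# Two hypothetical clusters of samples separated by an unsampled gap.
xs = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]                          # position
ys = [20000.0, 60000.0, 100000.0, 100000.0, 60000.0, 20000.0]  # ug/Kg

spline = natural_cubic_spline(xs, ys)
gap_peak = max(spline(2.0 + 0.1 * k) for k in range(61))  # scan the gap
```

In this sketch the concentrations rise toward the gap on both sides, so the smooth curve keeps rising inside it and exceeds every measured value. Adding a low sample inside the gap would anchor the curve, which is consistent with the recommendation to sample more evenly.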

However, spatially even sampling was not performed at the Marcus Hook Refinery for practical reasons: samples were taken where contamination was expected to occur. Even if common sense tells us that concentrations are near zero in some places, if the Spline Method is to be used we should also sample places where contamination is unlikely or low. With samples of zero value, the resulting surface would represent reality more closely. Another way to circumvent the problem would be to divide the analysis into subsets within which the samples are spatially uniformly distributed. The other problem faced by the Spline Method is the high variability of the naphthalene concentration measurements, which have a maximum value of 400,000 ug/Kg, a minimum of 2.6 ug/Kg and a standard deviation of 56,978.3 ug/Kg.
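The substitution suggested above for samples below detection limits can be sketched as follows. The record layout (a measured value paired with its detection limit, with None marking a non-detect) and all the numbers are assumptions made for illustration; they are not the project's data.

```python
import statistics

def substitute_non_detects(records):
    """Replace non-detects (value None) with their detection limit."""
    return [limit if value is None else value for value, limit in records]

# Hypothetical (measured value, detection limit) pairs in ug/Kg.
# None marks a result reported as below the detection limit.
records = [
    (400000.0, 330.0),
    (None, 330.0),
    (2.6, 1.0),
    (None, 660.0),
    (52000.0, 330.0),
]

# Keep every sampling point instead of discarding the non-detects.
values = substitute_non_detects(records)

# Summary statistics of the kind quoted in the text.
highest = max(values)
lowest = min(values)
spread = statistics.stdev(values)
```

Keeping these points adds low values to the data set, which gives the interpolation something to anchor the surface in the areas where little contamination was found.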
 

 Figure 20. Spline Method. High-concentration spots in the central part of the facility, right between the subsets of sample points. (Regularized surface, number of points = 12 and weight = 0.5)

In conclusion, the Spline Method was not suitable for the analysis of naphthalene concentrations in the soil matrix with the Marcus Hook Refinery data. For this method to be used, two conditions have to be met: the data have to be more or less uniformly distributed, and the variation of the modeled value should be small.
 

The general conclusions that can be drawn from the analysis are listed next:


FUTURE RESEARCH

 


DATA DICTIONARY

 

Data | Feature | Class | Attribute | Value | Description
boundary | Refinery's boundary. State Plane Coordinates (South Pennsylvania) | Arc | - | - | Delineation of the Marcus Hook Refinery's borderline.
bpoil.tif | USGS map | Image | - | - | State Plane Coordinates (South Pennsylvania)
naphthalene | Point Coverage | Point | Easting, Northing, Value | - | Sampling points of naphthalene concentrations at and around the facility
nappoly | Thiessen polygon | Polygon | Value | - | Coverage created during Thiessen Polygon Method application
IDW_12_05 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using IDW. Number of neighbors = 12, power = 0.5
IDW_12_2 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using IDW. Number of neighbors = 12, power = 2
IDW_12_3 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using IDW. Number of neighbors = 12, power = 3
IDW_18_2 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using IDW. Number of neighbors = 18, power = 2
IDW_6_2 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using IDW. Number of neighbors = 6, power = 2
Spline_reg05_10 | 22 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Regularized. No. of points = 10, weight = 0.5
Spline_reg05_12 | 22 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Regularized. No. of points = 12, weight = 0.5
Spline_reg05_25 | 22 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Regularized. No. of points = 25, weight = 0.5
Spline_reg0_12 | 22 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Regularized. No. of points = 12, weight = 0
Spline_reg01_12 | 22 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Regularized. No. of points = 12, weight = 0.1
Spline_ten1_12 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Tension. No. of points = 12, weight = 1
Spline_ten1_15 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Tension. No. of points = 15, weight = 1
Spline_ten1_25 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Tension. No. of points = 25, weight = 1
Spline_ten0_12 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Tension. No. of points = 12, weight = 0
Spline_ten5_12 | 24 ft cell | Grid | Value | Floating | Naphthalene concentration surface representation using Spline-Tension. No. of points = 12, weight = 5

 


REFERENCES

[1] Berry, J.K. Spatial Reasoning for Effective GIS. GIS World Books. USA, 1995.
[2] Bonham-Carter, G.F. Geographic Information Systems for Geoscientists: Modeling with GIS. Pergamon Editorial. Canada, 1994.
[3] Hoggan, D.H. Computer Assisted Floodplain Hydrology and Hydraulics. McGraw Hill. USA, 1997.
[4] Linsley, R.K.; Franzini, J.B. Water Resources Engineering. McGraw Hill. USA, 1972.
[5] Environmental Systems Research Institute. Understanding GIS, the Arc/Info Method. USA, 1992.
[6] McCarthy, D.F. Essentials of Soil Mechanics and Foundations. Prentice Hall Editorial.
[7] Watson, D.F. Contouring: A Guide to the Analysis of Spatial Data. Pergamon Editorial. USA, 1994.

 


 
 
 
 
