A three-dimensional geostatistical model for estimating salinity in Corpus Christi Bay

 

(courtesy of  http://www.scenerysolutions.com/UT/CorpusChristiBay.jpg)

 

by Ernest Sin Chit To

Term Project

Spring 2007

CE 394K.2 Surface Water Hydrology

 

Contents

 

Contents

Background

Goals for three-dimensional kriging of salinity

Data

Period of study

Datacube diagrams

Three-dimensional diagrams

Approach

Kriging

Normality Tests

Variogram modeling

Execution of Kriging

Visualization

Results

Interpretation

Conclusion

Space-Time Integration

Goals accomplished

Next Steps

References

Background

Hypoxia is a common phenomenon found in estuarine systems.  It occurs when dissolved oxygen is depleted below ~2 mg/L – thereby stressing resident aquatic species and benthic organisms. In Corpus Christi Bay, hypoxia is often correlated with the occurrence of salinity-induced density stratification of the water column.  Typically this happens in summer when temperature and evaporation are high and precipitation is low (Ritter and Montagna, 1999).  Density stratification is also suspected to be enhanced by existing hypersaline water inflows to Corpus Christi Bay from adjacent waters, such as Laguna Madre and Oso Bay (David and Hodges, 2006). 

 

The physical causes of hypoxia have yet to be clearly identified. As such two scientific communities, CUAHSI (Consortium of Universities for Advancement of Hydrologic Science, Inc) and CLEANER (Collaborative Large-Scale Engineering Analysis Network for Engineering Research), have assembled a multi-disciplinary team of environmental engineers, biologists and computer scientists to investigate the problem.  Furthermore, environmental sensors in the region have been identified (see Figure 1) and informatic tools, such as webservices, are being built to access their data.

 

 

Figure 1.  Extent of episodic hypoxia (yellow and orange regions) in Corpus Christi Bay.

 

Goals for three-dimensional kriging of salinity

The goal of this project is to look into the salinity aspect of the hypoxia problem.  A geostatistical model (i.e. kriging) has been created to interpolate salinity values and their associated uncertainties at different locations and times in Corpus Christi Bay.  The intention down the road is to construct a spatially and temporally continuous picture of the salinity behavior of the bay, which can provide insights into the hypoxia problem.

 

Data

The geostatistical interpolation was performed on salinity data that were collected by Dr. Paul Montagna (Texas A&M University, Corpus Christi).  The data spans 1987 to 2006 and is a collection of grab samples gathered from 50 stations in the bay.  All the salinity data are listed in Appendix I (link). 

 

Period of study

For the sake of demonstration, this project is only focused on the month of August 2005.   This is an interesting period because the salinity data shows the bay starting in a stratified state at the beginning of the month (8/2/2005).  By the middle of the month (8/16/2005) the salinity gradient along the depth has increased.  However, on 8/23/2005 the salinity in the bay suddenly became uniform both laterally and in depth.  Finally, on 8/30/2005, although still uniform in depth, the bay is no longer uniform laterally and demonstrated a salinity gradient along the NE-SW direction. 

 

The above-mentioned phenomenon is illustrated in the animations in Figures 1 and 2. 

 

Datacube diagrams

Figure 1 is composed of four sets of datacube diagrams (one set for each of the sampling dates 8/2, 8/16, 8/23, and 8/30) stitched together into an animated sequence.  Datacube diagrams were devised in this project to summarize salinity profiles along the four dimensions – x (longitude), y (latitude), z (elevation) and t (time).  The four graphs are arranged around a map of the sampling area with the axes of longitudinal and latitudinal profiles aligned directly with the projection of the map.  This allows data on the graphs to be referenced directly to map features, such as sampling stations.  For instance, data collected from station 20 on the map (see Figure 1) can be traced to the latitudinal and longitudinal graphs via the two pink lines that represent the longitude and latitude of the station.

 

Each graph shows the marginal profile of the samples along one of the dimensions.  Not only does the profile show the mean value of samples collected at a given point on the axis, it also shows the 10th, 25th, 50th, 75th and 90th percentile values of those samples.  As a result, datacube diagrams can illustrate some of the probabilistic behavior of the data. 

 

 

Figure 2.  Animation of datacube diagrams showing salinity profiles along time (t), latitude (y), longitude (x) and elevation (z) axes.

 

The “z vs. Salinity” graph provides the most information about the stratification-mixing phenomenon in the bay.  The effect of the navigational channel can also be observed in the “x vs. Salinity” and “Salinity vs. y” graphs. With the exception of 8/23/2005 the salinity in the north east corner of the map tends to be less than the southwest corner of the map.

 

Stills of the animation in Figure 2 are available in Appendix II (link).

 

Three-dimensional diagrams

Figure 2 is an animation sequence composed of four sets of three-dimension diagrams of salinity measurements in the bay.  The diagrams were created using ESRI’s ArcScene software. These diagrams are another visualization of the stratification-mixing phenomenon and reinforce what is portrayed in the datacube diagrams.

 

Figure 3.  Animation of 3D diagrams of salinity in Corpus Christi Bay

 

 

Stills of the animation in Figure 3 are available in Appendix III (link).

 

Approach

Kriging

 

Geostatistical interpolation was performed using ordinary kriging in three dimensions.  Kriging is an estimation method that generates value and error estimates by using data values and their spatial configuration as inputs.  Similar to the linear least squares method, it is also a best linear unbiased estimator (BLUE).  However the underlying theory of kriging depends on two main assumptions, which are:

 

  • Mean and variance of the data are invariant with translation, i.e. stationarity.
  • Distribution of the data is Gaussian, i.e. normality.

 

Advanced kriging methods, such as indicator kriging and universal kriging, can deal with non-normality and non-stationarity.  Because of the limited timeframe of this study, it is assumed that the mean of salinity in the bay was uniform within each given day.  In the follow up work to this project, this assumption will be removed so as to account for spatial trends in mean.  Doing this would also provide an opportunity to incorporate results from deterministic hydrodynamic models into kriging as trend models.

 

The derivation of ordinary kriging is not shown in this report as there are many textbooks dealing with this subject.  The interested reader is advised to refer to authoritative textbooks such as Isaaks and Srivastava's Introduction to Applied Geostatistics.

 

Normality Tests

 

The salinity data was inspected for normality by plotting probability density functions (PDF) and cumulative density functions (CDF).  To generate a point of reference, the data was normalized to the best extent possible using the Box-Cox transformation (Box and Cox 1964).  Then the density function of the untransformed case was compared with the transformed case to decide whether transformation was necessary.  This exercise was performed not only for the entire data sets (see Figure 4) but also for subsets of the data to ensure multi-normality of the data.  Data were subsetted in the following manner:

  1. By sampling stations
  2. By sampling depth
  3. By sampling date

Figure 4.  Animation of 3D diagrams of salinity in Corpus Christi Bay

 

The results of the normality tests for the subsets are available in Appendix IV (link).

 

Since the untransformed data in general was reasonably normal and given the limited time frame of this project, it was decided that the kriging would be performed on untransformed data.

 

Variogram modeling

 

Variograms are crucial to the kriging process because they establish the spatial relationships among data locations.  An experimental variogram shows the semivariance between all possible pairs of data that are separated by a given distance, i.e. lag.  Semivariance is calculated by the formula:

 

 

Where  is the semivariance at lag h.

is the separation distance between the data pairs located at  u and u+h.

is the data value of the first member of the pair.

is the data value of the second member of the pair.

is the location of the first member of the pair

is the location of the second member of the pair

 

Essentially the semivariogram is a measure of how different two data points become as they are moved further and further apart.  In geostatistical interpolation, a mathematical model is fitted through the experimental variogram which is then used to compute data covariances..

 

In this project, a set of experimental variograms was plotted for each of the four days. To account for the different spatial correlation for the three dimensions, three variograms are plotted for each day, namely:

 

1.                  A variogram along the direction of maximum correlation (i.e. the major axis)

2.                  A variogram along the direction of minimum correlation (i.e. the minor axis)

3.                  A variogram along the vertical direction

 

The experimental variograms were fitted by eye using professional judgment.  Due to time constraints, a very simple method was used. Each variogram was assumed to contain one structure only and a spherical model was used to fit the particular structure.   In the subsequent work, more sophisticated methods will be used to identify and model the multiple structures within the variograms.  Figure 5 shows an example of the experimental and fitted variograms in the major direction, minor direction and dip direction for August 2, 2005.

 

Figure 5. Experimental and fitted variograms for August 2, 2005.

 

To see the rest of the variograms please refer to Appendix V (link). 

 

A summary of the fitted variogram parameters are listed below:

 

Table 1.  Summary of parameters for variogram models

Date

8/2/2005

8/16/2005

8/23/2005

8/30/2005

Type of Model

Spherical

Spherical

Spherical

Spherical

Nugget

0

0

0

0

Sill

6 (psu2)

6 (psu2)

1.5 (psu2)

6 (psu2)

Azimuth of major range

60

60

120

60

Azimuth of minor range

150

150

30

150

Dip angle

0

0

0

0

Major range

15000 m

3000 m

7500 m

15000 m

Minor range

1800 m

1000 m

1500 m

3000 m

Vertical range

2 m

1.5 m

8 m

8 m

 

Execution of Kriging

 

The fundamental geostatistical interpolation and variogram operations were performed using the popular open-source geostatistical software GSLIB.  GSLIB stands for Geostatistical Library and is a set of tools developed by Dr. Clayton V. Deutsch of the University of Alberta and Dr. Andre Journel from Stanford University. It can be downloaded from http://www.gslib.com/.  Because the GSLIB interface is very basic and has little means of operating with other computer programs several programs were written in RSI’s IDL to manage the workflow of the overall geostatistical interpolation process.  The workflow process consists of the following tasks:

 

  1. Querying salinity data from Paul Montagna’s database
  2. Generate probability plots for normality testing.
  3. Write parameter files for the GAMV.exe and KT3D.exe to perform variogram analysis and 3D-Kriging.
  4. Call the GSLIB functions, GAMV.exe and KT3D.exe.
  5. Generate variogram plots.
  6. Visualize kriging results using voxels.

 

An illustration of the workflow is provided in Figure 6.  The interested user may contact the author at eto@mail.utexas.edu if he is interested in the details of the programs.

 

 

Figure 6. Workflow of computer programs for performing three-dimensional kriging

 

Visualization

Visualization of results was performed using voxels.  A voxel is a volume element that represents a value in a regular three-dimensional grid.  Like a pixel, the appearance of a voxel volume is control by four byte values: i.e. red (0 - 255), green (0-255), blue (0-255) and opacity (0-255).  Superimposing the four values creates the three-dimensional image.  The opacity value is especially important because it allows voxels on the surface of a given volume to be “turned off” so that underlying voxels can be visible to the viewer.  Voxels are widely used in the medical industry and for visualizing CAT (computed axial tomography) scans.  Figure 6 shows such an application.   The right diagram shows the voxels in the top left portion of the volume turned off so that underlying voxels are visible.

 

 

Figure 7.  Visualization of CAT scan results using voxels (data courtesy of RSI’s IDL 6.3 examples)

 

Results

Figures 8 to 12 show the result of the interpolation displayed using voxels.

 

Figure 8.  Kriging results for Aug 2, 2005.

 

Figure 9.  Kriging results for Aug 16, 2005.

 

Figure 10.  Kriging results for Aug 23, 2005.

 

Figure 11.  Kriging results for Aug 30, 2005.

 

Figure 12.  Kriging variances for Aug 2, 2005 to Aug 30, 2005.

 

Interpretation

Essentially, the interpolation results reaffirm what has been observed in the data.    The bay starts off stratified on Aug 2, 2005 and becomes more so by the middle of the month (8/16/2005).  However, on 8/23/2005 the bay becomes well-mixed and virtually uniform.  (This is analogous to someone taking a huge spoon into the “tank” (i.e. the bay) on 8/16/2005 and giving it a good stir). It is suspected that a storm or wind event was the driving force behind the mixing.  Finally, on 8/30/2005, although still uniform in depth, the bay begins to show a salinity gradient along the NE-SW direction. This was probably caused by the introduction of less saline waters from the gulf via the navigational channel and hypersaline water from Laguna Madre.  Although not shown in the previous map figures, the navigational channel runs roughly from Port Aransas to Ingleside to Texas State Aquarium (see Figure 1).  It is the main avenue for the exchange of bay and gulf waters.

 

Kriging variances also reflect the behavior of the data.  Kriging variances for 8/2/2005, 8/16/2005, and 8/30/2005 are higher than that of 8/23/2005.  This is because the range in salinity values is higher in the former (due to stratification) then in the latter.  It is not yet understood why the variances on 8/30/2005 are higher than those of 8/2/2005 and 8/16/2005, but the author is looking into the matter.

 

Figure 13.  Timeline of salinity kriging results

 

It is amazing how the salinity pattern in the bay becomes strikingly apparent once 3D interpolation has been performed on the data and the results visualized.  It is much easier to understand the phenomenon in the bay by looking at the voxels than by data points!  The experience is similar to a cartographer painstakingly drawing contour lines from surveying results and then suddenly seeing the hills and valleys in the terrain emerge from his map.  Truly amazing!

 

These kriging results also help identify the following knowledge gaps:

 

  • How did the stratification intensify between 8/2/2005 and 8/16/2005?  What caused it?
  • What caused the mixing between 8/16/2005 and 8/23/2005?
  • What caused the NE-SW salinity gradient to remerge again from between 8/23/2005 and 8/30/2005?

 

The author will attempt to answer these questions in the follow-up work to this project.

 

Future work: Space-Time Interpolation

In the follow-up work to this project the author will attempt to perform space-time interpolation in order to fill in the knowledge gaps identified.  Possible areas for improvement are:

 

  • Inclusion of extra sources of salinity data

This will expand the spatial and temporal domain of the geostatistical model.  More importantly, it will help fill in the data gaps by bringing in data from other dates.  Although some data were available on days that were not analyzed in this project, they were not of sufficient quantity to present a full picture.  Therefore they would need to be augmented by other data sources such as TCOON (Texas Coastal Ocean Observatory Network) and Dr. Ben Hodges (University of Texas at Austin) before kriging can be performed.

 

  • Inclusion of ancillary data (e.g. precipitation and wind)

Weather information will help explain the driving forces behind the mixing phenomenon observed on 8/16/2005.  In addition to providing insights, ancillary data can be integrated into kriging using a technique called co-kriging.

 

  • Inclusion of deterministic models results, e.g. hydrodynamic salinity models

As mentioned earlier, because of the limited timeframe of this study, it was assumed that the mean of salinity in the bay was uniform within each given day.  In the follow up work to this project, this assumption will be removed so as to account for spatial trends in mean.  Doing this would also provide an opportunity to incorporate results from deterministic hydrodynamic models into kriging as trend models.

 

  • Expanding on interpolation framework to perform space-time kriging

The 3D framework in this project can be expanded to perform kriging in four dimensions.  This will require some extra coding effort because GSLIB does not have the capability to perform four dimensional kriging yet.

 

  • Random field simulations

With the space-time behavior of salinity adequately characterized through deterministic and stochastic models, projections on salinity can be performed using random field simulations.  Techniques such as Sequential Gaussian Simulation (SGS) and LU simulation may be applied.

Conclusion

A geostatistical model (i.e. kriging) has been created to interpolate salinity values and their associated uncertainties at different locations and times in Corpus Christi Bay.  The powerful ability of voxels to visualize three-dimensional phenomenon has been demonstrated.  This model has yet to be improved because interpolation was performed in the three spatial dimensions..  The intention down the road is to construct a spatially and temporally continuous picture of the salinity behavior of the bay, which would provide insights into the hypoxia problem.  Several next steps for doing this have been identified and the author will pursue them upon completion of this project.

 

Acknowledgments

The author would like to thank Dr. Sanjay Srinivasan of University of Texas at Austin for his wonderful class on geostatistics and Dr. Paul Montagna of Texas A&M Corpus Christi for sharing his data.

References

1.         C. Ritter, P. A. Montagna. Seasonal Hypoxia and Models of Benthic Response in a Texas Bay, Estuaries, 1999. JSTOR.

2.         Cedric H. David, Ben R. Hodges. Deploying a Microstructure Profiler in Corpus Christi Bay, CRWR Online Report, 2006, University of Texas at Austin.

3.         Box, George E. P.; Cox, D. R. An analysis of transformations. 1964. Journal of Royal Statistical Society, Series B 26: 211-246. 


These materials may be used for study, research, and education, but please credit the authors and the Center for Research in Water Resources, The University of Texas at Austin. All commercial rights reserved. Copyright 2007 Center for Research in Water Resources.


Appendices

Appendix I

Appendix II

Appendix III

Appendix IV

Appendix V