Monte Carlo analysis of the Copano Bay fecal coliform model
Ernest Sin Chit To, CRWR

(Left picture from: http://www.iabeef.org/May/E.%20Clip%20Art%20&%20Photos%5CRed%20Comp%20Cattle%20Pairs-color.JPG
Right picture from: http://www.ocracokeonline.com/ocracoke/Ocracoke%20ferry/slides/seagull%20flock.JPG)
Introduction
The Copano Bay watershed is situated on the Gulf of Mexico just north of Corpus Christi. The watershed spans an area of 5,688 sq km and is drained by four main rivers: Copano Creek, Mission River, Aransas River and Chiltipin Creek (see Figure 1). The area is mostly rural, and its dominant industries are crop farming and cattle rearing. Until recently, Copano Bay was a harvesting area for oysters. However, in 2002 the Texas Commission on Environmental Quality (TCEQ) declared the bay unsafe for oyster harvesting (TCEQ, 2002) when fecal coliform bacteria concentrations in the bay exceeded the water quality standards enacted by the Texas Department of Health (TDH). Significant sources of fecal coliform to the bay are farm animals in the watershed and birds that inhabit the coastal areas.
Figure 1. Overview of the Copano Bay watershed.
Total Maximum Daily Loads (TMDL)
Section 303(d) of the 1972 Federal Clean Water Act (CWA) requires that each State identify water bodies that do not meet the State’s water quality standards. For the Copano Bay watershed, the TCEQ identified the following portions of the bay as impaired water bodies (TCEQ, 2002):
· Mission River tidal (Segment ID 2001): contact recreation impairment due to elevated fecal coliform bacteria and enterococci levels
· Aransas River tidal (Segment ID 2003): contact recreation impairment due to elevated fecal coliform bacteria and enterococci levels
· Copano Bay (Segment ID 2472): oyster water use impairment due to elevated fecal coliform bacteria and enterococci levels
Carrie Gibson and Dr. David Maidment of the University of Texas at Austin have performed a significant amount of research for the Copano Bay TMDL (Gibson, 2005) and have identified the important bacterial sources in the Copano Bay watershed and their associated loadings. In addition, Carrie Gibson has developed a fecal coliform model in ArcGIS, using ModelBuilder, that predicts annual average fecal coliform concentrations in the bay. The development of this model is described at http://www.crwr.utexas.edu/gis/gishydro05/index.htm.
Model of the Copano Bay Watershed
Gibson’s model uses an ArcGIS script called Schematic Processor (Whiteaker et al, 2003). The model simulates the watershed as a network of pipes and junctions that transport fecal coliform loadings into Copano Bay. The “pipes” are referred to as schemalinks and the “junctions”, or nodes, are referred to as schemanodes (see Figure 2). Schemalinks represent river segments and subwatersheds where flow takes place. While bacteria travel along a schemalink, they decay according to first-order kinetics. When bacteria enter a schemanode, they are combined with bacteria from the other schemalinks that merge into the same schemanode.
Figure 2. Schemalinks and schemanodes of the Copano Bay fecal coliform model.
Figure 3 illustrates how a typical section of the watershed is modeled. It shows two schemalinks draining into one schemanode. The blue schemalink represents a section of the Aransas River and carries an upstream load (Lupstream) that is passed down from upper sections of the river. The green schemalink represents the generalized pathway by which the fecal coliform loading in a watershed (Lwatershed) is transported into the downstream schemanode. As mentioned earlier, bacterial kinetics are modeled in the schemalinks. Bacterial decay is modeled as a function of the load, the decay rate and the residence times of the subwatershed and the river section. The function is given in Equation 1:
Ldownstream = Lupstream*exp(-Kd*Tau) + Lwatershed*exp(-Kd*Tau_w) (Equation 1)
Ldownstream = Downstream load (cfu/year)
Lupstream = Upstream load (cfu/year)
Lwatershed = Watershed load (cfu/year)
Kd = Decay rate (day^-1)
Tau = Residence time of river section (days)
Tau_w = Residence time of watershed (days)
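Lwatershed = Qwatershed * EMCwatershed (Equation 2)
Qwatershed = Runoff from the watershed
EMCwatershed = Event mean concentration of fecal coliform in the watershed runoff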
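Equations 1 and 2 can be written out as a short Python sketch. This is only an illustration, not the Schematic Processor code: the function names are hypothetical, the input values are invented, and the units of runoff (volume per year) and EMC (cfu per unit volume) are assumptions chosen so that the load comes out in cfu/year.

```python
import math

def watershed_load(q_watershed, emc_watershed):
    """Equation 2: watershed load = runoff * event mean concentration."""
    return q_watershed * emc_watershed

def downstream_load(l_upstream, l_watershed, kd, tau, tau_w):
    """Equation 1: both loads decay by first-order kinetics before they combine.

    kd is in 1/day; tau (river section) and tau_w (watershed) are in days.
    """
    return l_upstream * math.exp(-kd * tau) + l_watershed * math.exp(-kd * tau_w)

# Hypothetical inputs, for illustration only (not values from the model).
l_w = watershed_load(q_watershed=2.0e6, emc_watershed=5.0e7)
l_down = downstream_load(l_upstream=3.0e14, l_watershed=l_w,
                         kd=2.0, tau=0.5, tau_w=1.0)
print(f"Watershed load:  {l_w:.3e} cfu/year")
print(f"Downstream load: {l_down:.3e} cfu/year")
```

With a decay rate of zero the downstream load is simply the sum of the two incoming loads, which is a quick sanity check on any implementation of Equation 1.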
Figure 3. Illustration of Schematic Processor
Because oyster water use standards are imposed on both the median and the 90th percentile value of the fecal coliform concentrations, a model is needed to predict the probability distribution of the fecal coliform concentration. For this reason, a Monte Carlo analysis was introduced into the model. Monte Carlo analysis draws random numbers from probability distributions to simulate random behavior. It is a useful tool for quantifying the uncertainty in a given property when the variability in its causes is known.
Figure 4. The concept of Monte Carlo Analysis
Monte Carlo analysis accomplishes this by running an existing model multiple times, each time changing the values of the model inputs and parameters by randomly sampling from their associated population distributions. Thorough sampling of the inputs and parameters is achieved by running many simulations (for Copano Bay, the number of simulations needed was on the order of 1000). After the simulations finish, the output values from all simulations are collected and their distribution is plotted and analyzed. In this way, the impact of the input variation on the output variation is observed. If the model characterizes the system properly, then the output distribution should match the distribution of the actual measurements of the output property.
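The procedure described above can be sketched in a few lines of Python. The sketch below is a generic illustration under stated assumptions, not the Copano Bay model: `toy_model` is a hypothetical stand-in for the real model, the lognormal EMC distribution and its parameters are invented, and sampling Kd uniformly is an assumption (only the 1.5 to 2.5 per day range comes from the calibration results reported later).

```python
import math
import random
import statistics

def toy_model(emc, kd, tau=1.0):
    """Hypothetical stand-in model: concentration left after tau days of decay."""
    return emc * math.exp(-kd * tau)

def monte_carlo(n_runs, seed=0):
    """Run the model n_runs times, resampling the inputs on every run."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    outputs = []
    for _ in range(n_runs):
        emc = rng.lognormvariate(math.log(1000.0), 0.8)  # assumed input distribution
        kd = rng.uniform(1.5, 2.5)                       # assumed uniform within the range
        outputs.append(toy_model(emc, kd))
    return outputs

results = monte_carlo(1000)
median = statistics.median(results)
p90 = statistics.quantiles(results, n=10)[-1]  # last decile cut point = 90th percentile
print(f"median = {median:.1f}, 90th percentile = {p90:.1f}")
```

The median and the 90th percentile are reported because those are the two statistics on which the oyster water use standards are imposed.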
Model calibration is achieved through matching the distribution of the model output (obtained through Monte Carlo simulation) to that of the fecal coliform monitoring data. The reasons for doing this are twofold:
1. To validate the input and parameter distributions against real data.
2. To ensure that the model reasonably represents the system so that it can be used to forecast fecal coliform concentrations under different waste load scenarios.
Figure 5. Calibration of model output distribution to distribution of monitoring data.
Characterization of input and parameter distributions
Figure 6. Types of common distributions
(modified from http://www.decisioneering.com/monte-carlo-simulation.html (upper) and http://www.brighton-webs.co.uk/distributions/images/pdf_beta.gif (lower))
Using data analysis and professional judgment, the distribution of each input and parameter of the fecal coliform model was represented by one of the distribution types above. Explanations are given below:
The distribution of the cumulative runoff at each schemanode in the watershed is derived from flow data at the four USGS gages in the watershed: 08189700 (Aransas River), 08189500 and 08189300 (both on Mission River), and 08189200 (Copano Creek). The measured flow distribution at each gage was approximated with a lognormal distribution, whose median and coefficient of variation were adjusted until the modeled values matched the measured flows. The resulting lognormal distribution for each gage was then normalized by its median and multiplied by the median runoff of each of the other SchemaNodes on the same river. This creates a unique flow distribution for each SchemaNode in the system. The median cumulative runoffs for the schemanodes were previously calculated by Carrie Gibson in her Schematic Processor model, using flow proration based on subwatershed areas.
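The normalization step can be illustrated with a short Python sketch. It relies on two standard properties of the lognormal distribution: the median equals exp(mu), and the coefficient of variation satisfies CV^2 = exp(sigma^2) - 1. The function names and all numerical values are hypothetical, and the original analysis was done in Excel, so this is only one way the calculation might look.

```python
import math
import random
import statistics

def lognormal_params(median, cv):
    """Convert a median and coefficient of variation to lognormal mu and sigma."""
    mu = math.log(median)
    sigma = math.sqrt(math.log(1.0 + cv ** 2))
    return mu, sigma

def sample_node_flows(gage_median, gage_cv, node_median, n, seed=0):
    """Sample the gage distribution, normalize by the gage median, rescale to a node."""
    rng = random.Random(seed)
    mu, sigma = lognormal_params(gage_median, gage_cv)
    flows = []
    for _ in range(n):
        gage_flow = rng.lognormvariate(mu, sigma)
        flows.append(gage_flow / gage_median * node_median)
    return flows

# Hypothetical gage statistics and node median, for illustration only.
flows = sample_node_flows(gage_median=10.0, gage_cv=1.5, node_median=4.0, n=5000)
print(f"sample median at the node = {statistics.median(flows):.2f}")
```

Because the gage samples are divided by the gage median before rescaling, every node on the same river shares the gage's relative variability while keeping its own median.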
Figure 7. Measured and simulated cumulative flow distributions at USGS gage 08189700.
Event mean concentrations (EMCs)
The implementation of the Monte Carlo analysis uses Gibson’s model to perform fecal coliform load calculations based on given inputs and parameters (see Figure 8). Gibson’s model is based on Schematic Processor (Whiteaker et al, 2003). The model reads information from two tables, SchemaNode (Figure 9) and SchemaLink (Figure 10). SchemaNode contains the fecal coliform loads and cumulative runoffs at each schemanode. SchemaLink contains the decay rates and residence times along each schemalink.
To simulate variation, a random number generator was programmed to generate values for cumulative runoff (Q), event mean concentration (EMC) and decay rate (Kd) by sampling from their respective populations (see Section 5). These sample values are then written into the SchemaNode and SchemaLink tables and fed into the Schematic Processor. Since Monte Carlo analysis requires running this process many times, a loop is programmed into the model to perform as many simulations as the user needs. Once all iterations are completed, the program writes the results to a table and then plots the cumulative distribution. If monitoring data are available, the program plots them alongside the model results for comparison (see Figure 5).
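The loop described above might be organized as in the following Python sketch. It is a simplified illustration rather than the Excel implementation: the two dictionaries stand in for the SchemaNode and SchemaLink tables, `run_model` is a hypothetical single-node substitute for the Schematic Processor, dividing load by runoff to obtain a concentration is an assumption, and every distribution parameter is invented.

```python
import math
import random

def run_model(node_table, link_table):
    """Hypothetical single-node stand-in for the Schematic Processor."""
    watershed_load = node_table["Q"] * node_table["EMC"]  # Equation 2
    load = (node_table["L_up"] * math.exp(-link_table["Kd"] * link_table["Tau"])
            + watershed_load * math.exp(-link_table["Kd"] * link_table["Tau_w"]))  # Equation 1
    return load / node_table["Q"]  # assumption: concentration = load / runoff

def simulate(n_runs, seed=0):
    """Resample Q, EMC and Kd, write them into the tables, and rerun the model."""
    rng = random.Random(seed)
    node_table = {"Q": None, "EMC": None, "L_up": 1.0e13}  # fixed upstream load (assumed)
    link_table = {"Kd": None, "Tau": 0.5, "Tau_w": 1.0}    # fixed residence times (assumed)
    results = []
    for _ in range(n_runs):
        node_table["Q"] = rng.lognormvariate(math.log(5.0e6), 1.0)
        node_table["EMC"] = rng.lognormvariate(math.log(2.0e7), 0.7)
        link_table["Kd"] = rng.uniform(1.5, 2.5)
        results.append(run_model(node_table, link_table))
    return results

def empirical_cdf(values):
    """Pair each sorted value with the fraction of values at or below it."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]

cdf = empirical_cdf(simulate(1000))
print(f"smallest: {cdf[0][0]:.3e}, largest: {cdf[-1][0]:.3e}")
```

The list returned by `empirical_cdf` is what would be plotted as the cumulative distribution, with monitoring data processed the same way for comparison.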
Figure 8. Implementation scheme for Monte Carlo Analysis
Figure 9. Modified schemanode table for Monte Carlo Analysis
Figure 10. Modified schemalink table for Monte Carlo Analysis
The original intention was to perform the Monte Carlo analysis in ModelBuilder. However, Excel was considered more suitable for the following reasons:
1. it has built-in procedures that can sample from different classical probability distributions;
2. it has built-in graphing capabilities; and,
3. the analysis could potentially run faster because it takes less time to update spreadsheet cells than to update an Access database.
Therefore the Monte Carlo analysis was done in Excel. The reader is encouraged to test out the model by following the instructions in Section 7. Please feel free to report any bugs to Ernest To at eto@mail.utexas.edu.
This section describes the steps for downloading and running the Monte Carlo model:
Step 1: Download the model from the following location: http://webspace.utexas.edu/~tosc/Copano_Bay_Monte_Carlo_Analysis.xls
Step 2: Set the security settings of Excel to medium so that the macros in the Monte Carlo model can run. To do this, open Excel, go to Tools, then Options, and select the “medium” security setting. Then close Excel and open “Copano_Bay_Monte_Carlo_Analysis.xls”. Select “enable” when Excel prompts you about enabling macros.
Step 3: Once the model is open in Excel, go to the worksheet labeled “Control Sheet”. You will find the graphical user interface shown in Figure 10. It consists of a map that displays the HydroIDs of the SchemaNodes (in black numerals) and SchemaLinks (in red numerals) of the Copano Bay watershed. On the right-hand side, it lists the SchemaNodes where monitoring data were collected. Finally, in the bottom left corner, there is an area where the user can specify the SchemaNode of interest and the number of simulations desired. The user can enter the HydroID of any SchemaNode in the watershed. After entering these two pieces of information, the user can click the “Monte Carlo Analysis” button to execute the model.
Figure 10. Graphical user interface for Monte Carlo analysis
Step 4: Once the model has finished, it plots the cumulative distribution of the modeled fecal coliform concentrations with the monitoring data (if available) in the chart sheet, “Monte Carlo Graph”. See Figure 5 as an example.
Results and discussion
This section shows the calibration results for the monitoring stations along the Aransas River. The cumulative distribution of the modeled fecal coliform concentration is denoted by a solid line, while that of the monitoring data is denoted by green dots. Calibration generally went smoothly for schemanodes in the watershed. The ranges of the decay rates were fine-tuned while fitting the model to the data. The Kd values range from 1.5 to 2.5 day^-1.

Figure 11a. Model vs Measured Fecal coliform distributions at Schemanode 61.

Figure 11b. Model vs Measured Fecal coliform distributions at Schemanode 68.

Figure 11c. Model vs Measured Fecal coliform distributions at Schemanode 75.

Figure 11d. Model vs Measured Fecal coliform distributions at Schemanode 154.

Figure 11e. Model vs Measured Fecal coliform distributions at Schemanode 153.
Calibration of SchemaNode 154, as well as of other schemanodes in the bay, proved to be difficult. For example, in order to fit the median of the model to the median of the monitoring data at SchemaNode 154, the decay rate for the node had to be increased to ~12 day^-1. This value is significantly higher than the range of 1.5 to 2.5 day^-1 used elsewhere in the watershed. On the other hand, for SchemaNode 153, the decay rate had to be dropped to the range of 0.5 to 1.5 day^-1 in order to fit the data. This suggests that some mechanisms in the bay may not be adequately modeled. Because the model currently treats the four Copano Bay segments as unconnected batch reactors, factors such as flow between bay segments and tidal dispersion have been omitted. If time and budget allow, a simple hydrodynamic model may be created to investigate the significance of these factors.
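To put these decay rates in perspective, the short sketch below evaluates the first-order decay factor exp(-Kd*Tau) from Equation 1 for the rates mentioned above. The one-day residence time is a hypothetical value chosen only for comparison, not a residence time from the model. Under that assumption, rates of 1.5 to 2.5 per day leave roughly 22 to 8 percent of a load, whereas a rate of 12 per day leaves only a few millionths of it.

```python
import math

def fraction_remaining(kd, tau):
    """Fraction of a load that survives first-order decay at kd (1/day) over tau days."""
    return math.exp(-kd * tau)

TAU = 1.0  # hypothetical residence time in days, for comparison only

cases = [
    ("watershed, low end", 1.5),
    ("watershed, high end", 2.5),
    ("SchemaNode 153, low end", 0.5),
    ("SchemaNode 154", 12.0),
]
for label, kd in cases:
    print(f"{label:>24}: Kd = {kd:4.1f} per day, "
          f"fraction remaining = {fraction_remaining(kd, TAU):.2e}")
```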
Monte Carlo analysis is a powerful tool for modeling the uncertainty or variation in environmental phenomena. It can assist the TMDL process, particularly when water quality standards are imposed not just on mean pollutant concentrations but also on some property of the pollutant distribution (e.g. percentile values). A meaningful Monte Carlo analysis depends on 1) having a model that adequately describes the physical system and 2) having adequate information on the variations of the inputs and parameters of the model. Neither of these is a trivial task, especially when data are sparse. This project is still in progress, and it is hoped that, when completed, it may become a useful tool for forecasting fecal coliform behavior in different parts of the Copano Bay watershed and for the subsequent determination of waste load reductions.
I would like to thank Carrie Gibson for her great help in this project, especially in supporting the analysis with the data she collected and providing valuable feedback. I would also like to thank Dr. David Maidment for his guidance and the opportunity to work on this project.
1. Gibson, Carrie J., 2005, Schematic Processor Bacterial Loadings Model, GisHydro05, http://www.crwr.utexas.edu/gis/gishydro05/Modeling/WaterQualityModeling/BacteriaModel.htm
2. Maidment et al., 1993, Handbook of Hydrology, New York: McGraw-Hill
3. Texas Commission on Environmental Quality, ca. 2003, Developing and Implementing Total Maximum Daily Loads, ftp.tnrcc.state.tx.us/water/quality/tmdl/31-TMDL101.pdf
4. Texas Commission on Environmental Quality, Nov 2005, Copano Bay: A TMDL Project for Bacteria in Oyster-Harvesting Waters, http://www.tceq.state.tx.us/implementation/water/tmdl/42-copano.html
5. Whiteaker, Timothy, 2003, Schematic Network Processing, http://www.crwr.utexas.edu/gis/gishydro03/Schematics/SchematicNetwork.htm
Ernest Sin Chit To
Center for Research in Water Resources
e-mail: eto@mail.utexas.edu
These materials may be used for study, research, and education, but
please credit the authors and the Center for Research in Water Resources, The