Using GIS to Aid in Modeling the 

Susquehanna River Basin

GIS in Water Resources Term Project

CE 394K.3 - Fall 2000

Monica Jeanne Wedo



  1. Background on the Susquehanna River Basin

  2. Project Goals and Motivation

  3. Obtaining Data

  4. Determining Upstream Distance Using ArcView 3.2                    

  5. Completing the Flow Network Using ArcInfo

  6. Comments on Results and Future Work

  7. Websites of Interest


Background on the Susquehanna River Basin

    The Susquehanna River Basin, spanning roughly half the land area of Pennsylvania and portions of New York and Maryland, remains an important water resource for the Northeastern United States.  With a drainage area of 27,500 square miles, it accounts for 45.7% (24,272 miles) of Pennsylvania's stream miles.  The Basin, about 60% of which is forested, also makes up 43% of the Chesapeake Bay's drainage basin.  As one of the nation's most flood-prone areas, it experiences a major, devastating flood every 20 years on average, keeping the economic costs of managing the Basin high.  To illustrate, in 1993, the basin's average annual flood damage was $113 billion.  The Lower Susquehanna Basin, the southernmost subbasin of the Susquehanna River Basin depicted below, comes under considerable scrutiny as the most developed and populated area within the Basin.  This subbasin is known not only for its productive agricultural industry, but also as a major production area for electricity.  The Susquehanna River flows 444 miles through the Susquehanna River Basin from its headwaters near Cooperstown, NY to Havre de Grace, Maryland, where it meets the Chesapeake Bay.  As the largest tributary to the Bay, the River provides 90% of the freshwater inflow to the upper half of the Bay and 50% overall, with an average daily inflow of 22 billion gallons.  In fact, the Bay is actually the submerged area of what was once the Lower Susquehanna River. 



    There are two main organizations interested in preserving the health of the Susquehanna River Basin.  The first is the Susquehanna River Basin Commission (SRBC), a governing agency established in 1970 under a 100-year compact between the federal government and the states that make up the basin: Pennsylvania, New York, and Maryland.  The motive for the Commission's creation is that the river basin borders the major population centers of the east coast and, although relatively undeveloped, has experienced problems of water pollution and overuse.  Because the Susquehanna River flows through three states and is classified as a navigable waterway by the federal government, there are state, regional, and national interests involved.  There remains a need to coordinate the efforts of the three states and the agencies of the federal government, as well as a need to establish a management system to oversee the use of the water and related natural resources of the Susquehanna.  The SRBC's main activities include flood damage relief, ensuring a continual source of freshwater to the Chesapeake Bay, water quality monitoring, ensuring a safe water supply, and wildlife habitat protection.  For more information please refer to the SRBC website.



   The second organization is the United States Geological Survey's (USGS) National Water Quality Assessment Program (NAWQA), in which 60 of the nation's most important aquifers and river basins are being investigated to assess historical, current, and future water quality concerns.  One of the program's main objectives is to describe relationships between natural factors, water quality conditions, and human activities, with the hope of establishing a strong, unbiased basis for water resources decision making at the local, state, and federal levels.  As one of the study units of the NAWQA program, the Lower Susquehanna River Basin was under intense investigation from 1991 through 1997.  The online summary of major water quality issues from 1992-1995 includes information on contaminant levels for nitrate/nitrogen; concentrations of pesticides in groundwater and streams; and also radon, volatile organic compounds, and bacteria in groundwater.  For more information please refer to the USGS website.



Project Goals and Motivation

   The motivation for this project originates from my undergraduate work at the University of Virginia.  While working and learning about the field of water quality modeling under Dr. Lung, I was surprised to find out how stream distance was measured from one point to another along a waterbody.  Although the USGS maintains a remarkable online database containing historical data for all USGS gages, it cannot possibly give the location of each gage relative to other gages or physical places because the possibilities are limitless.  This project intends to show that GIS is useful not only for visually displaying the boundaries and waterbody features of a particular study area, but also for providing pertinent information to the modeler, such as an accurate distance downstream to the mouth of a river and flow data.  Using the Susquehanna River and the West Branch Susquehanna River, this project accomplishes the following tasks:


Obtaining Data

   The shapefiles required for this project are described as follows:


Determining Upstream Distance Using ArcView 3.2

    To begin, Huc02_250k.shp and Rf1_02.shp were added into a new view in ArcView 3.2.  In a second view, the statesp020.shp and county020.shp files were also opened.  Since the extent of the state and county files covered the entire United States, the query tool, shown below, was used to select only those states and counties with Pname equal to Pennsylvania, New York, New Jersey, Delaware, or Maryland.  


Query Builder


View of the states and counties of PA, MD, DE, NJ, and NY in decimal degrees


Since the goal was to display all four shapefiles on one map, the state and county files had to be converted from geographic coordinates (decimal degrees) into the same projection as the HUC and river reach files.  Using the ArcView Projection Utility Wizard, the geographic coordinates were converted to the Albers projection as shown below.  Note the parameters used for the Central Meridian, the Central Parallel, and Standard Parallels 1 and 2.




View of HUC Unit 02 study area in Albers projection


    Once complete, all four shapefiles were displayed in the same view as shown above.  The yellow section highlighted on the map is subregion 5 of HUC unit 2, meaning that for each HUC unit in this region, the first four digits of the eight-digit number will always be 0205.  Subregion 5 also happens to delineate the entire Susquehanna River Basin.  Notice that the south end of the Susquehanna Basin connects to the Chesapeake Bay, which is where the Susquehanna River opens up into the Bay and provides the most significant source of freshwater inflow.  Using the query tool once again and the Theme/Convert to Shapefile command, another view was created to include only subregion 5, the Susquehanna River, and the West Branch Susquehanna River. 


View containing Subregion 5 and the Susquehanna River
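
The subregion selection described above amounts to a prefix test on the eight-digit HUC code.  A minimal sketch in Python is given here, assuming the codes are available as strings; the function name and the sample codes are illustrative only and are not taken from the project's attribute table.

```python
def in_subregion(huc8, subregion="0205"):
    """Return True when an eight-digit HUC code falls in the given four-digit subregion."""
    # Restore a leading zero that is lost if the code was stored as a number.
    code = str(huc8).zfill(8)
    return code.startswith(subregion)


# Illustrative codes only, not read from Huc02_250k.shp.
sample_codes = ["02050101", "02050306", "02040201", "02060003"]
susquehanna_units = [code for code in sample_codes if in_subregion(code)]
```

Padding with `zfill` is a defensive choice: spreadsheet and dbf tools sometimes store HUC codes as numbers, which silently drops the leading zero.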


    Once the base map of the study region was completed, the flow and gage information needed to be obtained.  The first task was deciding which days to choose for conditions of low, average, and high flow.  The reasoning behind using three flow conditions is that in water quality modeling or floodplain modeling, the extreme as well as the average conditions often need to be investigated so that appropriate decisions and policy can be implemented.  Knowing that the Northeastern United States suffered flooding in the spring of 1994 and a severe drought in the summer of 1995, various gaging sites were investigated to choose the optimum flow conditions.  This was accomplished by choosing various gaging sites along the Susquehanna River within the USGS website and plotting flow in cfs vs. time for 1994 and 1995.  



After investigating multiple plots like the one above for station 01570500 in Harrisburg, PA, the following days were selected for each flow condition.  


Low Flow (cfs) - 9/09/95

Average Flow (cfs) - 3/24/95

High Flow (cfs) - 3/26/94


At this point, an Excel spreadsheet was created to hold all of the station and flow data.  By choosing the link on the USGS website to navigate stations grouped by basins in Pennsylvania, New York, and Maryland, all of the stations along the Susquehanna and the West Branch Susquehanna River were easily identified.  The list of basins to search can be found in the attribute table for Huc05.shp, the shapefile created from the original Huc02_250k shapefile.  To illustrate, an example link for Harrisburg is given below.  From here the information for station 01570500 at Harrisburg can be highlighted and copied into Excel.  Once this is done, the gage site can be selected.  



Once the station is clicked on, the following screen appears.  After this additional information, including latitude and longitude, is highlighted and copied into Excel, the user can then select Historical Daily Streamflow Daily Values, choose to search between 01/01/1994 and 12/31/1995, and opt for the output to be displayed as a tab-delimited text file.  Scrolling down to the three dates listed above, the flow values in cubic feet per second (cfs) can be copied and pasted into the spreadsheet.
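
The manual copy step could also be scripted.  The sketch below, in Python, pulls the three selected dates out of a tab-delimited listing.  The column names `date` and `flow_cfs`, the ISO date format, and the `#` comment prefix are assumptions made for illustration, and the sample values are made-up placeholders rather than USGS data.

```python
import csv

# The three days selected above, keyed by an assumed ISO date format.
TARGET_DATES = {
    "1995-09-09": "low",
    "1995-03-24": "average",
    "1994-03-26": "high",
}


def extract_flows(text, date_col="date", flow_col="flow_cfs"):
    """Map each flow condition to the discharge listed on its target date."""
    # Skip blank lines and comment lines before handing the rest to the csv reader.
    data_lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(data_lines, delimiter="\t")
    flows = {}
    for row in reader:
        condition = TARGET_DATES.get(row[date_col])
        if condition is not None:
            flows[condition] = float(row[flow_col])
    return flows


# Placeholder listing: the flow values are invented for the example.
sample = (
    "# placeholder file, values are made up\n"
    "date\tflow_cfs\n"
    "1994-03-26\t300.0\n"
    "1995-03-24\t200.0\n"
    "1995-03-25\t150.0\n"
    "1995-09-09\t100.0\n"
)
flows = extract_flows(sample)
```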



    The next step involved calculating the geographic coordinates for each station from the dd (degrees), mm (minutes), and ss (seconds) values obtained from USGS.  Using the following formula, the geographic coordinates for all stations were determined.  An example is given below for the Harrisburg station.  Once completed, the spreadsheet containing all of the downloaded information plus the new decimal degree values was saved as a dbf 4 file.  

    Decimal Degrees (DD) = Degrees + Minutes/60 + Seconds/3600, with a negative sign applied to west longitudes

    For Harrisburg, this translates into DD = -(76 + 53/60 + 11/3600) = -76.88639 degrees Longitude

                                                  DD = 40 + 15/60 + 17/3600 = 40.25472 degrees Latitude
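
The conversion can be sketched as a small Python function.  The function name and the hemisphere argument are assumptions for illustration; the negative sign for west longitude follows the convention of the Harrisburg result above.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds to signed decimal degrees."""
    magnitude = degrees + minutes / 60 + seconds / 3600
    # West longitudes and south latitudes are negative by convention.
    return -magnitude if hemisphere in ("W", "S") else magnitude


# Harrisburg station, using the values from the example above.
lon = dms_to_decimal(76, 53, 11, "W")
lat = dms_to_decimal(40, 15, 17, "N")
```

Rounded to five decimal places, `lon` and `lat` reproduce the Harrisburg values of -76.88639 and 40.25472.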

Returning to ArcView 3.2, the dbf 4 file, Susgages.dbf, was added to the project.  Using the commands View/Add Event Theme and then Theme/Convert to Shapefile in a new view, the gages were displayed as points.  Next, using the procedure described above for converting the state and county shapefiles into the Albers projection, the Susgages shapefile was projected into the view with subregion 5, the Susquehanna River, and the West Branch Susquehanna River. 


    View of USGS gaging sites in decimal degrees converted to a shapefile



View of subregion 5 with USGS gaging sites projected


    This type of map could be of great use to a water quality modeler.  By opening the Susgages.shp attribute table or by clicking on a gage with the information button depressed, statistics on flow conditions as well as geographic location and drainage area are easily obtained.  However, one crucial piece of information is still absent from these tables: the distance downstream to the mouth of the Susquehanna River.  Determining it, although tedious, was accomplished with a higher degree of accuracy than the string-and-map method allows.  This step required that another spreadsheet be created in Excel to organize the numbers generated in ArcView 3.2.  By making the Susriver.shp theme active and depressing the information key, each river reach was selected, starting with the reach just north of the entrance to the Chesapeake Bay.  From the Identify Results table, the length of the reach in meters and the Fnode # (for identification purposes) were transferred into the Excel spreadsheet.  


Identify Tool

Procedure for selecting river reaches with the Identify tool to determine reach length in meters.


This process continued for every reach along both branches of the river.  Every time a USGS gaging station was passed, all previous reaches south of the gage were summed to obtain the total distance from the gage to the mouth of the river at the Chesapeake Bay.  If a gage lay in the middle of a reach, the fraction of that reach lying downstream of the gage was estimated by inspection and included in the total.  The Excel charts generated from this process are shown below. 
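
The summation described above can be sketched in Python.  This is a minimal illustration, assuming the reach lengths are listed in order from the mouth upstream and that the fraction of the gaged reach lying below the gage has already been estimated; the function name and the sample lengths are hypothetical.

```python
METERS_PER_KM = 1000.0
METERS_PER_MILE = 1609.344


def distance_to_mouth(reach_lengths_m, gage_reach_index, fraction_below_gage):
    """Distance in meters from the river mouth up to a gage.

    reach_lengths_m: reach lengths ordered from the mouth upstream.
    gage_reach_index: index of the reach that contains the gage.
    fraction_below_gage: share of that reach between its downstream end
        and the gage, from 0.0 to 1.0.
    """
    if not 0.0 <= fraction_below_gage <= 1.0:
        raise ValueError("fraction_below_gage must be between 0 and 1")
    # Sum every whole reach below the gaged reach, then add the partial reach.
    whole_reaches = sum(reach_lengths_m[:gage_reach_index])
    return whole_reaches + fraction_below_gage * reach_lengths_m[gage_reach_index]


# Made-up reach lengths in meters, for illustration only.
reaches = [1000.0, 2000.0, 4000.0]
distance_m = distance_to_mouth(reaches, 2, 0.5)
distance_km = distance_m / METERS_PER_KM
distance_mi = distance_m / METERS_PER_MILE
```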



   Once all the downstream lengths for both rivers were determined, the Excel spreadsheet that was originally saved as a dbf 4 file was expanded to include the distance from the Chesapeake Bay in both km and mi for each gage.  This modified table was then transferred into ArcView 3.2 and joined to the original Attributes of Susgages.shp table using the Table/Join function.  Finally, to illustrate the usefulness of having flow data and distance downstream as part of the attribute table, graphs of flow (cfs) vs. distance downstream (mi) for both branches of the river were generated with the data.  These plots illustrate the importance of considering different flow conditions in a model.  The low flow and high flow conditions in the Susquehanna River as it enters the Chesapeake Bay can differ by 350,000 cfs, a considerable amount when the average flow condition should be roughly 35,000 cfs!
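
Conceptually, the table join attaches the distance fields to each gage record that shares the same station identifier.  The Python sketch below shows the idea with plain dictionaries; it is not ArcView's Table/Join implementation, and the field names, station identifiers, and values are placeholders invented for the example.

```python
def join_tables(base_rows, extra_rows, key):
    """Attach fields from extra_rows to base_rows wherever the key matches."""
    lookup = {row[key]: row for row in extra_rows}
    joined = []
    for row in base_rows:
        merged = dict(row)
        # Rows with no match keep only their own fields.
        merged.update(lookup.get(row[key], {}))
        joined.append(merged)
    return joined


# Placeholder records; identifiers and distances are made up.
gages = [
    {"station": "GAGE_A", "name": "first placeholder gage"},
    {"station": "GAGE_B", "name": "second placeholder gage"},
]
distances = [{"station": "GAGE_A", "dist_km": 10.0, "dist_mi": 6.2}]
joined = join_tables(gages, distances, "station")
```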




Completing the Flow Network using ArcInfo

    To anyone interested in the flow characteristics of a particular body of water, it is often useful to have a complete picture of what the system looks like.  A flow network provides this image and consists of two main features, junctions and edges.  Gaps in a flow network and misplaced gages can be confusing and should be fixed when possible.  Using the capabilities of ArcInfo, the flow network for the Susquehanna River Basin was completed through the following steps.  To begin, the network consisting of the Susquehanna River and the West Branch Susquehanna River was extended in ArcView 3.2 to include all streams and rivers containing a USGS gaging station.  The reasoning behind choosing only waterbodies that have gages somewhere along their length is simply that for most analytical work and mathematical models, only reaches that have measurable characteristics can be included.  Thus, it is most practical to take into account only those waterbodies for which data are easily obtainable.  Using the query builder, any river or stream that had a gaging station in the Susquehanna River Basin on the USGS website was selected with the query Pname = "name of waterbody".  The new shapefile created from this process is shown below.


    View with Susquehanna River Basin Network and 34 USGS gaging stations


    Switching to ArcInfo's ArcCatalog, a new Personal Geodatabase, SusquRiverBasin, was created with a feature dataset entitled SusquRiver.  Two shapefiles, Ultimatesusq.shp and Gages.shp, were added using the command Import/Shapefile to Geodatabase.  In addition, an outlet feature class, outlets_2, was created to account for the intersection of the Susquehanna River with the Chesapeake Bay.  Essentially, these outlets act as sinks in the network, that is, junctions where flow terminates or drains out of the network, thus giving the network a boundary condition.  From the SusquRiver feature dataset, a new geometric network was created using the geometric network wizard.  The characteristics of the flow network as prompted by the wizard are given below.  Once the network was created, two additional files appeared on the left hand side of the screen under the feature dataset SusquRiver to describe the new network.  They are the SusqRiver network file and the SusqRiver_Junctions file.

    ArcInfo's ArcMap allows the flow network to be edited and trace tests to be performed.  Upon opening the program, the geometric network created above, SusquRiver, was added and the editor toolbar was opened.  The SusquRiver network required editing because any outlet, or sink, must have an Ancillary Role value of 2 to indicate to the program that its role is a sink.  The default Ancillary Role value for the outlets was initially set to 0, indicating a role of None (the other option is 1, indicating a role of Source).  By choosing the Start Editing option in the Editor dropdown menu and setting the Edit Target box to Outlets_2, the Attribute Table for Outlets was opened and the Ancillary Role values were set equal to 2.


    Editor Toolbar with the target set to outlets_2



Zoomed in view of the bottom of the geometric network where the outlets, having an Ancillary Role edited to a value of 2, were added.


    Next, the Utility Network Analyst toolbar was added to the view for flow analysis.  The trace task performed for this analysis was Trace Upstream, which finds all network elements that lie upstream of a given point in the network.  This test is useful because it not only shows how water flows through the network, but also reveals a disconnection in the network by terminating the trace if flow direction cannot be continued.  Before beginning the test, two junction flags, indicated by the green squares in the figure below, were added at the outlets to mark the junctions from which the trace starts.  Also, a few general parameters needed to be set in the Analysis/Options menu.  On the General tab, the option "Directed trace task include edges with indeterminate and uninitialized flow" was applied, and on the Results tab, the Results Format was changed to Selection.  Finally, the button indicated in the toolbar below was depressed to set the flow direction and the blue trace button was selected to begin the test.


Utility Network Analysis Toolbar



Results of first Trace Upstream test for the Susquehanna River Basin
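
The idea behind an upstream trace can be illustrated with a short breadth-first search in Python.  This is only a conceptual sketch, not ArcMap's implementation: the network is assumed to be a list of (upstream junction, downstream junction) pairs, and the junction names are hypothetical.  Any junction the trace never reaches points to a gap in the network.

```python
from collections import defaultdict, deque


def trace_upstream(edges, start):
    """Return every junction from which water can reach `start`.

    edges: iterable of (upstream_junction, downstream_junction) pairs.
    """
    upstream_of = defaultdict(list)
    for up, down in edges:
        upstream_of[down].append(up)

    reached = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for up in upstream_of[node]:
            if up not in reached:
                reached.add(up)
                queue.append(up)
    return reached


# Hypothetical network: the edge that should link E to the rest is missing.
edges = [("A", "B"), ("C", "B"), ("B", "outlet"), ("D", "E")]
reached = trace_upstream(edges, "outlet")
all_junctions = {node for edge in edges for node in edge}
disconnected = all_junctions - reached
```

Here `disconnected` contains D and E, which mirrors how an incomplete trace exposes a missing edge.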


    This first result of the Trace Upstream test indicated that there was a missing edge between two junctions in the flow network, thus stopping flow between them.  As shown below in the zoomed-in view of a network break, these missing edges required correction.  Using the create new feature button on the editor toolbar (it looks like a pencil, as shown above), an edge was added between the two disconnected junctions to complete the flow.  This process of running the Trace Upstream test and connecting junctions with a new edge was repeated until the entire Susquehanna River network was complete.  Finally, as illustrated in the second figure below, any gages that did not lie on the outer edge of the flow network (in the case of the lower Susquehanna River below the convergence of the Main Branch with the West Branch) or on the single flow path (in the case of the northern portions of both rivers) were snapped to the edges of the network to provide a more accurate depiction of the Susquehanna River Basin.  The final result of the Trace Upstream test, showing the completed network, is given in the third figure below.


Using the editor toolbar to draw in an edge between two disconnected nodes



Snapping the stream gage stations to the outside edge of the lower Susquehanna River



Completed Susquehanna River Basin stream network
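
Snapping a gage to the network, as described above, means moving the point to the closest location on a nearby edge.  The Python sketch below shows the underlying geometry for straight segments in planar (projected) coordinates.  It is an illustration under those assumptions rather than the ArcMap snapping tool, and the function names and coordinates are hypothetical.

```python
import math


def snap_to_segment(point, seg_start, seg_end):
    """Return the location on a straight segment closest to the point."""
    px, py = point
    ax, ay = seg_start
    bx, by = seg_end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return (ax, ay)  # degenerate segment: both ends coincide
    # Project the point onto the segment's line, then clamp to the segment ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_network(point, segments):
    """Snap the point to whichever segment gives the shortest move."""
    candidates = [snap_to_segment(point, a, b) for a, b in segments]
    return min(candidates, key=lambda c: math.hypot(c[0] - point[0], c[1] - point[1]))


# Hypothetical edges and gage location in projected units.
network = [((0.0, 0.0), (2.0, 0.0)), ((0.0, 4.0), (2.0, 4.0))]
snapped = snap_to_network((1.0, 3.0), network)
```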



Comments on Results and Future Work

    Through the use of GIS, gathering data for a water quality study could easily be enhanced by expanding on the ideas presented in this project.  For example, additional measurements, such as pollution concentrations, alkalinity, and turbidity, already gathered by the USGS could easily be incorporated into ArcView in a variety of ways.  Beyond simply entering the desired information into an attribute table, perhaps new themes could be created to illustrate how concentration varies with location in the watershed.  Coincidentally, this idea was an original goal for this project, but had to be modified due to incomplete and choppy data within the time period of interest, 1994 and 1995.  

    Although the goals of this project were successfully completed, the tediousness of determining distances along a network using the identify tool in ArcView 3.2 suggests that this process must be improved if any database with gage information is to be effectively utilized by a water quality modeler.  This idea is currently being investigated by another graduate student, Kristina Schneider, who discusses a method called linear referencing using the National Hydrography Dataset in ArcInfo 8.1.  Using linear referencing, once an address is assigned to each point of interest in a network, the distance between those points can be determined.  For more information on this topic, please see Ms. Schneider's term project report.


Websites of Interest

1.  National Water-Quality Assessment (NAWQA) Program:

2.  The Susquehanna River Basin Commission:

3.  The United States Geological Survey (USGS):

4.  The water division of the United States Geological Survey (USGS):

