Arc Hydro Timeseries and GenScn
Nathan Johnson
CRWR
The Arc Hydro framework is being used to store timeseries data related to hydrology for several large projects in the US. The South Florida Water Management District (SFWMD) is using the database structure to store over 2 million records of rainfall and flow timeseries and serve them internally to interested parties. The Bexar Regional Watershed Management Coalition (BRWMC) has chosen the Arc Hydro framework as the database structure to store their spatial and timeseries data for the Regional Watershed Modeling System (RWMS) for the San Antonio area.
GenScn (GENeration and analysis of model simulation SCeNarios) is a computer program originally developed by Aqua Terra Consultants for the USGS and presently distributed with the EPA's non-proprietary BASINS software. GenScn is designed to assist with analyzing and managing the high volumes of input and output timeseries from complex hydrologic and water quality models. It has been used primarily as a postprocessor for the Hydrological Simulation Program - FORTRAN (HSPF), but it was designed to be extensible and could potentially be used to analyze and manage timeseries from other models.
The Hydrological Simulation Program - FORTRAN (HSPF) is a set of computer codes that can simulate the hydrologic, and associated water quality, processes on pervious and impervious land surfaces and in streams and well-mixed impoundments. The model is continuous and is driven by timeseries for precipitation, evaporation, and any other forcing data required for the water quality subroutines. The model primarily uses timeseries in the .wdm (Watershed Data Management) format, a binary, direct-access file with timeseries organized into discrete data sets. The .wdm file format consists of an array of dates, an array of values, and a set of properties of the timeseries including information about the spatial location and type of data. This type of timeseries structure is common for driving models both because of computational simplicity and minimal disk space requirements.
The code behind GenScn is public domain, and work at CRWR has focused on developing tools to allow Arc Hydro timeseries to be available to HSPF and the GenScn program.
The GenScn program is distributed with the BASINS program at the EPA's website: http://www.epa.gov/OST/BASINS/. The BASINS suite of programs consists of several components, one of which requires the ESRI software ArcView 3.x. (The newer Arc 8 and 9 software is not yet supported; see http://hspf.com/pdf/BASINS4.pdf.) Other components include HSPF, PLOAD, and the GenScn/WinHSPF/WDMUtil programs. A version of GenScn is available outside of the BASINS package at: http://hspf.com/pub/GenScn/. Installation instructions can be found at: http://www.epa.gov/waterscience/basins/bsn3faqs.htm#win. Instructions for use of the program are distributed with the software.
GenScn organizes timeseries based on three key attributes: Location, Scenario, and Constituent. The following figure shows the GenScn window and an example of timeseries selected using these three attributes.

GenScn has been set up to read the output from an HSPF model for Leon Creek near San Antonio. In the above picture, timeseries have been selected for 4 rivers in the watershed (in the 'Locations' frame), for the 'leon' scenario (in the 'Scenarios' frame), for the constituent 'RO' (in the 'Constituents' frame, RO is river outflow in cfs). These 4 timeseries are shown in the 'Time Series' frame. Timeseries listed in this frame can be selected and plotted or analyzed with the tools at the bottom in the 'Analysis' frame. Available tools include 'Graph', 'List Values', 'Perform Duration Analysis', 'Compare', 'Generate', 'Animate', and 'Profile Plot'.

In addition to managing data from existing models, GenScn contains the capabilities to create new scenarios. With this functionality, the hundreds of timeseries created during the many calibration runs of an HSPF model can be managed and analyzed efficiently.

Arc Hydro Timeseries Structure
The Arc Hydro timeseries structure differs from the traditional timeseries formats used to drive hydrologic models. In the Arc Hydro format, a single table contains all the timeseries records in the database. Spatial information is present on each timeseries record in the form of a 'FeatureID' corresponding to a spatial feature in the geodatabase. In addition to spatial data, each record also carries metadata in the form of a 'TSTypeID' corresponding to a record in the TSType table.

Though not the most efficient way to store timeseries in terms of disk space (FeatureID and TSTypeID are repeated for every record of a timeseries), this format allows queries on the single 'Timeseries' table through time, space, or variable. In a sense, the Arc Hydro timeseries structure creates a three-dimensional space consisting of time, space, and variables, allowing queries through any of these domains. The following figures illustrate this concept.

a) Single Timeseries Value
b) Single Timeseries Value with Arc Hydro Attributes
c) Single point in space, all time/type
d) Single type, all space/time
e) Single timeseries, one location/type
f) Single DateTime, all space/type
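The slices in panels c) through f) can be sketched as queries against a single table. The following is only a minimal illustration, not part of Arc Hydro or GenScn: an in-memory SQLite table stands in for the geodatabase, the column names TSDateTime and TSValue are assumptions for this sketch, and the sample values are made up.

```python
import sqlite3

# In-memory SQLite table standing in for the Arc Hydro 'Timeseries' table.
# FeatureID and TSTypeID come from the text; TSDateTime and TSValue are
# assumed column names, and the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Timeseries ("
    "FeatureID INTEGER, TSTypeID INTEGER, TSDateTime TEXT, TSValue REAL)"
)
rows = [
    (1001, 1, "2004-01-01", 0.0),
    (1001, 1, "2004-01-02", 12.5),
    (1001, 2, "2004-01-01", 3.1),
    (1001, 2, "2004-01-02", 4.7),
    (1002, 1, "2004-01-01", 0.2),
    (1002, 2, "2004-01-02", 9.9),
]
conn.executemany("INSERT INTO Timeseries VALUES (?, ?, ?, ?)", rows)


def select(where, params):
    """Return the records matching a WHERE clause, in a stable order."""
    sql = (
        "SELECT FeatureID, TSTypeID, TSDateTime, TSValue FROM Timeseries "
        "WHERE " + where + " ORDER BY FeatureID, TSTypeID, TSDateTime"
    )
    return conn.execute(sql, params).fetchall()


one_location = select("FeatureID = ?", (1001,))                   # panel c
one_type = select("TSTypeID = ?", (1,))                           # panel d
one_series = select("FeatureID = ? AND TSTypeID = ?", (1001, 2))  # panel e
one_instant = select("TSDateTime = ?", ("2004-01-01",))           # panel f
```

Because every record carries its own FeatureID and TSTypeID, each slice is a plain filter on one table and no per-timeseries file lookup is needed.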
GenScn was developed using Visual Basic, and extensive libraries are available from Aqua Terra for storing and performing calculations on timeseries. All timeseries contain certain common components such as date-time, values, location, units, etc. The structure that GenScn uses to store these components is summarized in the following figure:

The above figure is not exhaustive, but gives the basic idea of how timeseries data is stored in the GenScn Timeseries Structure. TserFile is an object that contains a collection of timeseries, and TserData is a single timeseries object in that collection. Essential attributes for each timeseries are stored in a DataHeader object. Scenario, Location, and Constituent define the physical properties of the timeseries, and DSN (data set number) defines the timeseries' location in a .wdm file. Additional attributes are stored as a collection of Attributes, and Values are stored in an array. The TserDates object contains information about the interval, timestep, regularity, and other time-related information in its DateSummary object, and also an array of DateValues (corresponding to the TserData's Values array).
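The relationships described above can be sketched with a few Python dataclasses. This is only an illustration of the containment hierarchy as described in the text, not the Aqua Terra Visual Basic library: the class names follow the text, but the field names, types, and the find helper are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class DataHeader:
    """Essential attributes of one timeseries."""
    scenario: str
    location: str
    constituent: str
    dsn: int  # data set number: position of the series in a .wdm file


@dataclass
class DateSummary:
    """Time-related properties (field names are assumed for this sketch)."""
    time_units: str
    timestep: int
    regular: bool


@dataclass
class TserDates:
    summary: DateSummary
    date_values: List[datetime] = field(default_factory=list)


@dataclass
class TserData:
    """A single timeseries: header, dates, values, and extra attributes."""
    header: DataHeader
    dates: TserDates
    values: List[float] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class TserFile:
    """A collection of timeseries."""
    series: List[TserData] = field(default_factory=list)

    def find(self, scenario: str, location: str, constituent: str) -> List[TserData]:
        # Hypothetical helper: select series by the three key attributes.
        return [
            t for t in self.series
            if (t.header.scenario, t.header.location, t.header.constituent)
            == (scenario, location, constituent)
        ]


tser_file = TserFile()
tser_file.series.append(
    TserData(
        header=DataHeader("leon", "1001", "RO", dsn=1),
        dates=TserDates(
            DateSummary("day", 1, True),
            [datetime(2004, 1, 1), datetime(2004, 1, 2)],
        ),
        values=[3.1, 4.7],
        attributes={"Units": "cfs"},
    )
)
matches = tser_file.find("leon", "1001", "RO")
```

Note that the dates and values are parallel arrays: entry i of date_values belongs to entry i of values, which is the layout the text describes for model-driving formats.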
Currently, a class that implements TserFile has been developed for the Arc Hydro timeseries format. This means that tools can now read from the Arc Hydro timeseries format into the GenScn timeseries structure, or TserFile structure, shown above. Though the tools are still under development, several functions already exist. If the latest GenScn/WinHSPF tools have been installed on a computer, the executable located at: Preliminary ArcHydro2wdm Tool should run on the machine. A sample database with Arc Hydro timeseries is available here. This program has mostly been used for developing the tool, and its interface is minimal. Before explaining the tool's functionality, the method of transferring attributes from the Arc Hydro timeseries structure to the GenScn timeseries structure will be discussed.
The figure showing the Arc Hydro timeseries structure demonstrates how attributes are stored in the geodatabase. FeatureID and TSTypeID are stored on each record of a timeseries, and other metadata are stored in a corresponding record of the TSType table. A query for a unique combination of TSTypeID and FeatureID yields a single timeseries from the table, for instance: "Select from the Timeseries table all records that have FeatureID = 1001 and TSTypeID = 2." This statement returns all records with the same type (TSTypeID) at the same location (FeatureID). The records returned from the Timeseries table for this query comprise the list of DateTimes and Values for the timeseries. The corresponding record from the TSType table contains all the other metadata for the timeseries. This is a simplified explanation of the method used for extracting unique timeseries from an Arc Hydro geodatabase. The results of this type of query can then be used to populate the GenScn timeseries structure shown above.
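The extraction step just described might look roughly like the sketch below. It is not the tool's actual code: SQLite stands in for the geodatabase, the column names TSDateTime, TSValue, Variable, and Units are assumptions, the sample rows are invented, and read_series is a hypothetical helper that returns a plain dictionary rather than a GenScn object.

```python
import sqlite3

# SQLite stand-in for the two timeseries-related tables. Only FeatureID,
# TSTypeID, and the table names come from the text; other names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TSType (TSTypeID INTEGER PRIMARY KEY, Variable TEXT, Units TEXT);
CREATE TABLE Timeseries (
    FeatureID INTEGER, TSTypeID INTEGER, TSDateTime TEXT, TSValue REAL);
INSERT INTO TSType VALUES (1, 'Rainfall', 'in');
INSERT INTO TSType VALUES (2, 'Flow', 'cfs');
INSERT INTO Timeseries VALUES (1001, 2, '2004-01-02', 4.7);
INSERT INTO Timeseries VALUES (1001, 2, '2004-01-01', 3.1);
INSERT INTO Timeseries VALUES (1001, 1, '2004-01-01', 0.0);
INSERT INTO Timeseries VALUES (1002, 2, '2004-01-01', 9.9);
""")


def read_series(conn, feature_id, ts_type_id):
    """Collect one timeseries: dates and values plus its TSType metadata."""
    records = conn.execute(
        "SELECT TSDateTime, TSValue FROM Timeseries "
        "WHERE FeatureID = ? AND TSTypeID = ? ORDER BY TSDateTime",
        (feature_id, ts_type_id),
    ).fetchall()
    variable, units = conn.execute(
        "SELECT Variable, Units FROM TSType WHERE TSTypeID = ?",
        (ts_type_id,),
    ).fetchone()
    return {
        "location": str(feature_id),  # FeatureID is reused as the Location
        "variable": variable,
        "units": units,
        "dates": [r[0] for r in records],
        "values": [r[1] for r in records],
    }


series = read_series(conn, 1001, 2)
```

The per-record table is thereby collapsed into the array-of-dates, array-of-values layout, with the metadata looked up once per timeseries instead of once per record.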
The following table presents a summary of where the attributes from Arc Hydro timeseries structure are stored in the GenScn structure:

Several issues were dealt with in transferring attributes from one representation to the other. In the Arc Hydro database structure, the FeatureID is an integer corresponding to a unique ID in the database. This number is used as the Location attribute in the GenScn structure but may not have any meaning outside of the database. Each structure has its own way of representing the interval of the timeseries, and care was taken to preserve the correct timestep and time units for the data. Data types, such as incremental, average, or instantaneous, do not have perfect parallels between the two formats, so it is possible that some information will be lost in the transfer. The current version of the Arc Hydro structure has only two possibilities for the Origin of the timeseries: 'Recorded' and 'Generated'. This somewhat limits the structure's ability to store information about what type of 'generation' was used to derive the timeseries, and future versions of the database will allow for other "Origins."
The "Read From Geodatabase-Write to .wdm" functionality of the program is fairly well developed; the "Read From .wdm-Write to Geodatabase" functionality is still under development. The following is a brief description of the aforementioned Arc Hydro to wdm Tool.
There are no guarantees on any of these prototype tools. Please back up your data before trying anything.
The latest version of GenScn (v2.3) must be installed for these tools to work correctly.

Read Timeseries from Arc Hydro Geodatabase: reads data from the database stored at the location entered in the text box to its right, and populates the listbox below it with the available timeseries types in the database.
Available Types of Data: Choose one: when a TSType is selected, the FeatureIDs that have this type of timeseries are filled in the box to its right.
Available Features with specified type: each record in this listbox represents a single timeseries from the database. Records may be selected and added to the "Timseries to be written to specified wdm file" listbox.
Add to wdm List: pushing this button will add the timeseries selected above to the "Timseries to be written to specified wdm file" listbox. Multiple types may be added by selecting a different type from the "Available Types of Data" listbox and adding additional timeseries.
Clear wdm List: pushing this button will clear the "Timseries to be written to specified wdm file" listbox.
Write to .wdm file: pushing this button will write the timeseries at the left (no selection is required) to the wdm file in the text box directly below.
New wdm file Location: this text box specifies where a new, blank wdm file is kept. It is recommended to always keep this file, or a copy of it, as a template for new wdm files. This file will be copied to the "Write to .wdm File" location and then populated with the selected timeseries.
Number of Records written to wdm file: this list box will contain the number of records for each timeseries written to the wdm file. The DSN assigned in the wdm file will begin at 1 and be ordered by sorting the unique timeseries in the database by TSTypeID and then FeatureID.
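The DSN numbering rule can be illustrated with a short sketch. This is an assumed reading of the rule stated above (unique timeseries sorted by TSTypeID, then FeatureID, numbered from 1), not the tool's actual code; assign_dsns is a hypothetical helper, and whether the tool numbers all timeseries in the database or only the selected ones is not stated in the text.

```python
def assign_dsns(series_keys):
    """Map each unique (TSTypeID, FeatureID) pair to a DSN starting at 1.

    Tuples sort by their first element and then their second, which gives
    the TSTypeID-then-FeatureID ordering described in the text.
    """
    unique_sorted = sorted(set(series_keys))
    return {key: dsn for dsn, key in enumerate(unique_sorted, start=1)}


# Hypothetical (TSTypeID, FeatureID) pairs, including one duplicate.
keys = [(2, 1002), (1, 1001), (2, 1001), (1, 1002), (2, 1001)]
dsns = assign_dsns(keys)
```

With these sample pairs, both series of TSTypeID 1 receive DSNs 1 and 2, and the two series of TSTypeID 2 follow as DSNs 3 and 4.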
The lower part of the form is less developed and only contains basic buttons to read a wdm file and write the contents to a GDB in Arc Hydro format.
Read Wdm For Arc Hydro: pushing this button will read the wdm file specified at the right and list the attributes in the first list box further right.
Write to Arc Hydro Database: pushing this button will write the timeseries just read (with the button above) to the GDB location specified at the right and populate the second list box at the far right.
It should be noted that this "Write to geodatabase" functionality is still being developed, and it may take a very long time to write more than a few long timeseries to the GDB. Additionally, the "Read Wdm For Arc Hydro" button seems to have trouble getting the units and timestep for the first timeseries read. The geodatabase to be written to must contain, at a minimum, the two timeseries-related tables following the Arc Hydro timeseries structure. Timeseries and TSType records will be added to any already present in the GDB. An attempt to write any wdm file that has a non-numeric Location attribute to the GDB will result in a 0 being written in the FeatureID field, which makes the timeseries indistinguishable from one another by location. To create unique timeseries in the GDB, first edit the wdm datasets to have a numeric (integer) location attribute.
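A quick check of the Location attributes before writing can catch this problem early. The sketch below only mimics the behavior described above (a non-numeric Location ends up as FeatureID 0); both functions are hypothetical helpers, not part of the tool, and the sample Location strings are invented.

```python
import re

# Matches an optionally signed run of ASCII digits, ignoring nothing else.
_INTEGER = re.compile(r"-?[0-9]+")


def location_to_feature_id(location):
    """Return the Location as an integer FeatureID, or 0 if it is not numeric."""
    text = str(location).strip()
    return int(text) if _INTEGER.fullmatch(text) else 0


def find_non_numeric_locations(locations):
    """List the Location values that would collapse to FeatureID 0."""
    return [
        loc for loc in locations
        if not _INTEGER.fullmatch(str(loc).strip())
    ]


# Invented Location attributes as they might appear on wdm datasets.
locations = ["1001", "RCH12", " 1002 ", "Leon Cr"]
needs_editing = find_non_numeric_locations(locations)
```

Any dataset listed in needs_editing would need its Location changed to an integer in the wdm file before the timeseries can be stored uniquely in the GDB.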
The development of the timeseries tools described above has been motivated mainly by a desire to use Arc Hydro timeseries to drive HSPF models. Nathan Johnson's master's thesis (to be finished in the spring of '05) will contain many tools to preprocess spatial data for HSPF in ArcGIS 9, and additional (more rigorous) tools for transferring data between the wdm and Arc Hydro formats. When it is finished, it will be available on the web at: http://www.crwr.utexas.edu/online.shtml
This page was created by:
Nathan Johnson
Center for Research in Water Resources
The University of Texas at Austin
njohnson@mail.utexas.edu
https://webspace.utexas.edu/nwj58/index.html
These materials may be used for study, research, and education, but please credit the authors and the Center for Research in Water Resources, The University of Texas at Austin. All commercial rights reserved. Copyright 2005 Center for Research in Water Resources.