9
Database SetupOverviewInstructions for Deriving Flow Direction Grids Instructions for Setting Up and Checking Regression Database Delivered by MTU Overview The database required to support threshold runoff calculations nation wide is complex. The two most significant components of the database that were not readily available at the start of this project (January, 1999) were projected flow direction grids for each RFC, and a digital database of terrain and climatic characteristics required to support USGS flood frequency calculations. Flow direction grids were derived using DEM and RF1 data provided by NOHRSC, and data layers required for USGS flood frequency calculations were compiled at Michigan Technological University. A significant amount of work is required to (1) derive usable and accurate flow direction grids, and (2) check the database delivered by Michigan Tech against USGS Water Resources Investigations Report 94-4002 and make modifications necessary to conform with the specifications required by the Avenue programs developed at HRL. The threshold runoff Avenue programs are designed to be fairly general. This requires a database which conforms to strict specifications that must be carefully checked. Some of the tasks are tedious but definitely worthwhile. A summary of the flow direction pre-processing and some of the reasoning behind the revisions to earlier plans is summarized here, followed by more detailed instructions on how to execute this pre-processing procedure. Summary of pre-processing steps for importing, projecting, and stream carving: 1. Convert NOHRSC river line files (modified RF1), elevation raster,
and slope raster files into Arc/Info format.
A variant of processing steps 1-12 were initially run at Michigan Tech University for all RFCs. As time has passed, several reasons to refine this procedure have been discovered; therefore, re-running some of these steps is required. It should not be a huge burden on resources to repeat these processing steps because they are automated and the 15 arc-second grids are relatively small in the DEM world. The computational time for processing steps 1-12 is on the order of 12 hours (depends on many factors), but the most computationally expensive processes can be run overnight. There are several reasons for refining this procedure that revealed themselves by working with and checking the data at HRL. - In the original procedure (Spring 1999), the buffer length selected to clip grids was 30 cell widths outside the RFC boundary or 12000 m. This buffer is not long enough at the edge of one portion of ABRFC because it does not allow the processing routines to account for the effects of the closest RF1 stream outside of the RFC basin. The presence or absence of this particular river segment just outside ABRFC influences the assignment of flow directions on a relatively flat area near the RFC basin divide (this may be an artifact of poor DEM or the filling process). To avoid this problem, a larger buffer of 50000 m is selected for re-processing. - Some of the NOHRSC DEMs use more than one code to denote data outside the study area. The automatic routines used in the first round of processing assumed that the code 32768 was the only "NODATA" code used. The code 65436 was also used in NODATA in MARFC. This means restarting from step 2 is required for some RFCs. - It is worthwhile to refine the stream carving method so that instead of setting stream cells to 0 as in the original method, land surface cells are raised relative to river cells. The current differential used to raise the land surface cells is 100 m. Setting river cells to 0 may cause more serious problems than using a constant elevation differential in situations where two headwater arcs that drain to different basins originate in neighboring DEM cells. Also, some land cells in coastal areas are rounded down to zero elevation, thus it became difficult to distinguish these cells from river cells using the old method. All ocean (or large lake) cells have the value NODATA. - For RFCs that border on large bodies of water (i.e. oceans or Great Lakes) it may be necessary to edit (lengthen) the river arcs along the coast. This is a manual procedure (explained below), but it goes pretty quickly. Latest Procedures (as of Dec. 2, 1999)
1. Extract the data for an RFC from the CD-ROM delivered by Michigan Tech. Keep data for different RFCs in separate directories. 2. Run prepro1.aml. To run, follow the instructions in the header of the prepro1.aml file. This AML may take many hours to complete since it includes use of the "fill" Grid function. The buffering step can also be expensive for coastal areas. There is one interactive step that is required before the filling occurs. A Grid graphics window will display a mask of the watershed and the user is required to select the smallest box that contains the entire area. After this step, it may be desirable to let this AML to run over night (especially for large RFCs). Prepro1.aml uses the project command and requires the projection file geoalb83.prj to exist in the same directory. 3. If the RFC being processed borders large bodies of water (i.e. oceans or Great Lakes) it will likely be necessary to edit the river arcs along the coast. This is a manual procedure but it goes pretty quickly. The steps to do this are as follows: Load the the elevpmf Grid, projected RF1 Theme (rf1p), and the projected RFC boundary Theme (rfcbndp) into an ArcView View. These Themes were created by prepro1.aml. Load the ThreshR extension. All of the "rf1p" arcs point upstream rather than downstream. It is useful to flip the direction of the arcs for for step 6 below. To do this, make "rf1p" the active Theme and run ThreshR-Utility --> Flip Arcs. This program will ask you to save the flipped arcs as a new shapefile. Name the new file "rf1pf.shp." Attribute information is transferred to rf1pf.shp with the exception of some standard Arc/Info coverage attributes that are no longer needed in ArcView. For this attribute transfer to work, the seventh field in the Arc/Info "rf1p" file should be the Arc/Info User-ID field. This was the case for the ABRFC data. If this is not the case, then the ThreshR-Utility --> Flip Arcs script needs to be changed. Note: I don't know with 100% certainty that all arcs will be pointing upstream for all RFCs, so please check this before running this program. An easy way to check this is to changed the Legend Symbol used to a line with arrows rather than the default single line. If you zoom in, you will see more arrows. 4. For coastal areas, extend the length of rf1pf lines when necessary: This is done most easily doing using ArcView. Modify the legend for the elevation grid (elevpmf) so that zero values have a distinctive color from NODATA cells and a distinctive color for all cells greater than zero. Make NODATA cells transparent. The goal is to lengthen any rf1pf arcs that don't quite make it to the ocean. This is important for stream carving, especially if a portion of the landscape is assigned zero elevation. Add the RFC boundary Theme available from the NOHRSC web site because this provides a clearer definition of the coastline than the DEM and provides some guidance regarding the appropriate path along which a river arc should be extended. Sometimes it is also helpful to have an atlas handy to orient yourself. Pan and Zoom along the coast to make the required edits. It is much easier to add new arcs to rf1mod than it is to modify existing arcs. See ArcView documentaion on how to edit an arc Shapefile. 5. Run prepro2.aml. To run, follow the instructions in the header of the prepro2.aml file. This aml calls a C program called flatareas5.c (must be compiled). 6. While prepro2.aml is running, ArcView scripts can be run to identify potential problem points in the derived flow direction grid. Potential problem points occur when two separate stream arcs come within one 400 m grid cell of one another at a point that is far from a junction. This happens quite frequently when using the RF1 data in conjunction with a 15 arc-second DEM. The steps to do this are as follows: - Load the most recent ThreshR Extension file into ArcView
The goal here is to eliminate reaches that are composed of multiple arcs. To do this, a new attribute is added to rf1pf.shp that contains a unique value for each reach. (It turns out that this program works for reaches with two arcs but may not work with reaches for 3 arcs so the run may need to be repeated.) This attribute is used later with the Avenue "Summarize" request to merge all multi-arc shape files. The ID Multi-arc Reaches program adds an attribute to the rf1pf.shp called "merge." There is a record in rf1pf.shp for each arc. For any river reaches that are composed of multiple arcs, the merge attribute for each arc in that reach is assigned an common integer value with other arcs that make up that reach. For single arc reaches, the merge attribute is 0. (Took 51 minutes for MARFC) The program prompts the user to identify the river theme and a copy of the river theme. Note that the river theme is modified by this program. The original flipped coverage should have fewer records in its attribute table than the copy when the program is completed (unless all stream segments were originally only single arcs). - Add a new integer field to the rf1pf.shp file called merge1, select all zero records in "merge", and use the Calculate button to assign arcinf_id to selected records, toggle the selection and assign "merge" plus the maximum of arcinf_id to the selected records. Finally, run the program ThreshR--> Merge Multi-arc Reaches to merge lines with a common "merge1" id value. - Run ThreshR-Utility/Identify Potential Problems (Time estimate for ABRFC: 14 min ) This program asks the user to:
The program creates a grid with ones in the stream and zeros on the
land surface (I.e. "rivone1" this grid is useful for display purposes in
Step 7) and a point shapefile (i.e. "probzo1.shp") which identifies the
locations of potential problems. It is not known for sure if there
are any problems at these points until step 7.
7. Manually check and edit the flow direction grid derived by prepro2.aml. This is the point at which manual checking and editing of the flow direction
grid is required. There are "trick" methods for editing grids in ArcView,
but I have found these methods relatively slow and tedious. I've also found
it difficult to do editing quickly when
Before using editgrid.aml, the shapefile probzo1.shp must be converted to an Arc/Info point coverage and a grid which is the base2 logarithm (This is needed to use the griddirection command -- see Arc/Info Help for more explanation) of the flow direction grid must be created. Arc: shapearc probzo1 probzo1
The header of editgrid.aml must be changed to be consistent with the input file names being used. The procedure for editing grid values may appear tedious at first, but it becomes fairly easy once you get the hang of it. When the editgrid.aml program is started, the user is prompted to type a record number to start with. This means a record number in the problem points coverage. This option is helpful if you want to check a portion of the points, take a break and come back to check the rest. Before running editgrid.aml, type the command
For each potential problem point, gridded, streams are displayed in white, flow direction arrows are displayed in red and digitized (RF1) streams are displayed in blue. The potential problem points are labelled and numbered with green. The display automaticall defaults to an extent that is 10 cells on each side of the problem point in question. Since this might turn out to be too close to understand the overall flow picture, the user is first prompted with the question: "Resize window? (9 moves to edit options):" The user may type a number, representing the number of cells to define on each side of the cell extent. (20 is interesting, 50 is getting a bit big) Typing 9 moves the user to the next option: "Enter edit value (9 when finished):" The edit values to consider are (0 = E, 1=SE, 2=S, 3=SW, 4=W, 5=NW, 6=N, 7=NE). If you quit the program normally, your changes will be saved and the arrows you see on the screen will reflect your changes when you rerun the program. If you exit out of the AML due to a mis-typed key stroke and a subsequent error message, this is no big deal. Just restart the AML and start with the last record you had previously worked on. You should be able to scroll up in your terminal window to find the last record in the problem points shapefile that you looked at before exiting. If you exit out of AML due to an error, you may get the following message
when attempting to restart the AML.
Grid: cursor out_cur remove Sometimes a cluster of problem points will exist in the same area. Flow direction corrections only need to be made once for these areas. Part of the reason that this program is so inflexible is that AML is very quirky and it sometimes takes a while to debug. Avenue is far superior. (Good example of the types of problems encountered -- MARFC, point 40
Now that a good flow direction grid has been created, convert fdb2i back into the 1, 2, 4, 8, 16, . . . . codes that Arc/Info can understand. Grid: fdr = exp2 ( fdb2i )
Setting Up and Checking Regression Database Delivered by MTU To enable ThreshR Scripts work in all RFCs and make different types of computations depending on the location of interest, strict ThreshR database specifications must be met. Michigan Tech has collected and supplied nearly all of the required data sources; however, there are numerous checks and modifications that must be made to this database in order for the ThreshR software to run correctly. This is a setup cost that is necessary due to the complexity of the database, and the fact that the database specifications have evolved over time. There may have been some way to reduce this with more hands on interaction with the contractor. Up until now, I have not had time to document some of the more tedious issues that I have identified. The checks and modifications described here must be made for each RFC. Tasks required to complete ThreshR setup: - Compute flow accumulation from the edited flow direction grid (output
file should be named fa).
- Create a point file of HRAP center points called (hrappts.shp). This file should cover the entire RFC. The extension coord.avx can be used to create this file. With a View active, click on HRAP --> Create HRAP Center points with a View active and choose to create a new point coverage in geographic coordinates based on HRAP extent. Specifying HRAP extent requires the user to specify the lower left HRAPX and HRAPY and the number of columns and rows. To determine the appropriate geographic extent for an RFC, the program xmrgtoasc.c can be run on an existing gridded threshold runoff file. The location of the gridded runoff files for ABRFC on the HRL-NHDRs is /nonawips/usr/apps/ffg/files/abrfc/grro. More information on coord.avx and xmrgtoasc.c. - Rename "elevpm" (created by prepro1.aml) to "dem" using the Arc RENAME command. - Make sure the slope grid is available (not delivered with the ThreshR database in Dec. 1999). If not, derive a slope Grid using the ArcView Spatial Analyst function Surface--> Derive slope. Make sure the Analysis--> Properties extent and grid size are set to same as "dem" before doing this. The Derive Slope function returns slope in degrees. To get the slope in percent, use Analysis-->Map Calculator to take the Tangent of the slope grid. The ArcView "Tan" function assumes input units are radians, so degrees must be converted to radians before applying the Tan function. The name required for the slope grid by the ThreshR software is "slopeff". Determine the file source name for the output of the tangent Map Calculation by choosing Theme--> Properties. Copy the source Grid to "slopeff" by using the File--> Manage Data Sources. - (Keep this for now.) Make sure that all source file names conform to an 8.3 naming conventions. Although neither UNIX systems or modern PC operating systems (e.g. Windows 98) are limited to this filename length, it is desirable to store data on a CD-ROM media which can be read from multiple platforms. In my experience extrac characters in file names that don't conform to the 8.3. naming convention standard become garbled on our HP UNIX systems. Thus, to make the data files more universally usable, all data files used by ThreshR should conform to the DOS 8.3 naming convention. - The statekey.shp and regions.shp files were not delivered with
all of the correct fields. Look at the files from ABRFC or MBRFC
as models to add the correct fields to the attribute tables for these Shapefiles.
When the Region shapes and State shapes for different regions were combined,
some polygon slivers were created at the intersection of shape boundaries.
To clean up the database, these slivers should be merged with the approprate
larger neighbors. Take a look at the AML mergereg.aml under the CBRFC
directory to see how this can be easily done.
- Revise the parcode.txt and regequat.txt using the files from ABRFC and MBRFC as models. These files need to be converted to dBase format. The reason for this change is that dBase files can be edited more easily within ArcView. The original choice of using .txt files was because dBase files have limitations on field names; however, this can be alleviated in the future by using aliases within ArcView. Field names in the original parcode.txt and regequat.txt table designs may need to be shortened. It is probably easiest to make the field name changes first using a text editor (e.g. Nedit) and then convert the files to dBase format so that they can be edited directly within ArcView. In the new version of ThreshR, these files must be named "parcode.dbf" and "regequat.dbf. - Check all of the numbers in regequat.dbf against USGS Report 94-4002, and be sure to check the "Errata Sheet." The region numbers in regions.shp should be checked to see if they are consistent with the region numbers in regequat.dbf. - A critical field in parcode.dbf is the "conversion" field. This field is used to tell the program whether or not a units conversion and/or adjustment factor on the source data is required to calculate peak flows correctly. The codes "a","s","m", and "d" are used to indicate add, subtract, multiply and divide respectively. A number indicating a multiplicative or additive factor must follow each of these codes, separated by an underscore. Multiple operations can be made on the same parameter by separating operations with an underscore. Example: the conversion required when computing mean basin elevation in New Mexico (ELV_NM) is "m_3.281_d_1000" which means multiply the parameter value by 3.281 and divide by 1000. This converts mean basin elevation from meters to thousands of ft. The conversion codes are interpreted and applied by the script "threshr.qxparams." A checker using USGS Report 94-4002 must be careful because subtractions are sometimes reported in the text, but not reflected in the printed equations and sometimes they are reported in the printed equations but not stated in the text. - An important field in regequat.dbf is the "special" field. The "special" field contains the code 0 if the corresponding regression equations conform to the standard format (see Functionality description). For a particular state or region with a unique form of the regression equation, the special function code must be a unique integer not only within that RFC but across the entire United States. - For specialized data layers in individual states in vector format, convert all of the Arc/Info Coverages into Shapefiles and eliminate unnecessary fields. This is done because there is no need to keep two types of vector data in the ThreshR database. Make sure that there is a field in the Shapefile that contains the parameter code that is represented by that layer. For example, in Kansas, there is a polygon Shapefile that specifies soil permeability. The code for this attribute in parcode.dbf is SLPM_KS, so the field name containing the actual permeability values must have this name. For the case of Minnesota, where the parameter of interest is percent streams and ponds, the parameter code is STRP_MN. The field named strp_mn in the streams and ponds shapefile strp_mn.shp contains the value 100 indicating that any area inside these polygons is 100 percent water. - Several problems requiring fixes were found in setting up the database for ABRFC and MBRFC. To give some more examples of things to look for, here are some of the example problems found. parcode.txt was missing parameters for Wyoming and MontanaNotes on regression equations for individual states: Iowa: The Grid "stral_ia" is incorrect. As a temporary solution, baseline runs for Q2 are made using the regression equations from Carpenter and Georgakakos, 1993 (p. 20) is used. A linefile of first order streams in Iowa is needed.Region names are totally wrong in Wyoming |
Main Link Categories: Home | OHD | NWS |