National Weather Service United States Department of Commerce

RFC Operational PC Backup Project Report
Remote Backup Test for Loss of Facility
ABRFC - 11 March 2003

Summary

On 11 March 2003, ABRFC conducted a successful operational backup test that simulated a total loss of the RFC facility. ABRFC personnel conducted the test at a hotel in Tulsa, OK, using a portable laptop system. This test proves the utility of a practical, low-cost backup strategy in which an RFC provides its own operational backup within its home metropolitan area, for any failure scenario, using any facility that has a dedicated Internet connection and a voice phone line available for 24x7 NWS use.

The laptop computer system ran the river forecast model 25 times faster than AWIPS (as measured in average CPU time). Similar performance improvements were noted for other applications. From a cold start, the system was made ready for use by a forecaster in one hour and ten minutes. Cold start means the system has no model files, no observed data, no estimated radar precipitation files, no QPF files, and no Internet connection. Ready for use means that a run of the entire river forecast system has completed and is ready for interactive use by a forecaster, with data ingest and dissemination running.

Data ingest software was improved significantly since a previous ABRFC test of a desktop backup system in May 2002. The changes raised the receipt of observed height data (river stage, lake elevation, etc.) to an acceptable 98.6%, versus 67.2% in May 2002.

The test was fully successful: the laptop computer system hosted operations and provided full-featured ABRFC forecast and guidance products to customers in a timely and transparent manner. The system ran totally independently of AWIPS. The hydromet situation was fairly benign during the test, with only light precipitation in the ABRFC area of responsibility. Therefore, only representative routine products (a daily river forecast, flash flood guidance, and a hydromet discussion) were issued using the backup system.


System Configuration

Computations were performed on a single Dell laptop computer with a 2.0 GHz CPU and 512 MB of memory, running Red Hat LINUX Version 7.2 and Informix for LINUX Version 7.31. The laptop was connected through a four-port router to a 100 Mb leg of a hotel LAN, which was connected to the Internet with “near” T1 bandwidth. Three additional forecaster “seats” were available by connecting to the inexpensive router. ABRFC tested only one additional “seat”, using an old laptop (233 MHz, 32 MB memory) running Red Hat LINUX Version 7.0 and connecting to the Dell laptop via SSH login. Therefore, only two persons were using the system at one time during this rather benign weather situation. The IHFS database is version 5.22 and NWSRFS is Release 22.

Data retrieval is via the Internet using programs developed by ABRFC. DPAs are retrieved via FTP from the NWSTG central product server (tgftp.nws.noaa.gov). Text products, such as HADS, COOP, and other SHEF products, are obtained from the SRH data server (www.srh.noaa.gov/data/...) by opening a raw socket connection, sending an HTTP GET request, and waiting for a response (i.e., a non-GUI Web client). Two new programs were implemented for this test to improve overall availability of DCP data. These programs run periodically and check the Informix database for missing data. If they detect missing data, they attempt to retrieve those data from the HADS server (dipper.nws.noaa.gov) or the USGS server (waterdata.usgs.gov/{state}/nwis/current...). Access to Mesonet data is direct-to-server via FTP.

An ABRFC-developed program converts raw DPA (radar digital precipitation array) binary files from UNIX to LINUX byte order (the big-endian/little-endian problem) and passes the files to the process_dpa program, where the standard nationally supplied radar precipitation processing software takes over. The standard ShefDecoder, OFS_DE and BatchPost routines are utilized. The P2 radar precipitation estimation software was ported and utilized.
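
The byte-order conversion mentioned above can be illustrated with a minimal Python sketch. The actual DPA record layout is not given in this report, so the sketch assumes, purely for illustration, a payload made up entirely of 16-bit integers. The helper name swap_16bit_words is hypothetical and is not the ABRFC program; a real file that mixes field widths would need per-field handling.

```python
import struct


def swap_16bit_words(data: bytes) -> bytes:
    """Reverse the byte order of every 16-bit word in data.

    Assumption (not from the report): the payload is a flat sequence
    of 16-bit integers written on a big-endian UNIX host.
    """
    if len(data) % 2 != 0:
        raise ValueError("payload length must be a multiple of 2 bytes")
    count = len(data) // 2
    # Read the bytes as big-endian words, then re-pack them little-endian.
    words = struct.unpack(f">{count}H", data)
    return struct.pack(f"<{count}H", *words)


big_endian = bytes([0x01, 0x02, 0x00, 0xFF])
little_endian = swap_16bit_words(big_endian)
print(little_endian.hex())  # 0201ff00
```

Applying the function twice returns the original bytes, which gives a simple self-check when testing a conversion of this kind.
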
The standalone MPE software was tested on the laptop and functioned. MPE was not used operationally during the test, consistent with policy for ABRFC AWIPS operations. The ABRFC versions of xnav, xdat, xsets and fcst_prog were used for data display, quality control and product composition. D2D is not utilized. Forecasters display radar reflectivity and satellite images via the Internet from their favorite web locations. QPF was processed using NMAP for LINUX, which performed flawlessly. The national flood outlook product was not produced on the backup system since ArcView is unavailable for LINUX. Product dissemination was accomplished using ABRFC-developed scripts, which drop products on the SRH server where they are picked up by a backup office’s LDAD. Products to be disseminated are then ingested through the LDAD into AWIPS via the standard handleOUP/distributeProduct AWIPS software used to pull other information into AWIPS.
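
The raw-socket retrieval of text products described in this section (open a socket, send an HTTP GET, wait for the response) can be sketched in Python as follows. This is an illustration under stated assumptions, not the ABRFC program: the function names are hypothetical, HTTP/1.0 with "Connection: close" is assumed so that the end of the response is signalled by the server closing the socket, and the data path on the SRH server is elided in this report and is therefore left as a caller-supplied argument.

```python
import socket
from typing import Tuple


def build_get_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw: bytes) -> Tuple[int, bytes]:
    """Return (status code, body) from a raw HTTP response."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0]
    status = int(status_line.split()[1])
    return status, body


def fetch(host: str, path: str, port: int = 80,
          timeout: float = 30.0) -> Tuple[int, bytes]:
    """Open a TCP socket, send the GET, and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # an empty read means the server closed the socket
                break
            chunks.append(chunk)
    return split_response(b"".join(chunks))
```

A caller would check that the returned status is 200 before handing the body to a decoder. Keeping request building and response parsing in separate functions lets both be tested without a network connection.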

System Performance Summary

data graphic


System Maintenance Requirements

In order for the backup hardware system to be available during a total loss of facility scenario, it must be stored offsite in a secure manner. For reasons of economy and simplicity, ABRFC chose to keep the backup system in a “powered down”, off-line state. Therefore, in order to cold start the system, one must have access to current/recent NWSRFS fs5files, Informix observed data, QPF and radar precipitation files. ABRFC provides access to these data by periodically uploading the appropriate files and data to a Southern Region HQ server. Providing the latter three types of data is straightforward: one simply moves an appropriate period of data from the RFC operational system to the server at regular intervals, and the backup data are then up to date. The fs5files, however, are a different problem. The fs5files must be LINUX files. Therefore, the RFC must be running NWSRFS either on one of the AWIPS lx PCs or on a stand-alone PC (as is done at ABRFC) in order to have the files available in LINUX format. Due to inconsistencies in NWSRFS files, no one has been able to write a program that successfully converts all fs5files from HP-UX to LINUX. One other alternative (which ABRFC has rejected for a number of reasons) is to cold start NWSRFS by defining the entire system from scratch (i.e., via segment definitions, station definitions, etc.).

It is extremely important that the backup system have its software updated periodically. For example, new releases of NWSRFS must be implemented, and updates to local applications must be ported.

The backup system must also be tested periodically. ABRFC is implementing a quarterly backup test schedule for the next year or two. This frequency was chosen due to the newness of the system and in order for all staff members to have the opportunity to gain experience and confidence using the backup system.


Concerns Identified During The Test

There were no serious problems identified during this test. All concerns can be corrected with proper checklists and new procedures. One concern surfaced at the start of the test and emphasized the importance of updating software throughout the system. When one goes into backup mode with this system, one must have another office monitor the Southern Region server using their LDAD in order to act on any product put there for dissemination to AWIPS. When the test began, ABRFC notified WGRFC that we were going into backup mode and they should turn on their LDAD program. WGRFC informed us they could not because they were performing an AWIPS upgrade that day. Therefore ABRFC used our own LDAD to serve the function for AWIPS dissemination. However, we later discovered that our LDAD program was an old version that had bugs and thus did not provide for dissemination of all the products sent to the server. A similar problem occurred when two programs that access hourly precipitation data were not run during the test because the system software was not updated.

One other concern is the low percent receipt (84.3%) of DPA products. The problem was the same as in the May 2002 test: the “tgftp.nws.noaa.gov” server was sometimes too busy and refused our connection. ABRFC was unsuccessful in obtaining a special login to the system from NWSHQ. To address this problem, we have asked SRHQ to capture DPA products from the SBN and store them on their server.
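
Refused connections of this kind are often handled on the client side by retrying with an increasing delay. The Python sketch below illustrates that general technique only. It is not the remedy ABRFC chose (capturing DPA products from the SBN), and the function names, attempt count, and delays are assumptions made for illustration.

```python
import ftplib
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    attempts: int = 5,
    first_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action until it succeeds or the attempts run out.

    The delay doubles after each failure. ftplib.all_errors includes
    OSError, so refused connections and FTP error replies are both retried.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except ftplib.all_errors:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(delay)
            delay *= 2
    raise ValueError("attempts must be at least 1")


def download(host: str, remote_path: str, local_path: str) -> None:
    """Fetch one file over anonymous FTP (hypothetical helper)."""
    with ftplib.FTP(host, timeout=30) as ftp:
        ftp.login()  # anonymous login
        with open(local_path, "wb") as out:
            ftp.retrbinary(f"RETR {remote_path}", out.write)
```

A caller would wrap one download in the retry helper, for example retry_with_backoff(lambda: download(host, remote_path, local_path)). The remote path for DPA products is not given in this report and would have to be supplied. A failed attempt can leave a partial local file, so a production version would write to a temporary name and rename on success.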

Future Improvements

The current system does not provide for creation of: 1) ABRFC web graphic products (such as radar precipitation estimates), 2) the Southern Region River Flood Outlook (RFO) text and graphic products and 3) the national significant Flood Outlook product (FOP). The capability exists to create the graphics in item 1 above. If deemed necessary in the future, software can be ported to create these graphics directly on the backup laptop because they are not currently created with ArcView software. The RFO text product (cccESGxxx) can be ported as well because the GUI and text product creator do not use ArcView. The FOP cannot be ported directly to the LINUX laptop because its GUI uses ArcView. An MS-Windows machine would need to be integrated into the system in order to produce ArcView-derived graphics. Another category of products that has not been tested is AHPS ESP-ADP images.

Whatever improvements are possible, one must consider that the basic requirements for backup are being met through production of river forecasts, flash flood guidance and other support products such as HMD/HCMs. The point is that this is a backup system at this time and not a replacement system for AWIPS.

Conclusions

This test proves the utility of a practical, low-cost backup strategy in which an RFC provides its own operational backup within its home metropolitan area, for any failure scenario, using any facility that has a dedicated Internet connection and a voice phone line available for 24x7 NWS use.