How the weather databases work

Introduction

This page covers issues related to the primary database of weather data held on the PC hard drive. Normally the database should function correctly and automatically during routine operation of the AWS system and therefore require little understanding by the user. However, an insight into how the databases work can prove invaluable in troubleshooting problems and anomalies when the normal routine has broken down or when the data logging (archive) interval has been deliberately changed. Understanding database operation can also provide useful background when choosing between the different software packages and in appreciating their different modes of operation.

The Davis Weatherlink primary database

One of the key benefits of linking a PC to a weather station is that a detailed long-term archive of weather data can be laid down on the PC hard drive. Each weather station package creates such a primary database in its own individual format, which is then available as the source of data for archival records, graphs, text reports and statistical analysis. There is in fact no bar in principle to establishing a database with a time resolution of one minute and potentially covering many years. Of course in practice the amount of hard disk space required to store such a detailed archive can become a concern, although far from an overwhelming one given the large size of modern hard drives. Also, retrieving arbitrary items of specific data from such relatively large (for a PC at least) database files can sometimes lead to significant performance issues.

The writers of commercial software sometimes take measures to mitigate such performance problems. Thus, the Davis Weatherlink software creates a new database file for each new calendar month. Under this approach and assuming a length of 24 bytes per record and a minimum archive interval of 1 minute this will create a database file of a little over 1 MB per month, which is perfectly manageable. (In practice a time resolution of one minute is unnecessarily short - there is little real need ever to inspect weather data ranging over many hours and days to a precision of one minute. An archive interval of between 5 and 30 minutes should be perfectly adequate for all practical purposes). Using the Davis approach of multiple monthly files also simplifies the problem of backing up the hard drive's data archive. Several database files with a time resolution of 5 minutes or greater will fit even on a single floppy disk. The only potential disadvantage to the Davis methodology is that reviewing data spanning more than one month requires data to be pulled from more than one database file. However the Davis software seems to be able to cope quite happily with this requirement and can graph and export data across more than a single month.
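The file-size estimate above can be checked with a little arithmetic. The following is a minimal sketch that uses only the 24-byte record length assumed in the text and ignores any file header the real Weatherlink format may carry; the function name monthly_file_bytes is purely illustrative.

```python
import calendar

RECORD_BYTES = 24  # record length assumed in the text above


def monthly_file_bytes(year, month, interval_minutes, record_bytes=RECORD_BYTES):
    """Approximate size of one month's archive records (file header ignored)."""
    days = calendar.monthrange(year, month)[1]
    records = days * 24 * 60 // interval_minutes
    return records * record_bytes


# Compare a few archive intervals for a 31-day month.
for interval in (1, 5, 30):
    size = monthly_file_bytes(2024, 1, interval)
    print(f"{interval:>2} minute interval: {size} bytes")
```

For a 31-day month at a 1 minute interval this gives 44,640 records and 1,071,360 bytes, which matches the figure of a little over 1 MB. At a 5 minute interval the same month needs 214,272 bytes.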

The storage format of the Weatherlink database is a proprietary packed binary format, but the Browse window provides a powerful database editor with which the stored data can readily be viewed and edited.

Real-time vs archive data and other time-related issues

The primary database is of course a large set of observations of weather data vs time. How software packages organise records by time within their primary database varies from one package to another. In the Davis Weatherlink software (and also in VWS) time is treated simply as another parameter and records are added sequentially to the database as they arrive. Each record contains a time stamp, either generated by the weather station and contained within the logged record when it is downloaded, or assigned by the PC at the time of logging. Thus if there is a day's gap in weather observations because, for example, of a fault in the weather monitor, the next available record is added immediately after the last logged record in this type of database and the two adjacent records will have time stamps a day apart. In contrast, WV32 creates a large database structure with a constant time resolution of one minute, irrespective of any data logging interval which may have been set and regardless of the fact that one minute is an unnecessarily frequent time interval. The time stamp of each new record (either logged or assigned) is inspected and the data stored in a unique location in the database corresponding to that exact time (to the nearest minute). If data has been logged at eg 10 minute intervals then only every tenth location in WV32 will contain data and the remaining locations will be empty.
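The difference between the two layouts can be illustrated with a small sketch. It is not based on the actual file formats of either package: the function names, the origin time and the use of a Python dict to stand in for WV32's fixed minute locations are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def append_sequential(log, stamp, values):
    """Weatherlink/VWS style: time is just another field, records go in arrival order."""
    log.append((stamp, values))


def slot_index(origin, stamp):
    """WV32 style: every minute since a (hypothetical) origin owns one fixed location."""
    return int((stamp - origin).total_seconds() // 60)


origin = datetime(2024, 1, 1, 0, 0)
# Three records at 10 minute intervals, then one more after a day's gap.
stamps = [origin + timedelta(minutes=10 * i) for i in range(3)]
stamps.append(stamps[-1] + timedelta(days=1))

log = []    # sequential database
slots = {}  # stand-in for the minute-resolution structure; missing keys are empty locations
for stamp in stamps:
    append_sequential(log, stamp, {"temp": 10.0})
    slots[slot_index(origin, stamp)] = {"temp": 10.0}

print(len(log))       # 4
print(sorted(slots))  # [0, 10, 20, 1460]
```

In the sequential log the last two records sit next to each other even though their time stamps are a day apart, whereas in the slotted layout they are separated by 1,440 locations, most of them empty.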

One fundamental time-related issue that all packages have to deal with concerns the differences in data availability between buffer-linked and direct-linked weather stations. With buffer-linked systems - mandatory with makes like Davis - there is a choice between using the buffered ('archive') data and using direct ('real-time') data, which is generated by the weather console moment by moment and which can effectively pass straight through the buffer module to the PC as and when the PC requests it. Direct-linked systems (ie a direct unbuffered serial link from weather station to PC) can of course only work with real-time data because they have no mechanism to store archive data within the weather console.

There can be important consequences in using these different types of data, which vary with the make of station and according to the make-up of the data packets. Thus Davis archive data packets contain an internal time stamp which unambiguously defines the weather station time when the data packet was created. But Davis real-time packets (generated by the Davis Loop command, which, for anyone sufficiently interested, is explained in some supplementary documentation available from the Davis web site) have no integral time stamp and have to be time-labelled by the PC clock and not by the console clock - occasionally an important distinction if the two clocks are not correctly synchronised.

For Davis stations both sorts of data are usually used, but effectively for different purposes. The archive data is reliably available irrespective of whether the PC is switched on or not, but is only updated as frequently as the archive interval setting - typically 5 to 30 minutes. While this is often enough data for long term database purposes, users may wish a 'real-time' display of current weather conditions to update more frequently, for example at least every minute. Therefore Davis use archive data as the source of data for the primary database and real-time data to update displays of current weather. The Weatherlink software makes no provision to mix the two classes of data packet, ie real-time data is never logged to the primary database.
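The consequence of the two time-stamping routes can be shown with a short sketch. The packet layout here (a dict with a "kind" key and an optional "console_time" key) is invented for illustration and bears no relation to the real Davis packet formats; the 4 minute clock error is likewise just an example.

```python
from datetime import datetime, timedelta


def record_time(packet, pc_clock):
    """Archive packets carry their own console time stamp; real-time packets get PC time."""
    if packet["kind"] == "archive":
        return packet["console_time"]
    return pc_clock


pc_now = datetime(2024, 1, 1, 12, 4)
console_now = pc_now - timedelta(minutes=4)  # console clock running 4 minutes slow

archive = {"kind": "archive", "console_time": console_now}
realtime = {"kind": "realtime"}

# Two packets describing the same moment end up labelled 4 minutes apart.
skew = record_time(realtime, pc_now) - record_time(archive, pc_now)
print(skew)  # 0:04:00
```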

Davis archive data can be logged to the Weatherlink primary database in one of three modes. First, a download of archive records can be requested manually. Second, downloads can run automatically to a schedule which is set within Weatherlink and can be as frequent as hourly. Third, if the Weatherlink Stripchart feature (described elsewhere) is in auto-update mode then each new archive record is downloaded into the database as it appears in the data buffer. Since the archive period can be set as short as 1 minute, it is possible to have the primary database automatically updated every minute.

VWS - Operation of the Primary Database

VWS and WV32 adopt a different strategy to Weatherlink when logging data to their primary database, probably because they need to work readily with a range of direct-linked systems from different makers in addition to the Davis buffered system. Both VWS and WV32 do therefore log real-time data to the primary database. Both can also log Davis archive data and so can cater for extended periods when the PC may not be actively linked to a Davis weather station, but, in contrast to the Weatherlink software, this handling of archive data is effectively an additional feature rather than the primary mode of database operation.

As noted above, the VWS primary database stores data as it is received in sequential records in a single database file called dbase.bin. Though described as a binary file this is in fact a standard CSV text file with the end of each record denoted by CR/LF. VWS seems to limit this file to a maximum of 30000 records. At one-minute record intervals, this is sufficient for less than 3 weeks' worth of data, though obviously much longer at the more conventional 5-30 minute record intervals. It seems that when the current dbase.bin is approaching its maximum limit, a new one must be created, presumably by moving the existing version to a new location and/or renaming it appropriately and allowing VWS to create a new copy in the default location. This is not a very elegant solution even though it may be needed only at infrequent intervals. It seems unlikely that data can be extracted, eg for graphing, from more than a single copy of dbase.bin at any one time, which may occasionally be frustrating when the time boundary between two database files is of interest. VWS also does not currently have a specific tool to browse, edit and export from the primary database files, although because of the straightforward CSV format of the files this is not too difficult to achieve manually with the aid of a suitable text editor or spreadsheet package.
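As a rough sketch of the points above, the following works out how long a 30000-record file lasts at different record intervals and shows one way of reading records across several renamed copies of dbase.bin outside VWS. The column layout of the file is not described here, so rows are returned as plain lists of strings; the function names and any file names passed in are hypothetical.

```python
import csv

MAX_RECORDS = 30000  # approximate limit reported in the text above


def days_of_capacity(interval_minutes, max_records=MAX_RECORDS):
    """Days of data that fit in one file at the given record interval."""
    return max_records * interval_minutes / (60 * 24)


def read_records(paths):
    """Yield CSV rows from several database copies, in the order the paths are given."""
    for path in paths:
        # newline="" lets the csv module handle the CR/LF record endings itself.
        with open(path, newline="") as handle:
            yield from csv.reader(handle)


for interval in (1, 5, 30):
    print(f"{interval:>2} minute records: {days_of_capacity(interval):.1f} days")
```

Passing an older renamed copy followed by the current file to read_records gives one continuous stream of rows, which is one way to inspect data either side of the boundary between two database files.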

It is important to note that VWS also has the facility, when used with Davis stations, to maintain the standard Weatherlink monthly databases, assuming that the full Weatherlink software is also installed. If this option is used routinely, then there is much less concern about limitations of the VWS native database format. The VWS database can be viewed more as a source of current and recent weather data to service the main VWS display screens, with the Weatherlink primary database being used by the Weatherlink graphing and summary utilities for retrospective analysis of data.

In the real-time acquisition mode of VWS, the software requests new real-time data packets every 3 seconds by means of the Weatherlink Loop command. (Technically, the Loop command is sent directly to the serial port rather than through weatherlink.dll. This low-level software library is available from Davis and is installed by VWS to ease the task of communicating with the Weatherlink hardware, but it is not used for real-time updates for reasons of efficiency.) All display objects are updated to this 3 second timing, other than the graph element, which reads data only from the primary database and which therefore updates only when the database is updated. The database update frequency is set by the Graph, Database and Interval Timer (GDIT) parameter in the Display Settings screen of VWS, which is sensibly recommended to be set to the same time period as the Weatherlink Archive Period, though there is apparently no mechanism to force the two to be identical.
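The relationship between the 3 second display updates and the slower database updates can be sketched as a simple schedule. How VWS actually times its database writes internally is not documented here, so the logic below (write on the first poll at or after each GDIT boundary) is an assumption, and the function name schedule is illustrative only.

```python
POLL_SECONDS = 3  # real-time request interval described above


def schedule(duration_seconds, gdit_minutes, poll_seconds=POLL_SECONDS):
    """Return (display_updates, database_writes) as lists of elapsed seconds."""
    gdit_seconds = gdit_minutes * 60
    display, database = [], []
    next_write = gdit_seconds
    for t in range(poll_seconds, duration_seconds + 1, poll_seconds):
        display.append(t)          # every poll refreshes the display objects
        if t >= next_write:        # only some polls also write a database record
            database.append(t)
            next_write += gdit_seconds
    return display, database


display, database = schedule(duration_seconds=600, gdit_minutes=5)
print(len(display), database)  # 200 [300, 600]
```

Over ten minutes with a 5 minute GDIT setting the displays refresh 200 times while the database, and hence the graph element, gains only two records.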

This real-time acquisition mode is the only one available for non-buffered stations such as the Peet and Oregon models, but for buffered stations VWS can additionally download archive data from the Weatherlink module. VWS in fact provides two distinct archive download modes, although both can be triggered within one single process. In the default mode, VWS calls and transfers temporary control to the main Weatherlink executable, pclink4.exe. This downloads the archive data into the primary Weatherlink database, clears the Weatherlink archive memory if required and returns control to VWS. In an optional additional mode, VWS will also, making use of the low-level Weatherlink routine weatherlink.dll, download the archive data into the VWS primary database.

WV32 - Operation of the Primary Database

Some key aspects of the WV32 database have been covered in the preceding notes and are not repeated here.

WV32 can acquire both archive and real-time data from Davis stations, but makes a fundamental distinction in their use, depending on whether the Real Time Monitoring mode is active or not. When RTM is active, which will usually be the case when WV32 is running since it is the primary display mode for current weather data, archive data storage by the Weatherlink hardware module is switched off (in order to improve update performance of the RTM screens) and all data is acquired by reading the memory locations of the Weatherlink hardware directly. This acquisition cycle is performed once per minute for most parameters, but more frequently for wind speed and direction. After each new acquisition cycle, a new data record is written to the minute-resolution primary database.

If the RTM mode of WV32 is not in use (PC off or WV32 not loaded or the RTM mode not active), then the Weatherlink hardware will be storing archive data in its usual way. When the RTM mode of WV32 is next entered, the accumulated archive records will be downloaded (by direct communication with the hardware, rather than eg by use of weatherlink.dll) and, depending on a flag setting, the archive memory may be cleared. Archive storage by the Weatherlink module is then suspended, but reactivated when the RTM mode is exited.
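The interplay between RTM mode and archive storage described above can be summarised as a small state model. This is a toy sketch only, not WV32's real code: the class, method and attribute names are all hypothetical, and records are represented by plain strings.

```python
class Wv32Session:
    """Toy model of the RTM on/off behaviour described above."""

    def __init__(self, clear_on_download=True):
        self.clear_on_download = clear_on_download  # the flag setting mentioned in the text
        self.rtm_active = False
        self.archive_enabled = True  # the logger stores archive records while RTM is off
        self.logger_buffer = []      # records held in the Weatherlink module
        self.database = []           # minute-resolution primary database

    def logger_tick(self, record):
        """The hardware stores an archive record only while archiving is enabled."""
        if self.archive_enabled:
            self.logger_buffer.append(record)

    def enter_rtm(self):
        """Download accumulated archive records, then suspend archive storage."""
        self.database.extend(self.logger_buffer)
        if self.clear_on_download:
            self.logger_buffer.clear()
        self.archive_enabled = False
        self.rtm_active = True

    def rtm_cycle(self, record):
        """Each acquisition cycle in RTM mode writes one record to the database."""
        if self.rtm_active:
            self.database.append(record)

    def exit_rtm(self):
        """Leaving RTM mode reactivates archive storage in the module."""
        self.rtm_active = False
        self.archive_enabled = True


session = Wv32Session()
session.logger_tick("archive-1")   # PC off: the module archives as usual
session.logger_tick("archive-2")
session.enter_rtm()                # archive downloaded, archiving suspended
session.logger_tick("archive-3")   # ignored while RTM is active
session.rtm_cycle("realtime-1")
session.exit_rtm()
session.logger_tick("archive-4")   # archiving resumes
print(session.database)
print(session.logger_buffer)
```

In this model the database ends up holding the two downloaded archive records followed by the real-time record, while the module's buffer holds only the record stored after RTM mode was exited.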