This is the latest version of Wave_server as of Earthworm version v7.0. It is based on the original Wave_server. It supports the SCNL protocol and processes TYPE_TRACEBUF2 messages; it is not backward compatible with SCN.
Wave_serverV provides a network-based service of trace data. It acquires Earthworm trace data messages for specified channels, and maintains a disk-based circular buffer for each channel. It then offers a network service capable of supplying specified portions of trace data from specified channels. The basic features include:
* Up to 10 concurrent clients can be serviced. This limit may be changed to suit the available hardware resources.
* The size of the disk-based buffer is user-specified, up to 2 gigabytes per trace (approximately 100 days, assuming 100 SPS, 16-bit data).
* The module serves either 'sanitized' trace data in ASCII format, suitable for 'casual' purposes such as visual displays, or raw binary data containing all information (and flaws) as received from the telemetry source.
* Several crash-recovery strategies are used to permit rapid restarts with minimal loss of data after catastrophic crashes, such as system errors or power loss.
* Extensible client-server protocol to permit additional queries to be easily implemented.
* A set of client routines is available to simplify implementing client applications.
* Handles interruptions of trace data without wasting disk space. Data from highly intermittent event sources will not be overwritten during long periods of quiescence.
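The "approximately 100 days" figure in the list above can be checked with simple arithmetic. The sketch below is an estimate only. It assumes values that are not stated in this document: one-second packets of 100 samples each, and a 64-byte header stored with every record. The function name is hypothetical.

```python
def tank_span_days(tank_bytes, record_bytes, samples_per_record, sample_rate):
    """Estimate how many days of continuous data fit in one tank."""
    records = tank_bytes // record_bytes            # only whole records fit
    seconds = records * samples_per_record / sample_rate
    return seconds / 86400.0

# Assumed example: 2 GiB tank, 100 SPS, 16-bit samples, one-second packets,
# plus a 64-byte header per record (the header size is an assumption here).
record_bytes = 64 + 100 * 2
days = tank_span_days(2**31, record_bytes, 100, 100.0)
print(f"{days:.1f} days")
```

With larger records or a lower sample rate the span changes accordingly, so the figure should be recomputed for each channel's actual record size.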
When wave_serverV first starts up, it reads the configuration file, and then starts a three-stage process of opening and checking tank and index files. In the first stage, it reads one of the `tank structure' files (a *.str file), if any exist. If no tank structure files are found, wave_serverV proceeds to the second stage, below.
Wave_serverV assumes that the tank structure file has the most up-to-date information about existing tank files (*.tnk) and their indexes (*.inx), especially tank file size, record size, and tank insertion position. This last crucial bit of information tells the `starting point' in the tank. Data just behind this point is the most recent information in the tank. Data after the insertion point is the oldest and will be overwritten as new data is added to the tank in circular fashion.
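The role of the insertion point can be illustrated with a toy in-memory model. This is a minimal sketch, not the wave_serverV implementation: the class and method names are hypothetical, and records are held in a Python list rather than in a disk file.

```python
class CircularTank:
    """Toy fixed-size tank of equal-length records (in memory, not on disk)."""

    def __init__(self, n_records):
        self.records = [None] * n_records
        self.insert_at = 0            # the insertion point

    def add(self, record):
        # New data overwrites whatever sits at the insertion point,
        # which is the oldest record once the tank has wrapped around.
        self.records[self.insert_at] = record
        self.insert_at = (self.insert_at + 1) % len(self.records)

    def newest(self):
        # The record just behind the insertion point is the most recent.
        return self.records[(self.insert_at - 1) % len(self.records)]

    def oldest(self):
        # After wrapping, the record at the insertion point is the oldest.
        # Before wrapping that slot is still empty, so fall back to slot 0.
        rec = self.records[self.insert_at]
        return rec if rec is not None else self.records[0]


tank = CircularTank(4)
for packet in ["p1", "p2", "p3", "p4", "p5", "p6"]:
    tank.add(packet)
print(tank.newest(), tank.oldest(), tank.insert_at)
```

The model also shows why a wrong insertion point is harmful: the same bytes are still in the tank, but the reader no longer knows which record is newest and which is about to be overwritten.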
The server runs through the list of tanks from the structure file, verifying that each SCNL is listed in the config file. The one piece of information that wave_serverV takes from the config file in this stage is the index file size. Wave_serverV tries to open each tank file and its index. If index files are missing or out of date, they are recreated by reading through the tank file. Depending on the amount of reconstruction needed for an index, this process may take several minutes for each tank. Note that there is no provision for checking the insertion point, read from the structure file, against the tank file.
If the tank and index files are read successfully, then that tank is marked as OK. If there are errors opening these files for a tank, then one of two things may happen. If ReCreateBadTanks is set, then new (empty) tank and index files are created using the information from the tank structure file. If ReCreateBadTanks is not set, then that tank is marked as BAD for later disposition. Once this loop has been completed for all the tanks listed in the structure file, stage one is complete.
For stage two of the startup sequence, wave_serverV scans the list of tanks in the config file. Any tanks that were not already found in the structure file will be created using the parameters listed in the config file. Any SCNLs added to the config file since wave_serverV was last run will be created now. If any errors are encountered creating new files here, that tank will be marked as BAD. Since the config file does not list the insertion point, any data that is in existing tank files for this SCNL will be effectively erased.
For the final stage of startup, wave_serverV goes through its internal list of tanks. Any that were marked as BAD previously now come to light. If PleaseContinue is set, wave_serverV will remove that tank from its internal list. Otherwise, wave_serverV will exit now.
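The three stages described above can be summarized as a small decision procedure. The sketch below is a simplified model under stated assumptions: the function name, the status strings "RECREATED" and "CREATED", and the `openable` input are hypothetical, and file creation errors in stage two are not modeled.

```python
def plan_startup(struct_tanks, config_tanks, openable,
                 recreate_bad_tanks=False, please_continue=False):
    """Return {scnl: status} after the three startup stages.

    struct_tanks, config_tanks: lists of SCNL names (hypothetical inputs).
    openable: set of SCNLs whose tank and index files open without error.
    """
    status = {}
    # Stage one: tanks listed in the tank structure file.
    for scnl in struct_tanks:
        if scnl in openable:
            status[scnl] = "OK"
        elif recreate_bad_tanks:
            status[scnl] = "RECREATED"   # new, empty tank and index files
        else:
            status[scnl] = "BAD"
    # Stage two: tanks found only in the config file are created.
    for scnl in config_tanks:
        if scnl not in status:
            status[scnl] = "CREATED"
    # Stage three: tanks still marked BAD are dropped, or the server exits.
    bad = [scnl for scnl, state in status.items() if state == "BAD"]
    if bad and not please_continue:
        raise SystemExit(f"bad tanks, exiting: {bad}")
    for scnl in bad:
        del status[scnl]
    return status


print(plan_startup(["A", "B"], ["A", "B", "C"], {"A"}, please_continue=True))
```

In this model, a tank that cannot be opened survives only if ReCreateBadTanks is set, and the server as a whole survives a bad tank only if PleaseContinue is set.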
Once this third pass is done, wave_serverV writes its internal list to the structure file and starts adding traces to their respective tanks. Each new packet of trace data causes the current index entry for that tank to be extended in time and file position. If there is a gap between the end of a tank and the new trace data (determined by GapThresh in config file), a new index entry is started. Wave_serverV periodically writes the index and structure list to disk, to save the latest information in case of wave server crashes.
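The indexing rule in the paragraph above can be sketched as follows. This is an illustration only: it assumes the gap threshold is expressed in sample intervals (as the inspect_tank default of 1.5 suggests), and the packet tuple layout and function name are hypothetical.

```python
def build_index(packets, gap_thresh=1.5):
    """Group packets into index entries [start, end, first_offset, end_offset].

    packets: iterable of (start_time, end_time, sample_rate, offset, nbytes),
    a simplified stand-in for trace data messages written to a tank.
    """
    index = []
    for start, end, rate, offset, nbytes in packets:
        interval = 1.0 / rate
        if index and (start - index[-1][1]) <= gap_thresh * interval:
            # Contiguous data: extend the current entry in time and position.
            index[-1][1] = end
            index[-1][3] = offset + nbytes
        else:
            # First packet, or a gap: start a new index entry.
            index.append([start, end, offset, offset + nbytes])
    return index


packets = [
    (0.0, 0.99, 100.0, 0, 264),
    (1.0, 1.99, 100.0, 264, 264),
    (10.0, 10.99, 100.0, 528, 264),   # about 8 s after the previous packet
]
index = build_index(packets)
print(index)
```

Two contiguous packets share one index entry, while the late third packet opens a second entry, so the gap costs an index slot but no tank space.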
WindowsNT: this shutdown mechanism can be invoked in only one way: you must give the Control-C interrupt to the console window where wave_serverV is running. DO NOT try using `restart <pid>', as this will terminate wave_serverV immediately without sending a signal to it. Once wave_serverV has terminated and its console window is gone, you can start a new instance of wave_serverV by using `restart <pid>' or by letting statmgr send a restart request. This last method requires that the `restartMe' command was set in wave_serverV's .desc file.
Solaris: (unix) wave_serverV can be terminated using the `kill <pid>' command. To stop and immediately restart wave_serverV, you can use `restart <pid>'.
With earlier versions of wave_serverV (Earthworm V5.0 and earlier) the only way to shut down wave_serverV is to shut down earthworm, using the pau command or entering "quit" at the startstop window.
Be sure you understand how to restart wave_serverV. If you inadvertently shut down wave_serverV without letting it go through its normal shutdown sequence, you risk doing damage to the tank, index and structure files. In the following discussion, restart means a quick but graceful shutdown and startup of wave_serverV, using the method appropriate for your platform and wave_serverV version. When some action must be taken between wave_serverV shutdown and startup, that will be spelled out.
IMPORTANT: Before taking any action below, shut down wave_serverV and make a full backup of all tanks and TankStructFiles, if you can afford the disk space and the downtime to do it.
Also worth noting: the tank struct files may take precedence over the .d configuration file. If you change your config file and then see problems, or your changes do not take effect, try the following: shut down wave_serverV; delete the TankStructFile and, if you are using it, the TankStructFile2; then start wave_serverV again. The structure files should be recreated based on the .d configuration file.
How do I:
Three simple utilities are available to assist with wave_serverV problems. They basically read the tank, index, and structure files and write out their contents in human-readable form.
inspect_tank [-g gap-size] rec-size tanksize tankfile-name
The record size and tank file size should be exactly as they are given in the config file for that tank. If the -g option and a gap size are given, they will be used to determine where gaps should be declared between trace data records. The default gap size is 1.5 sample intervals. By running inspect_tank repeatedly with different gap-size values, a profile of the gap history of the tank can be generated. Inspect_tank will prompt the user before writing a new index file; it selects a name that will not conflict with an existing index file name.
read_index [-g] indexfile-name
The optional -g flag will change the output to a list of the gaps between index entries, giving the start time and length in seconds.
read_struct structurefile-name
* The server is synchronous, in the sense that it receives a request, issues the corresponding reply, and then processes the next request. Since there may be several server threads, more than one client can be connected and requesting data at one time. The <request id> was implemented to assist asynchronous clients, such as clients in which the code issuing the requests is not tightly linked to the code processing the replies, and which thus have trouble remembering which reply goes with which request. It is generated by the client, and is simply echoed by the server. The request id is echoed as a fixed length 12 character string, null padded on the right as required.
* A client establishes a connection to a wave server by requesting a TCP connection on an agreed-upon address and port number. The client may issue as many requests as desired once the connection is made. The client may close the connection at any time to terminate the interaction.
* <s><c><n><l> is short-hand for site code, channel code, network id, and location code. The format is four space-separated ASCII strings.
* <flags> is currently used by the server to indicate special conditions. Currently three flags are used, but additional flags can be added as needed. Formally, it is:
<flags>:: F | F<letter>...<letter>
That is, it consists of the letter F followed by zero or more letters. A space terminates the <flags>. The bare letter F by itself means that the requested data was returned; there may be gaps in the data but it is up to the client to detect those. Currently "FR", "FL", and "FG" are implemented to indicate that the request totally missed the tank. "FL" means that the requested interval was before anything in the tank; "FR" means the requested interval was after anything in the tank. "FG" is used to indicate that the requested interval fell wholly within a gap in the tank.
* <datatype> is a two-character code à la CSS. Currently, only i2, i4, s2, and s4 are implemented. i means Intel byte order; s means Sparc byte order; 2 and 4 mean two- and four-byte signed integers.
* All times are given as ASCII representations of floating point seconds since 1970.
* Currently most of the following requests and replies are handled by the ws_clientII routines that are included in the libsrc part of the Earthworm source tree.
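The field conventions in the list above can be sketched with a few small helpers. These are illustrations, not part of ws_clientII: all function names are hypothetical. The only outside fact used is that Intel byte order is little-endian and Sparc byte order is big-endian.

```python
import struct

def parse_flags(token):
    """Split a <flags> token such as 'F' or 'FR' into its flag letters."""
    if not token.startswith("F"):
        raise ValueError(f"not a flags token: {token!r}")
    return list(token[1:])

def sample_format(datatype, nsamples):
    """Build a struct format string for nsamples of the given <datatype>."""
    order = {"i": "<", "s": ">"}[datatype[0]]   # Intel or Sparc byte order
    code = {"2": "h", "4": "i"}[datatype[1]]    # signed 16- or 32-bit integer
    return f"{order}{nsamples}{code}"

def strip_request_id(field):
    """Remove the null padding from an echoed 12-character request id."""
    return field.rstrip("\0")


print(parse_flags("FR"))
print(struct.unpack(sample_format("s2", 2), b"\x00\x01\xff\xff"))
print(repr(strip_request_id("req7" + "\0" * 8)))
```

A client would typically check the flag letters first and only then decode samples, since a reply carrying "L", "R", or "G" contains no trace data.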
<request id> pin# <s><c><n><l> <starttime> <endtime> <datatype> . . . . . pin# <s><c><n><l> <starttime> <endtime> <datatype> \n
<request id> <pin#> <s><c><n>
<request id> <pin#> <s><c><n>
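A client might split the multi-entry MENU reply into one record per trace as sketched below. This is a minimal sketch that assumes every entry consists of exactly eight whitespace-separated tokens (pin, the four SCNL strings, start time, end time, datatype). The function name and the station codes in the example are hypothetical.

```python
def parse_menu(reply):
    """Parse a MENU reply into (request_id, list of entry dicts)."""
    tokens = reply.split()
    if not tokens:
        raise ValueError("empty reply")
    request_id, fields = tokens[0].rstrip("\0"), tokens[1:]
    if len(fields) % 8 != 0:
        raise ValueError("menu entries must have 8 fields each")
    entries = []
    for i in range(0, len(fields), 8):
        pin, s, c, n, l, start, end, datatype = fields[i:i + 8]
        entries.append({
            "pin": int(pin),
            "scnl": (s, c, n, l),
            "starttime": float(start),   # seconds since 1970
            "endtime": float(end),
            "datatype": datatype,
        })
    return request_id, entries


# Invented example reply with two traces.
reply = "menu1 1 STA EHZ NC -- 100.0 200.0 s4 2 STB EHN NC 01 150.0 250.0 i2 \n"
request_id, entries = parse_menu(reply)
print(request_id, len(entries))
```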
<request id> <pin#> <s><c><n><l> F <datatype> <starttime> <sampling rate>
sample(1) sample(2)... sample(nsamples) <\n> {the samples are ASCII}
If the requested time is older than anything in the tank, the reply is:
<request id> <pin#> <s><c><n><l> FL <datatype>
For the case when the requested interval is younger than anything in the tank,
the reply is
<request id> <pin#> <s><c><n><l> FR <datatype> <youngest time in tank> <sampling rate> \n
As above, but for the specified SCNL name.
<request id> <pin#> <s><c><n><l> F <datatype> <starttime> <sampling-rate>
sample(1) sample(2)... sample(nsamples) <\n>
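The ASCII replies above could be decoded along the following lines. This is a sketch under assumptions: tokens are whitespace-separated, the header layout is the one shown above, and fill values are recognized by comparing tokens with the <fill-value> string that was sent in the request. The function name and the example reply are hypothetical.

```python
def parse_ascii_reply(reply, fill_value=None):
    """Parse an ASCII trace reply into (header dict, list of samples).

    Samples equal to fill_value are returned as None to mark filled gaps.
    """
    tokens = reply.split()
    if len(tokens) < 8:
        raise ValueError("reply too short")
    header = {
        "request_id": tokens[0].rstrip("\0"),
        "pin": int(tokens[1]),
        "scnl": tuple(tokens[2:6]),
        "flags": tokens[6],
        "datatype": tokens[7],
    }
    if header["flags"] != "F":
        # FL, FR or FG: no samples; keep any remaining fields untouched.
        header["extra"] = tokens[8:]
        return header, []
    header["starttime"] = float(tokens[8])
    header["sampling_rate"] = float(tokens[9])
    samples = [None if t == fill_value else int(t) for t in tokens[10:]]
    return header, samples


# Invented example: four samples at 100 SPS, one of them a filled gap.
header, samples = parse_ascii_reply(
    "req1 7 STA EHZ NC -- F s4 1000.0 100.0 1 2 -999 4\n", fill_value="-999")
print(header["flags"], samples)
```

Choosing a fill value that cannot occur in real data keeps filled gaps distinguishable from genuine samples.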
As above, but returns the trace data in the form in which it was circulating
within the system. The original trace data ("TYPE_TRACEBUF2" messages) spanning
the requested period will be returned in binary form. Only whole trace data
messages will be supplied, so that the actual <starttime> may be older than
requested, and the <endtime> may be younger than requested. The initial part
of the reply is ASCII, as above, and is terminated by a "\n"; the binary
"TYPE_TRACEBUF2" messages follow it. The reply is complete when the stated
number of binary bytes has been sent:
<request id> <pin#> <s><c><n><l> F <datatype> <starttime> <endtime> <bytes of binary data to follow> \n <trace_buf msg> ... <trace_buf msg>
If the requested time interval is older than anything in the tank, the reply is:
<request id> <pin#> <s><c><n><l> FL <datatype> <oldest time in tank> \n
If the requested time interval is younger than anything in the tank,
the reply is:
<request id> <pin#> <s><c><n><l> FR <datatype> <youngest time in tank> \n
For the case when the requested interval falls completely within a data gap,
the reply is:
<request id> <pin#> <s><c><n><l> FG <datatype> \n
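The raw reply formats above suggest a two-step read: parse the ASCII header up to the first newline, then take the stated number of binary bytes. The sketch below assumes the whole reply is already in memory as a bytes object. The function name is hypothetical, and the payload in the example is dummy bytes, not a real TYPE_TRACEBUF2 message.

```python
def split_raw_reply(data):
    """Split a raw reply into (header dict, binary payload bytes)."""
    header_line, sep, rest = data.partition(b"\n")
    if not sep:
        raise ValueError("header line is not terminated by a newline")
    tokens = header_line.decode("ascii").split()
    if len(tokens) < 8:
        raise ValueError("header too short")
    header = {
        "request_id": tokens[0].rstrip("\0"),
        "pin": int(tokens[1]),
        "scnl": tuple(tokens[2:6]),
        "flags": tokens[6],
        "datatype": tokens[7],
    }
    if header["flags"] != "F":
        header["extra"] = tokens[8:]     # FL, FR or FG: no binary data
        return header, b""
    header["starttime"] = float(tokens[8])
    header["endtime"] = float(tokens[9])
    nbytes = int(tokens[10])
    if len(rest) < nbytes:
        raise ValueError("binary payload is incomplete")
    return header, rest[:nbytes]


# Invented example with an 8-byte dummy payload.
raw = b"req2 7 STA EHZ NC -- F s4 1000.0 1001.0 8 \n" + b"\x00" * 8
header, payload = split_raw_reply(raw)
print(header["flags"], len(payload))
```

A socket-based client would instead read until the newline, parse the byte count, and keep receiving until that many bytes have arrived.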
Kent Lindquist then produced an enhanced version, including the idea of segmenting
the tank into one partition for each trace. Since then, several authors were
involved in wave_server development: Alex Bittenbinder wrote the main thread;
Mac McKenzie wrote the parser of the client thread (server_thread.c); Eureka
Young wrote the server routines; Dave Kragness pretty much re-wrote the main
thread while implementing crash-recovery, and Pete Lombard produced the suite
of associated client routines.
MENUSCNL: <request id> <s><c><n><l>
As above, but returns the information for specified <s><c><n><l> name:
GETPIN: <request id> <pin#> <starttime> <endtime> <fill-value>
Returns the trace data for the specified <pin#> and requested time
interval. Any data gaps within the interval are filled with <fill-value>. Only
internal gaps will be filled. No fill will be provided for any requested data
which is either before or after the range of the available period (as stated
in the MENU reply). The reply data is represented in ASCII as blank-delimited
signed integers.
GETSCN: <request id> <s><c><n><l> <starttime> <endtime> <fill-value>
GETSCNLRAW: <request id> <s><c><n><l> <starttime> <endtime>
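Request strings in the formats above could be assembled as sketched below. Several details are assumptions and not stated in this document: that each request is terminated by a newline, and that three decimal places are enough for the times. The function names and the station codes in the example are hypothetical.

```python
from datetime import datetime, timezone

def epoch_seconds(dt):
    """Convert a timezone-aware datetime to float seconds since 1970."""
    return dt.timestamp()

def menu_request(request_id):
    return f"MENU: {request_id}\n"

def getpin_request(request_id, pin, starttime, endtime, fill_value):
    return (f"GETPIN: {request_id} {pin} "
            f"{starttime:.3f} {endtime:.3f} {fill_value}\n")

def getscnlraw_request(request_id, scnl, starttime, endtime):
    s, c, n, l = scnl                   # site, channel, network, location
    return (f"GETSCNLRAW: {request_id} {s} {c} {n} {l} "
            f"{starttime:.3f} {endtime:.3f}\n")


start = epoch_seconds(datetime(1970, 1, 1, 0, 1, tzinfo=timezone.utc))
request = getscnlraw_request("r1", ("STA", "EHZ", "NC", "--"), start, start + 60)
print(request, end="")
```

The resulting string would be encoded as ASCII and written to the TCP connection described earlier; the client routines in ws_clientII already do this for most requests.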
HISTORY
The original Wave_server module was written by Will Kohler in a rather
spectacularly short time: We came in on a Monday to find Will comatose, all
waste cans full of espresso cups, and a working wave server.
The motivation was to support the effort with the Alaska Geophysical Institute
to integrate Earthworm with DataScope and to provide a playback facility for
testing real-time algorithms. Lynn Dietz then proceeded to add numerous
enhancements.
It stores all trace data messages received, and serves all trace data
received during a specified time interval. It can thus be used to recreate
the pattern of trace data messages inside an Earthworm system during a
specified period of time. This has proven to be valuable for testing and
debugging, and this module still exists as "wave_server".
TROUBLESHOOTING
From Lynn Dietz 8/07: The waveserver messages "Circular queue lapped : 3456 messages
lost." mean that waveserver is not writing data to its tanks fast enough. This
is an obvious source of gaps in your waveserver tanks. [To fix this you] can try
increasing "InputQueueLen". Also, 270 channels is a lot for one waveserver process
to handle. We usually try to limit each waveserver process to ~100 channels on
one disk dedicated to that process.
Contact: Questions? Issues? Subscribe to the Earthworm Google Groups List.