SENDFILE, GETFILE, GETFILE_EW, AND MAKEHBFILE

OVERVIEW

April 19, 2002

 

INTRODUCTION

The sendfile, getfile, getfile_ew, and makehbfile programs are used to transfer files between computers running Windows NT and/or Sun Solaris. The programs run continuously, so they are appropriate for real-time applications.

The sendfile program gets files from a "queue directory", transmits the files via a socket connection, and deletes the files from the queue directory. The getfile program receives data files from sendfile via a socket connection and saves them in specified directories. The makehbfile program periodically places a "heartbeat file" in the queue directory used by sendfile.

If the network connection between computers is down, for any reason, files will accumulate in sendfile's queue directory until the network comes back up. Then, all files will be sent immediately.

One getfile process can accept files from any number of sendfile processes running on different computers.

NOTE: Getfile, sendfile, and makehbfile are standalone programs that were developed by Will Kohler at USGS, Menlo Park. Getfile_ew is an enhanced version of getfile that can optionally run as an Earthworm module.

The standalone programs can be found as a zip file on the Earthworm ftp site in the 'contributed_software' directory. Getfile_ew is part of the Earthworm release.

WARNING: The sendfile and getfile programs do no data "translation". If binary files are transferred, the data format and byte order of the file may be incompatible with the receiving computer.
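To illustrate the byte-order warning, here is a small Python sketch (not part of the distribution). It shows the same four bytes decoding to different integers depending on the byte order the reader assumes. The value 1 and the 32-bit integer layout are arbitrary choices made for illustration.

```python
import struct

# A 32-bit integer as a big-endian machine (e.g. a SPARC system) would write it.
raw = struct.pack(">i", 1)

# Read back with the writer's byte order: the value is preserved.
as_big_endian = struct.unpack(">i", raw)[0]

# Read back assuming little-endian (e.g. an x86 system): the value is wrong.
as_little_endian = struct.unpack("<i", raw)[0]

print(as_big_endian, as_little_endian)
```

Because sendfile and getfile copy bytes unchanged, any conversion of this kind has to be done by the program that reads the file on the receiving computer.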

 

INSTALLATION and STARTUP of STANDALONE PROGRAMS

The zip distribution contains Windows and Solaris executable files for each program. Windows executable files are named <progname>.exe, and Solaris executable files are named <progname>. To install the programs, simply copy the executable files to any directory that is in the user's search path.

The programs can also be run by typing the full path name. The name of the program configuration file is specified on the command line, e.g.:

sendfile sendfile.d or getfile getfile.d

On Solaris, the programs can be run in a window, or they can be started automatically at boot time. On Windows NT, the executable (or a shortcut to the executable) can be placed in the user's startup directory. See Windows documentation for details.

 

COMPILING the STANDALONE PROGRAMS

If it's necessary to recompile a program, change the working directory to the program distribution directory. On a Windows NT system, type:

nmake /f makefile.win

On a Solaris system, type:

make -f makefile.sol

The programs were compiled using Microsoft Visual C++ 6.0 on Windows NT, and Sun C 4.2 on Solaris. All user functions are included in the distribution directories.

 

SENDFILE NOTES

The sendfile program requires a configuration file, typically named sendfile.d. A sample configuration file is included in the distribution directory and is shown below. Any line which begins with the # sign is interpreted as a comment, so feel free to add your own comment lines.

The output (queue) and log directories must exist before running the program. They should be created manually using the mkdir command (or the equivalent Windows file manager operation). On Solaris, the user must have write permission for these directories.

The TimeZone parameter is used only for time stamping entries in the log file.

#

# sendfile.d - Configuration file for the sendfile program

#

# ServerIP = IP address of computer running getfile

# ServerPort = Well known port of getfile program

# TimeOut = Send/receive will time out after TimeOut seconds.

# RetryInterval = If an error occurs, retry after this many seconds

# OutDir = Path of directory containing files to send to getfile

# LogFile = If 0, don't log to screen or disk.

# If 1, log to screen only.

# If 2, log to disk only.

# If 3, log to screen and disk.

# TimeZone = Timezone of computer system clock (ignored in Solaris)

# LogFileName = Full path name of log files. The current date will

# be appended to log file names.

#

-ServerIP 130.118.43.31

-ServerPort 3456

-TimeOut 15

-RetryInterval 60

-LogFile 3

-TimeZone GMT

# Solaris

# -------

-OutDir /home/earthworm/outdir

-LogFileName /home/earthworm/run/log/sendfile.log

# Windows

# -------

#-OutDir c:\earthworm\outdir

#-LogFileName c:\earthworm\run\log\sendfile.log
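The layout of the sample file above ("-Name value" lines, with "#" starting a comment) can be read with a few lines of code. The Python sketch below is only an illustration of that layout. The parsing rules the real programs apply (whitespace handling, duplicate names, and so on) are not described here, so the function parse_config and its behavior are assumptions.

```python
def parse_config(text):
    """Parse '-Name value' lines; lines starting with '#' are comments.

    Returns a dict mapping each name (without the leading '-') to a list
    of values, so that a name given more than once keeps every value.
    """
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        name = parts[0].lstrip("-")
        value = parts[1].strip() if len(parts) > 1 else ""
        params.setdefault(name, []).append(value)
    return params


sample = "\n".join([
    "# sendfile.d - Configuration file for the sendfile program",
    "-ServerIP   130.118.43.31",
    "-ServerPort 3456",
    "-OutDir     /home/earthworm/outdir",
    r"#-OutDir   c:\earthworm\outdir",
])
config = parse_config(sample)
print(config["ServerPort"], config["OutDir"])
```

Values are kept as lists so that the same sketch also works for files in which a parameter may appear several times.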

After the sendfile program is invoked, it waits for files to appear in the output queue directory. When it detects a file, it opens a socket connection to getfile, sends the file, and then deletes the file from the queue directory.
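The send-then-delete cycle can be sketched in Python as follows. This is not the sendfile source code. The function drain_queue is hypothetical, and the transport is passed in as a send callable because the wire protocol between sendfile and getfile is not described in this document.

```python
import os


def drain_queue(queue_dir, send):
    """Send every regular file in queue_dir, oldest name first.

    A file is deleted only after send(name, data) returns without raising.
    Returns the list of file names that were sent.
    """
    sent = []
    for name in sorted(os.listdir(queue_dir)):
        path = os.path.join(queue_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            data = f.read()
        try:
            send(name, data)
        except OSError:
            # Network problem: leave this file and the rest in the queue
            # so they can be sent on a later pass.
            break
        os.remove(path)
        sent.append(name)
    return sent
```

A caller would run drain_queue in a loop and sleep between passes, for example for RetryInterval seconds after a failure. Files that could not be sent stay in the queue directory, which matches the behavior described in the introduction for a network outage.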

Under Windows NT, if a file is copied or moved to the queue directory using the "copy" or "move" command (recommended), sendfile will wait until the file is completely transferred to the queue directory before it is sent to getfile. This prevents a partial file from being sent. The Windows version also uses the _fsopen() C run-time library function, which provides file locking. Sendfile waits until all other processes are done with a file before accessing it.

The Solaris version of sendfile works somewhat differently. If a file is cp'd to the queue directory, it's possible that sendfile might not transfer the whole file. So, I don't recommend using the cp command for this purpose. The preferred technique is to create a complete file in another directory in the same disk partition as the queue directory. Then create a hard link to the file in the queue directory using the command:

ln <file> <queue directory>

After the file has been transferred by sendfile, sendfile will remove the link, which leaves the original file intact.

An alternative is to create the file in another directory on the same disk partition and move it to the queue directory using the mv command. After the file has been transferred by sendfile, sendfile will erase the file.
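A program that produces files for sendfile can apply the same create-then-move technique itself. The Python sketch below is one way to do it. The function enqueue and the staging directory are hypothetical names, and both directories are assumed to be on the same disk partition.

```python
import os


def enqueue(data, name, staging_dir, queue_dir):
    """Write the complete file in staging_dir, then move it into queue_dir."""
    staged = os.path.join(staging_dir, name)
    with open(staged, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the contents are on disk first
    queued = os.path.join(queue_dir, name)
    # On POSIX systems a rename within one filesystem is atomic, so a reader
    # of queue_dir sees either no file or the complete file, never a part.
    os.replace(staged, queued)
    return queued
```

To keep the original file, as with the ln command, os.link(staged, queued) could be used in place of os.replace. In that case removing the queue entry leaves the staged copy intact.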

 

GETFILE NOTES

The getfile program requires a configuration file, typically named getfile.d. A sample configuration file is included in the distribution directory and is shown below. Any line which begins with the # sign is interpreted as a comment, so feel free to add your own comment lines.

The input, temporary, and log directories must exist before running the program. They should be created manually using the mkdir command (or the equivalent Windows file manager operation). On Solaris, the user must have write permission for these directories.

The TimeZone parameter is used only for time stamping entries in the log file.

SAMPLE CONFIGURATION FILE:

#

# getfile.d - Configuration file for the getfile program

#

# ServerIP = IP address of this computer's host adapter

# ServerPort = Well-known port used by this program

# TimeOut = Send/receive will time out after TimeOut seconds.

# LogFile = If 0, don't log to screen or disk.

# If 1, log to screen only.

# If 2, log to disk only.

# If 3, log to screen and disk.

# TimeZone = Timezone of computer system clock (ignored in Solaris)

# LogFileName = Full path name of log files. The current date will

# be appended to log file names.

# TempDir = Full path name of directory to contain temporary files.

# This directory must be on the same partition as the

# client directories.

# Client = IP address of trusted client, followed by the path of

# directory to contain files sent by this client.

# Connections from all other IP addresses are rejected.

# The client directories must be on the same disk

# partition as is TempDir.

#

-ServerIP 130.118.43.31

-ServerPort 3456

-TimeOut 15

-LogFile 3

-TimeZone GMT

-LogFileName /home/earthworm/run/log/getfile.log

-TempDir /home/earthworm/indir

-Client 130.118.43.36 /home/earthworm/indir/billy

-Client 130.118.43.38 /home/earthworm/indir/campbell

END OF CONFIGURATION FILE

The getfile program accepts connections from any number of sendfile processes on remote systems. Only one socket connection at a time is allowed. If a sendfile process wants to connect while the getfile connection is busy, getfile will queue the connection request. Getfile writes data from the socket connection to a file in the temporary directory. After the data transfer from the socket is complete, getfile moves the file from the temporary directory to that client's directory. This ensures that all files in the client directories are complete.
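The receive side described above (accept only trusted clients, write to the temporary directory, then move the finished file) can be sketched in Python. This is not the getfile source code. The function store_file and the clients table are hypothetical, and the socket handling is left out because the wire protocol is not described in this document.

```python
import os


def store_file(client_ip, name, data, temp_dir, clients):
    """Save data received from client_ip, or return None if it is rejected.

    clients maps a trusted IP address to its directory, mirroring the
    -Client lines of the configuration file.
    """
    client_dir = clients.get(client_ip)
    if client_dir is None:
        return None  # not a trusted client: reject
    name = os.path.basename(name)  # never let a sender choose the directory
    temp_path = os.path.join(temp_dir, name)
    with open(temp_path, "wb") as f:
        f.write(data)
    final_path = os.path.join(client_dir, name)
    # A move within one partition is a rename, so the file appears in the
    # client directory only once it is complete.
    os.replace(temp_path, final_path)
    return final_path
```

The final move is a plain rename only when both directories are on the same partition, which is why the configuration file requires TempDir and the client directories to share one.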

 

GETFILE_EW NOTES

Getfile_ew provides the same functionality as getfile, but can also run as an Earthworm module. The configuration file is the same except for the addition of the Earthworm-specific commands:

#

# Optional Earthworm Setup

# If present, this module will attach to the OutRing and

# beat its heart as prescribed. Otherwise, the module

# works as a standalone program

#

MyModuleId MOD_ARC2TRIG # module id for this instance of getfile_ew

OutRingName HYPO_RING # shared memory ring for output

EwLogSwitch 1 # 0 to turn off disk log file; 1 to turn it on

# 2 to log to module log but not to stderr/stdout

HeartBeatInterval 30 # seconds between heartbeats

 

MAKEHBFILE NOTES

The makehbfile program makes heartbeat files and places them in a specified directory. The output directory should be the same as the sendfile queue directory, and it must exist before running makehbfile.

The makehbfile program requires a configuration file, typically named makehbfile.d. A sample configuration file is included in the distribution directory and is shown below. Any line which begins with the # sign is interpreted as a comment, so feel free to add your own comment lines.

SAMPLE CONFIGURATION FILE:

#

# makehbfile.d - Configuration file for the makehbfile program

#

# Path = Name of directory to contain heartbeat files.

# HbName = Name of the heartbeat file (without path)

# Interval = Create a heartbeat file every "Interval" seconds

#

-HbName hb.hollyhock

-Interval 60

# Solaris

-Path /home/earthworm/outdir

# Windows

#-Path c:\earthworm\outdir

END OF CONFIGURATION FILE

All heartbeat files will be created with the same name, as specified in the makehbfile configuration file. If a heartbeat file already exists in the output directory, it will be overwritten by the most recent heartbeat file.

All heartbeat files contain a sequence of ASCII characters representing the computer system time, in seconds since midnight, Jan 1, 1970.
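A process on the receiving computer can use this timestamp to detect a sender that has gone quiet. The Python sketch below writes and reads a heartbeat file in the form described above. The exact text makehbfile writes (whole seconds or fractions, trailing newline or not) is not specified here, so a plain decimal integer is assumed, and both function names are hypothetical.

```python
import os
import time


def write_heartbeat(path, hb_name, now=None):
    """Create or overwrite the heartbeat file with the time in seconds."""
    now = time.time() if now is None else now
    full_path = os.path.join(path, hb_name)
    with open(full_path, "w") as f:
        f.write(str(int(now)))  # ASCII seconds since Jan 1, 1970
    return full_path


def heartbeat_age(full_path, now=None):
    """Return how many seconds old the timestamp in the heartbeat file is."""
    now = time.time() if now is None else now
    with open(full_path) as f:
        return now - int(f.read().strip())
```

With an Interval of 60 seconds, a monitor might treat an age of several intervals as a sign that the sender or the network is down. The threshold is a local choice and is not set by these programs.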

 

Please direct questions and bug reports to:

Will Kohler

U.S. Geological Survey

Mail Stop 977

345 Middlefield Rd

Menlo Park, CA 94025

phone: (650) 329-4761

email:
