
FEP - Format Use by a Researcher - Eduardo Santiago - binary

Eduardo Santiago
Los Alamos National Laboratory
 

1. Format (Format System) Identification

raw binary

2. Original Motivation

Some projects distribute telemetry data streams in raw, binary form.

Although I have made this work, the result has never been satisfactory.

3. Data Types

Processing Level: Level 0 and 1.

Object Types: Telemetry (containing Time Series, Multidimensional, Spectra).

4. Support

When receiving unformatted binary files (I never generate or distribute any), it is necessary to either (1) use code supplied by the organization distributing the data, or (2) write my own from specifications.

When dealing with supplied software, it has often been necessary to fix endianness-related bugs. We also have to trust that the developer included proper error checking.

When writing my own code, a detailed and correct specification of the data layout is critical. It is often surprisingly difficult to obtain this specification!

Furthermore, telemetry data streams can be interrupted or corrupted in an endless variety of ways, making it necessary to write enormous amounts of error-checking code: sync-word alignment, integrity checks, and so on. No level of support can possibly define all the failure modes.
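
To give a flavor of the sync-word and integrity-check code involved, here is a minimal sketch in Python. The packet layout is purely an assumption made up for illustration (a 4-byte sync marker, a 2-byte big-endian payload length, the payload, then a 1-byte additive checksum); it is not the layout of any particular mission, and the names SYNC, HEADER, checksum and extract_packets are hypothetical.

```python
import struct

# Assumed layout, for illustration only:
#   4-byte sync marker | 2-byte big-endian payload length | payload | 1-byte checksum
SYNC = bytes.fromhex("1ACFFC1D")
HEADER = struct.Struct(">H")


def checksum(payload):
    """Assumed integrity check: sum of the payload bytes modulo 256."""
    return sum(payload) & 0xFF


def extract_packets(buf):
    """Scan a byte buffer and return (packets, skipped).

    packets is a list of payloads whose checksum matched; skipped counts the
    bytes thrown away while hunting for the next usable sync marker.
    """
    packets = []
    skipped = 0
    pos = 0
    while True:
        start = buf.find(SYNC, pos)
        if start < 0:
            # No further sync marker: everything left over is unusable.
            skipped += len(buf) - pos
            break
        skipped += start - pos
        body = start + len(SYNC)
        if body + HEADER.size > len(buf):
            # Stream ended inside the header.
            skipped += len(buf) - start
            break
        (length,) = HEADER.unpack_from(buf, body)
        payload_start = body + HEADER.size
        end = payload_start + length + 1  # +1 for the checksum byte
        payload = buf[payload_start:end - 1]
        if end <= len(buf) and checksum(payload) == buf[end - 1]:
            packets.append(bytes(payload))
            pos = end
        else:
            # Bad checksum or truncated packet: do not trust the length field.
            # Step one byte past this marker and hunt for the next one.
            skipped += 1
            pos = start + 1
    return packets, skipped
```

Note that a failed packet is never skipped by its claimed length, since the length field may itself be corrupt. Even this toy version needs a decision for every failure path, which is the point of the paragraph above.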

5. Software

Hoo boy. I've written a good deal of code to read in all sorts of binary formats. Among the problems my code has to deal with are:

  • Making sure my code is endianness-independent and will read a data stream correctly on any architecture.
  • Dealing with, and converting, VAX G_FLOAT or other byzantine non-IEEE floating-point representations.
  • Encountering bad checksums in telemetry packets, and/or losing packet synchronization in mid-stream.
  • Handling unexpected EOFs (similar to the previous item).
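
As an illustration of the first, second, and last items, here is a rough Python sketch. The record layout (a big-endian uint32 counter followed by an IEEE float64) is a made-up assumption, and the names TruncatedRecord, read_exact, read_record and vax_g_to_float are hypothetical. The G_FLOAT conversion assumes the usual VAX layout of four little-endian 16-bit words, with the sign, an 11-bit excess-1024 exponent, and the top fraction bits in the first word; check it against the actual specification before relying on it.

```python
import io
import math
import struct


class TruncatedRecord(EOFError):
    """Raised when the stream ends in the middle of a record."""


def read_exact(stream, n):
    """Read exactly n bytes, or raise instead of silently returning fewer."""
    data = stream.read(n)
    if len(data) != n:
        raise TruncatedRecord(f"wanted {n} bytes, got {len(data)}")
    return data


def read_record(stream):
    """Read one assumed record: uint32 counter + IEEE float64, big-endian.

    The '>' prefix pins the byte order and turns off native alignment
    padding, so the result does not depend on the host architecture.
    """
    counter, value = struct.unpack(">Id", read_exact(stream, 12))
    return counter, value


def vax_g_to_float(raw):
    """Convert 8 bytes of VAX G_FLOAT (as stored on the VAX) to a float."""
    if len(raw) != 8:
        raise ValueError("G_FLOAT is 8 bytes")
    # Four little-endian 16-bit words; the first word is the most significant.
    w0, w1, w2, w3 = struct.unpack("<4H", raw)
    bits = (w0 << 48) | (w1 << 32) | (w2 << 16) | w3
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if exponent == 0:
        if sign:
            raise ValueError("VAX reserved operand")
        return 0.0
    # Hidden leading bit; the mantissa is 0.1fff... in binary, i.e. in [0.5, 1).
    mantissa = (1 << 52) | fraction
    value = math.ldexp(mantissa, exponent - 1024 - 53)
    return -value if sign else value
```

For example, `read_record(io.BytesIO(data))` raises TruncatedRecord if `data` holds fewer than 12 bytes, rather than handing back a half-filled record. The smallest G_FLOAT magnitudes fall below the IEEE normal range, so a conversion like this can lose precision at that extreme.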

6. Environment

UNIX-only (Linux, Solaris) with GNU tools.

7. Usage

Once I've written code to read in and handle raw binary files for a project, all incoming data files are immediately read in and their contents stored in HDF or CDF files. No user-visible code ever again accesses the raw binary files.

8. Experience

 >Relative to its ability to carry and manage research-needed metadata

Ugh. I wouldn't want to think about it.

 >Relative to its related software

No comparison. Its only strength is that it's easy for some bonehead to write a one-line FORTRAN statement to dump data. The resulting headache for the rest of the universe, of course, is then Somebody Else's Problem.

9. Desired Functionality

Nothing will ever make raw binary format palatable.

10. Selection Criteria

See my comments elsewhere.

11. Impact on Research

See my comments elsewhere.

Raw binary format is particularly cumbersome to work with because it requires a significant investment, on the order of weeks, to develop robust code that will read the data files.

12. Other Comments

Granted, raw binary is pretty much your only bet for a telemetry data stream. Nobody can argue otherwise.

However, it really should not make it past the ground station. If N people (teams, institutions, whatever) have to write code to interpret the data stream, deal with errors, and so on, there will be N different interpretations across the planet. Bad news.

The proper thing to do is have one team process the telemetry, save it in a robust file format (whatever that may be), and distribute that as a product.


URL: http://ssdoo.gsfc.nasa.gov/nost/fep/researcher-santiago-binary.html

A service of NOST at NSSDC.

Author: Eduardo Santiago / Los Alamos National Laboratory / Deep Space One/PEPE, LENA-P (esm@lanl.gov) +1 505/665-3130
Curator: John Garrett (John.Garrett@gsfc.nasa.gov) +1.301.286.3575
NASA Official: Code 633.2 / Don Sawyer (Don.Sawyer@gsfc.nasa.gov) +1.301.286.2748
Last Revised: 1999-12-15 T19:32:23, Eduardo Santiago