
FEP - Format Use by an Archive - HEASARC - FITS

Thomas McGlynn
High Energy Science Archive Research Center (HEASARC), NASA/GSFC
 

1. Archive Identification

The HEASARC is the primary NASA archive for high energy astronomy data and includes more than 1 TB of data from over a dozen distinct missions.

1.1 Nature of Ingest Activity

There are two typical ingest modes for the HEASARC archive. For current missions, data is pushed into the archive by various mission groups with distinct schedules for each active mission (case 1 above).

For older missions, the HEASARC internally reprocesses data into modern formats and releases it into its public archive when reprocessing is completed. Since the HEASARC is itself the source of the information, this may be considered case 2.

1.2 Nature of User Community

The intent of the HEASARC is to ensure the long-term viability of the data and to make the data as independently usable as feasible. The HEASARC has worked to build community standards for high-energy astronomy data so that the contents of the HEASARC archive will be immediately usable by the astronomy community.

1.3 Nature of Archive

The HEASARC is intended as a permanent archive.

2. Format (Format System) Identification

Flexible Image Transport System (FITS)

The FITS format definition is available at http://archive.stsci.edu/fits/fits_standard.

3. Format Selection Rationale

FITS has been adopted as a standard by the International Astronomical Union and is virtually universally readable by astronomical data processing systems. Usage of FITS was mandated by NASA for appropriate astronomy data. Thus it was clear that FITS would be used as our primary format for dissemination of data.

Initially there was substantial concern that the FITS format would be inefficient for internal HEASARC usage, in particular because the HEASARC used little-endian hardware and FITS uses a big-endian binary representation. However, measurements showed that conversion overheads were generally insignificant. In practice, the use of a format with no architecture dependencies has been a tremendous advantage in allowing our diverse community to interact with the HEASARC.
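The byte-order conversion at issue can be illustrated with a minimal sketch in Python. This is not HEASARC code, and the function names are hypothetical. It packs 32-bit integers in the big-endian order that FITS prescribes and reads them back with the same result on any host.

```python
import struct
import sys


def encode_int32_big_endian(values):
    """Pack integers as 32-bit big-endian values, the byte order FITS uses."""
    return struct.pack(">%di" % len(values), *values)


def decode_int32_big_endian(raw):
    """Unpack 32-bit big-endian integers; the result does not depend on the host."""
    count = len(raw) // 4
    return list(struct.unpack(">%di" % count, raw[:count * 4]))


if __name__ == "__main__":
    raw = encode_int32_big_endian([1, -2, 65536])
    # The stored bytes are identical on every machine; only the host's native
    # order (reported by sys.byteorder) differs.
    print(sys.byteorder, raw.hex(), decode_int32_big_endian(raw))
```

On a little-endian host the explicit ">" prefix is what triggers the byte swap; on a big-endian host it is a no-op, so the same file can be read unchanged on both.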

4. Roles of Format

FITS is used exclusively throughout all stages of our archive with minor exceptions for occasional ASCII files and GIF preview images.

1. Submission:

Data from current missions is normally provided in FITS or is immediately transformed using appropriate tools. Conversion of data to FITS is a major element of preparing data from older missions for submission into the archive.

FITS metadata is frequently used as the source of information to update HEASARC catalogs.
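As an illustration of how header metadata might feed a catalog record, the following Python sketch parses 80-character header cards into a dictionary. It is a simplified reading of the fixed card layout (keyword in columns 1-8, "= " in columns 9-10, optional comment after "/"), not the HEASARC's ingest code. The function names and the sample values are invented for the example, and features such as continued strings and "D" exponents are not handled.

```python
CARD_LEN = 80


def parse_card(card):
    """Split one 80-character header card into (keyword, value)."""
    keyword = card[:8].strip()
    if card[8:10] != "= ":
        return keyword, None  # commentary card such as COMMENT, HISTORY or END
    field = card[10:].strip()
    if field.startswith("'"):
        # Quoted string: a doubled quote stands for a literal quote character.
        chars, i = [], 1
        while i < len(field):
            if field[i] == "'":
                if field[i + 1:i + 2] == "'":
                    chars.append("'")
                    i += 2
                    continue
                break
            chars.append(field[i])
            i += 1
        return keyword, "".join(chars).rstrip()
    text = field.split("/", 1)[0].strip()  # drop the trailing comment
    if text in ("T", "F"):
        return keyword, text == "T"
    for convert in (int, float):
        try:
            return keyword, convert(text)
        except ValueError:
            pass
    return keyword, text


def header_to_record(header):
    """Collect keyword/value pairs from a header string up to the END card."""
    record = {}
    for start in range(0, len(header), CARD_LEN):
        keyword, value = parse_card(header[start:start + CARD_LEN])
        if keyword == "END":
            break
        if value is not None:
            record[keyword] = value
    return record


if __name__ == "__main__":
    cards = [
        "TELESCOP= 'ROSAT   '           / mission name",
        "EXPOSURE=               1234.5 / seconds (made-up value)",
        "END",
    ]
    print(header_to_record("".join(c.ljust(CARD_LEN) for c in cards)))
```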

2. Long term storage

The stability and architecture-independence of FITS were major drivers in adopting FITS as an internal format as well as the primary format for dissemination of data.

FITS does not have internal support for compression. The HEASARC archive uses .Z and .gz compressed files, but this means that the FITS metadata is not accessible without decompression.
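The cost of external compression can be seen in a short Python sketch: the header of a gzip-compressed file is only reachable by running the decompressor, although it need only run as far as the first 2880-byte block. The helper name is hypothetical, and only .gz is covered here, since the Python standard library has no reader for .Z (Unix compress) files.

```python
import gzip
import io

BLOCK_LEN = 2880
GZIP_MAGIC = b"\x1f\x8b"


def read_first_block(fileobj):
    """Return the first 2880 bytes of a FITS stream, gunzipping if necessary."""
    magic = fileobj.read(2)
    fileobj.seek(0)
    if magic == GZIP_MAGIC:
        # The header cannot be read from the raw bytes: the stream has to be
        # decompressed, although only as far as the first block.
        with gzip.GzipFile(fileobj=fileobj, mode="rb") as unzipped:
            return unzipped.read(BLOCK_LEN)
    return fileobj.read(BLOCK_LEN)


if __name__ == "__main__":
    cards = ["SIMPLE  =                    T", "END"]
    plain = "".join(c.ljust(80) for c in cards).ljust(BLOCK_LEN).encode("ascii")
    packed = gzip.compress(plain)
    print(packed[:6])                                 # compressed bytes, not a header
    print(read_first_block(io.BytesIO(packed))[:30])  # the SIMPLE card
```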

3. Dissemination and reprocessing

The HEASARC uses FITS for dissemination and all internal processing of the archive. The ability to use the same binary files on an evolving and heterogeneous network is crucial.

5. Data Structures Supported

FITS has three basic formats:

  • Multidimensional arrays of a variety of primitive types
  • ASCII tables
  • Binary tables.

For binary tables the contents of a given column can be a multidimensional, or variable length, array.
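As a rough illustration, each binary table column declares its element type and repeat count in a TFORMn keyword, for example "16E" for sixteen single-precision floats or "1PE(100)" for a variable-length array of floats. The Python sketch below decodes such values. The type codes and sizes follow the current FITS standard as best understood here; the function name and the returned dictionary layout are assumptions made for the example.

```python
import re

# Bytes per element for the fixed-size type codes (bit arrays, code X,
# are handled separately because they are packed eight to a byte).
TYPE_BYTES = {"L": 1, "B": 1, "I": 2, "J": 4, "K": 8,
              "A": 1, "E": 4, "D": 8, "C": 8, "M": 16}

TFORM_RE = re.compile(r"^(\d*)([LXBIJKAEDCMPQ])(.*)$")


def describe_tform(tform):
    """Describe one binary table column from its TFORMn value."""
    match = TFORM_RE.match(tform.strip())
    if match is None:
        raise ValueError("unrecognized TFORM value: %r" % tform)
    repeat = int(match.group(1)) if match.group(1) else 1
    code, rest = match.group(2), match.group(3)
    if code in "PQ":
        # Variable-length array: the row holds only a descriptor (length and
        # heap offset); the elements themselves live in the heap area.
        descriptor_bytes = 8 if code == "P" else 16
        return {"kind": "variable", "element": rest[:1],
                "row_bytes": repeat * descriptor_bytes}
    if code == "X":
        row_bytes = (repeat + 7) // 8
    else:
        row_bytes = repeat * TYPE_BYTES[code]
    return {"kind": "fixed", "element": code, "repeat": repeat,
            "row_bytes": row_bytes}


if __name__ == "__main__":
    for tform in ("16E", "J", "1PE(100)"):
        print(tform, describe_tform(tform))
```

A fixed-length column can additionally be given a multidimensional shape through the TDIMn keyword, which this sketch does not parse.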

Recursive structures are not supported.

The HEASARC uses all these formats but the binary table format is the most flexible and is used for event, spectral, timing and other data types. Basic images are frequently stored using the simpler multi-dimensional array format.
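For the simple array format, the size of the data follows directly from the BITPIX and NAXISn header keywords. The Python sketch below computes it, along with the size on disk after rounding up to whole 2880-byte blocks. The function names are hypothetical, and the list of permitted BITPIX values is taken from the current FITS standard.

```python
from math import prod

BLOCK_LEN = 2880
VALID_BITPIX = (8, 16, 32, 64, -32, -64)  # negative values denote floating point


def image_data_bytes(bitpix, axes):
    """Bytes of pixel data for an array with the given BITPIX and NAXISn values."""
    if bitpix not in VALID_BITPIX:
        raise ValueError("invalid BITPIX: %r" % bitpix)
    if not axes:
        return 0  # NAXIS = 0 means the unit carries no data array
    return abs(bitpix) // 8 * prod(axes)


def padded_length(nbytes):
    """Round a byte count up to a whole number of 2880-byte blocks."""
    return -(-nbytes // BLOCK_LEN) * BLOCK_LEN


if __name__ == "__main__":
    size = image_data_bytes(-32, [512, 512])
    print(size, padded_length(size))
```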

6. Support

While FITS provides a syntax for writing metadata, the semantics of the metadata are largely unspecified. The HEASARC has worked with a number of high-energy astronomy institutions to build a consensus on the specific keywords and vocabulary to use to describe FITS data.

The HEASARC has been a primary developer of resources for reading and writing FITS data. The FITSIO library developed by W. Pence at the HEASARC is a widely adopted library for reading FITS in C and Fortran programs. Other libraries have been developed at the HEASARC to use FITS in IDL and Java.

The astronomical community as a whole has been extremely supportive of the HEASARC's FITS efforts.

7. Software

Internally the HEASARC uses its FTOOLS packages for processing FITS files for archive ingest.

Some missions use IDL to generate FITS images for archive ingest.

Most astronomical software packages -- and even a few generic image display packages -- support FITS though some may not support the full range of FITS formats.

8. Desired Functions

Three major issues that FITS fails to address adequately are: specification of metadata semantics, support for compression, and an agreed standard for how one FITS file might refer to another. A proposal for the last has been made but it is unclear how successful it will be.

FITS is also a rather old standard and was explicitly designed to support transport by magnetic tape. It has a number of archaic restrictions, e.g., the need to pad FITS elements to 2880-byte boundaries and the 8-character limit on FITS keywords. While these can be worked around -- and needn't be visible to the high-level software -- they are irksome.
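The two restrictions named above can be made concrete with a small Python sketch that formats header cards and pads a header or data unit to a 2880-byte boundary. It is a simplified illustration rather than a complete writer: the function names are hypothetical, and only logical, numeric and short string values are handled.

```python
CARD_LEN = 80
BLOCK_LEN = 2880


def make_card(keyword, value, comment=""):
    """Format one 80-character header card in the fixed FITS layout."""
    if len(keyword) > 8:
        raise ValueError("FITS keywords are limited to 8 characters: %r" % keyword)
    if isinstance(value, bool):
        field = ("T" if value else "F").rjust(20)    # logical in column 30
    elif isinstance(value, (int, float)):
        field = repr(value).upper().rjust(20)        # right-justified to column 30
    else:
        # Strings are quoted, with embedded quotes doubled, and blank-padded
        # to at least eight characters.
        field = "'%s'" % str(value).replace("'", "''").ljust(8)
    card = "%-8s= %s" % (keyword.upper(), field)
    if comment:
        card += " / " + comment
    if len(card) > CARD_LEN:
        raise ValueError("card exceeds 80 characters")
    return card.ljust(CARD_LEN)


def build_header(cards):
    """Append the END card and pad the header with blanks to a block boundary."""
    text = "".join(cards) + "END".ljust(CARD_LEN)
    return (text + " " * (-len(text) % BLOCK_LEN)).encode("ascii")


def pad_data(raw):
    """Pad array or binary table data with zero bytes to a block boundary."""
    return raw + b"\x00" * (-len(raw) % BLOCK_LEN)


if __name__ == "__main__":
    header = build_header([make_card("SIMPLE", True),
                           make_card("BITPIX", 16, "bits per pixel"),
                           make_card("NAXIS", 0)])
    print(len(header))  # a multiple of 2880
```

A header of only three cards plus END still occupies a full 2880-byte block, which is the kind of overhead the padding rule imposes on small files.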

9. Other Comments

Comment on this template in the HyperNews Discussion.

 


URL: http://ssdoo.gsfc.nasa.gov/nost/fep/archive-heasarc-fits.html

A service of NOST at NSSDC.

Author: Thomas McGlynn / High Energy Science Archive Research Center (HEASARC), NASA/GSFC / (tam@silk.gsfc.nasa.gov) 301-286-7743
Curator: John Garrett (John.Garrett@gsfc.nasa.gov) +1.301.286.3575
NASA Official: Code 633.2 / Don Sawyer (Don.Sawyer@gsfc.nasa.gov) +1.301.286.2748
Last Revised: 1999-06-29T15:06:05, Thomas McGlynn (1999-08-04, John Garrett)