FEP - Format Developer - XDF

Ed Shaya
ADC/GSFC/NASA/RITSS
ADC Archive
 
Comment on this template in the HyperNews Discussion.  

1. History and Philosophy

XDF was developed in a research project that studied how converting the repository of the Astronomical Data Center (ADC) to XML would improve archiving and searching. The ADC holds thousands of published tables, images, and spectra, as well as detailed metadata on publication histories. Part of the ADC website provides visualization and browse facilities, so one key requirement was that the XML data format be standardized enough for an application to display data with axis and/or coordinate information. Another key requirement was support for complex data structures, so that all of the information regarding a single project could be logically connected.

1.1 Identification

eXtensible Data Format

1.2 Purpose

A simple XML-based format for scientific data.

1.3 User Community and Sponsoring Organization

XDF is currently in use only at the ADC, but it has been introduced to other astronomical data centers through the ADCCC (Astrophysics Data Centers Coordinating Council) and to the Earth Sciences at GSFC.

1.4 Format Evolution

The evolution process is not yet determined, but initially there will be discussion groups at GSFC (Codes 600 and 900 and the ADCCC). Breakout groups at the annual ADASS meeting are also expected.

2. Conceptual Model

The central object is the array, which contains its data together with complete axis information. A structure can contain several related arrays. The values associated with each step along a given dimension can be stated explicitly, held in a file that is pointed to by an entity reference, or calculated from a functional form.
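
The relationship between structures, arrays, and axes can be sketched in Python as follows. This is an illustration only: the class names `Axis`, `Array`, and `Structure`, their fields, and the sample values are assumptions and do not reproduce the element names or attributes of the XDF DTD. Axis values held in an external file are not modelled here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Axis:
    """One dimension of an array (illustrative, not the XDF element model)."""
    name: str
    size: int
    values: Optional[Sequence[float]] = None          # explicitly stated values
    formula: Optional[Callable[[int], float]] = None  # functional form of the step index

    def value_at(self, step: int) -> float:
        if not 0 <= step < self.size:
            raise IndexError(f"step {step} outside axis {self.name!r}")
        if self.values is not None:
            return self.values[step]
        if self.formula is not None:
            return self.formula(step)
        return float(step)  # no values given: fall back to the bare index


@dataclass
class Array:
    """Data plus the axes needed to interpret it."""
    name: str
    axes: List[Axis]
    data: List[float] = field(default_factory=list)


@dataclass
class Structure:
    """A container that groups related arrays."""
    name: str
    arrays: Dict[str, Array] = field(default_factory=dict)


# Hypothetical example: two epochs of a four-point spectrum.
wavelength = Axis("wavelength", 4, formula=lambda i: 4000.0 + 10.0 * i)
epoch = Axis("epoch", 2, values=[2451545.0, 2451546.0])
spectra = Array("flux", [epoch, wavelength], data=[0.0] * 8)
project = Structure("example", {"flux": spectra})
```

Here the `wavelength` axis is computed from a function of the step index, while the `epoch` axis lists its values explicitly; both answer the same `value_at` query.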

Read statements specify the order in which the numbers appear in the data block. This removes any ambiguity about storage order.
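
As a minimal sketch of why an explicit read order removes ambiguity, the Python below enumerates axis indices in the order that values would appear in a flat data block. The function name `index_order` and the representation of the read order as a list of axis names (outermost loop first) are assumptions made for illustration; they are not XDF syntax.

```python
from itertools import product
from typing import Dict, Iterator, List


def index_order(sizes: Dict[str, int], read_order: List[str]) -> Iterator[Dict[str, int]]:
    """Yield one {axis name: index} mapping per value in the data block.

    read_order lists the axes from the outermost loop to the innermost,
    so the last axis named varies fastest.
    """
    ranges = [range(sizes[name]) for name in read_order]
    for combo in product(*ranges):
        yield dict(zip(read_order, combo))


sizes = {"x": 2, "y": 3}
row_major = list(index_order(sizes, ["y", "x"]))     # x varies fastest
column_major = list(index_order(sizes, ["x", "y"]))  # y varies fastest
```

The same six numbers map to different (x, y) cells under the two orders, which is why the order has to be stated alongside the data.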

A data block can be specified internally, in a single external file, or in multiple external files. The data can also be calculated from a function expressed in standard Java or ECMAScript (JavaScript).
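
The four kinds of data source can be pictured with the following Python sketch. It is a rough illustration under stated assumptions: the function `load_values`, the `Source` type, and the convention of whitespace-separated numbers are all hypothetical, and a Python callable stands in for the Java or ECMAScript function that XDF would use.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Sequence, Union

# Hypothetical source descriptor: inline text, one file, several files, or a function.
Source = Union[str, Path, Sequence[Path], Callable[[], Iterable[float]]]


def load_values(source: Source) -> List[float]:
    """Return the numbers of one data block from a hypothetical source descriptor."""
    if callable(source):                 # values computed from a function
        return [float(v) for v in source()]
    if isinstance(source, str):          # values written inline in the document
        return [float(token) for token in source.split()]
    if isinstance(source, Path):         # a single external file
        return load_values(source.read_text())
    values: List[float] = []             # several external files, read in order
    for path in source:
        values.extend(load_values(Path(path)))
    return values


inline = load_values("1 2.5 3")
computed = load_values(lambda: (i * i for i in range(3)))
```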

The document type definition (DTD) is given at: http://tarantella.gsfc.nasa.gov/xml/XDF_DTD.txt

3. Format Details

  • Relation to Hardware and Media Portability

    An advantage of XML is that it is widely portable.

  • Primitive data types supported

    It is hoped that XDF can be used to wrap a very wide set of data files.

    It supports text: fixed-width fields, XHTML markup, and delimited data with an arbitrary delimiter.

    It supports binary external data files:

    • big-endian and little-endian byte order
    • type = (boolean | string | integer | float)
    • signed = (yes | no)
    • number of bits = (1 | 2 | 4 | 8 | 16 | 32 | 64 | 80 | 128)

  • Describe how software recognizes that it is working with data in this format - e.g. Magic numbers

    No scheme has been worked out for this, other than standard XML validation.
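
The text and binary descriptions listed under "Primitive data types supported" can be illustrated with a short Python sketch. The function names and parameters below are assumptions made for this example, not part of XDF, and only the bit widths that Python's `struct` module handles directly (8, 16, 32, and 64) are covered.

```python
import struct
from typing import List, Sequence

# (type, signed, bits) -> struct format character for the directly supported cases.
_CODES = {
    ("integer", True, 8): "b",  ("integer", False, 8): "B",
    ("integer", True, 16): "h", ("integer", False, 16): "H",
    ("integer", True, 32): "i", ("integer", False, 32): "I",
    ("integer", True, 64): "q", ("integer", False, 64): "Q",
    ("float", True, 32): "f",   ("float", True, 64): "d",
}


def decode_binary(raw: bytes, kind: str, bits: int,
                  signed: bool = True, big_endian: bool = True) -> list:
    """Decode a run of equally sized binary values from raw bytes."""
    code = _CODES[(kind, signed, bits)]
    width = bits // 8
    count = len(raw) // width            # ignore any trailing partial value
    prefix = ">" if big_endian else "<"  # explicit byte order, standard sizes
    return list(struct.unpack(f"{prefix}{count}{code}", raw[:count * width]))


def split_fixed_width(line: str, widths: Sequence[int]) -> List[str]:
    """Cut a text record into fields of the given character widths."""
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields


def split_delimited(line: str, delimiter: str) -> List[str]:
    """Split a text record on an arbitrary delimiter string."""
    return [part.strip() for part in line.split(delimiter)]


raw = b"\x00\x01\xff\xfe"
as_big_signed = decode_binary(raw, "integer", 16)
as_little_unsigned = decode_binary(raw, "integer", 16, signed=False, big_endian=False)
```

The same four bytes decode to different numbers depending on byte order and signedness, which is why the format records both for external binary files.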

4. Uses

XDF is particularly suited to long-term archiving because XML is self-describing. It is also well suited to science-level data (scalar and vector fields, tables, animations, imaging spectroscopy). It is easy to update because it can be edited as text. XDF is extensible by enlarging the XML DTD or Schema.

5. Format Developer Software

The ADC is generating tables in XDF by extracting data from publishers' SGML intended for journals or textbooks. Any XML editor can be used to create XDF; in fact, any text editor can be used to create or modify it.

6. Software Standard Features

XML is perhaps the most widely accepted data markup standard in the world.

7. Non-developer Software

8. User Support

User support is currently provided free of charge by the ADC.

9. Work In Progress

Java tools to read XDF and display slices through the data will be available very soon.

10. Evolution Plans

11. Documentation and Related References

All documents are available from http://tarantella.gsfc.nasa.gov/xml

A white paper on XDF (eXtensible Data Format): http://tarantella.gsfc.nasa.gov/xml/XDFwhite.txt

The eXtensible Data Format DTD: http://tarantella.gsfc.nasa.gov/xml/XDF_DTD.txt

A tree view of the XDF DTD: http://tarantella.gsfc.nasa.gov/xml/XDFhtml/DTD-TREE.html

A sample of the XDF data format: http://tarantella.gsfc.nasa.gov/xml/XDF_sample.xml

A second sample, which uses parsed and unparsed entities for the data files: http://tarantella.gsfc.nasa.gov/xml/XDF_sample2.xml

A sample that uses the ADC dataset.dtd and contains XDF for the tables: http://tarantella.gsfc.nasa.gov/xml/1005.xml

12. Other Comments


 

URL: http://ssdoo.gsfc.nasa.gov/nost/fep/developer-xdf.html

A service of NOST at NSSDC.

Author: Ed Shaya (edward.j.shaya.1@gsfc.nasa.gov) +1.301.286.6044
Curator: John Garrett (John.Garrett@gsfc.nasa.gov) +1.301.286.3575
NASA Official: Code 633.2 / Don Sawyer (Don.Sawyer@gsfc.nasa.gov) +1.301.286.2748
Last Revised: 2000-03-02, Ed Shaya (2000-03-10) John Garrett