XML at ADC: Steps to a Next-Generation Data Archive

Volume 15, Number 1, March-June 1999

By Edward Shaya

The eXtensible Markup Language (XML) is a document markup language for the creation of hierarchical information structures. It allows the creator of a document type to specify requirements on a document's content and to provide choices for the attributes of that content. It also supports automatic checking of documents for structural validity. XML is supported by nearly every major corporate software developer worldwide, and a rich array of tools is now available to help process and display XML documents. In addition, a whole new class of scripting languages is being written in XML to make programming much easier. These include languages for building GUIs, creating Web forms, interfacing with databases, examining decision trees, and describing Web sites. XML was featured in the May 1999 issue of Scientific American.

The SSDOO's Astronomical Data Center (ADC, http://adc.gsfc.nasa.gov) is developing an XML toolbox for the import, enhancement, and distribution of data and metadata documents written in XML. Work has begun on a Document Type Definition (DTD, at http://glissando.gsfc.nasa.gov/xml/acatalog.dtd) that specifies the elements of content and their attributes in ADC metadata documents. This project attempts to define both the metadata of an astronomical catalog and the XML format for an astronomical table.
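
As a rough illustration of a single document that carries both catalog metadata and table data, the Python sketch below builds one with the standard xml.etree.ElementTree module. The element and attribute names (catalog, metadata, title, table, field, row, value) are invented for this example and are not taken from the ADC DTD; the function name build_catalog is likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def build_catalog(title, columns, rows):
    """Build a small catalog document: metadata first, then the table.

    All element names here are hypothetical, not those of the ADC DTD.
    """
    catalog = ET.Element("catalog")

    # Metadata section: descriptive information about the catalog.
    meta = ET.SubElement(catalog, "metadata")
    ET.SubElement(meta, "title").text = title

    # Table section: one <field> per column, then one <row> per record.
    table = ET.SubElement(catalog, "table")
    for name, unit in columns:
        ET.SubElement(table, "field", name=name, unit=unit)
    for row in rows:
        row_el = ET.SubElement(table, "row")
        for (name, _unit), value in zip(columns, row):
            ET.SubElement(row_el, "value", field=name).text = str(value)
    return catalog


catalog = build_catalog(
    "Example star list",
    [("ra", "deg"), ("dec", "deg")],
    [(10.5, -3.25), (187.7, 12.39)],
)
print(ET.tostring(catalog, encoding="unicode"))
```

Keeping the column descriptions in the same document as the rows means a reader of the file needs no outside information to interpret the values.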

The ADC is designing automated pipelines that will carry data from authors and journal presses into the XML archive, and is designing data retrieval through the Web via the XML Query Language. The documentation for each data set will be viewable in several different styles via eXtensible Stylesheet Language (XSL) scripts.

The legacy data in plain ASCII format are being converted to XML by describing the details of each format in "to XML", a new scripting language that was developed at the ADC. A processor then parses the "to XML" commands and performs the generalized transformation.
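
The general idea of a format description driving a generic converter can be sketched in Python as follows. This is not the ADC's scripting language or processor, whose syntax is not given here. The FORMAT list, the field names, the column positions, and the sample records are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical format description for a fixed-width ASCII table:
# (field name, start column, end column), 0-based, end exclusive.
FORMAT = [("name", 0, 8), ("ra", 8, 17), ("dec", 17, 26)]


def ascii_to_xml(lines, fmt):
    """Convert fixed-width text lines to an XML table using a format description."""
    table = ET.Element("table")
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        row = ET.SubElement(table, "row")
        for field, start, end in fmt:
            # Slice the column range and drop the padding spaces.
            ET.SubElement(row, field).text = line[start:end].strip()
    return table


# Made-up sample records in the layout described by FORMAT.
legacy = [
    "HD 1234  10.50000 -3.25000",
    "HD 5678 187.70000 12.39000",
]

table = ascii_to_xml(legacy, FORMAT)
print(ET.tostring(table, encoding="unicode"))
```

Because the converter reads the layout from the description rather than hard-coding it, a new legacy format needs only a new description, not a new program.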

When completed, the catalogs and journal tables at the ADC repository will be tightly hyperlinked to enhance data discovery. In addition, one will be able to search on any combination of metadata elements to return the appropriate catalog name or table name or data value, depending on the specifics of the query.
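
A search on a combination of metadata elements might look something like the Python sketch below. It uses the standard xml.etree.ElementTree module rather than the XML Query Language, and the archive contents, the element names (archive, catalog, metadata, keyword, wavelength), and the function find_catalogs are all hypothetical.

```python
import xml.etree.ElementTree as ET

# A made-up miniature archive; names and values are for illustration only.
ARCHIVE = """
<archive>
  <catalog name="A">
    <metadata><keyword>galaxies</keyword><wavelength>optical</wavelength></metadata>
  </catalog>
  <catalog name="B">
    <metadata><keyword>stars</keyword><wavelength>optical</wavelength></metadata>
  </catalog>
  <catalog name="C">
    <metadata><keyword>galaxies</keyword><wavelength>radio</wavelength></metadata>
  </catalog>
</archive>
"""


def find_catalogs(root, **criteria):
    """Return names of catalogs whose metadata match every given element=value pair."""
    matches = []
    for cat in root.findall("catalog"):
        meta = cat.find("metadata")
        if meta is None:
            continue
        # Every criterion must be met by at least one metadata element with that tag.
        if all(
            any(el.text == value for el in meta.findall(tag))
            for tag, value in criteria.items()
        ):
            matches.append(cat.get("name"))
    return matches


root = ET.fromstring(ARCHIVE)
print(find_catalogs(root, keyword="galaxies"))
print(find_catalogs(root, keyword="galaxies", wavelength="radio"))
```

Adding criteria narrows the result, which mirrors searching on any combination of metadata elements.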

Editor: Miranda Beall
Programmer: Erin Gardner
Responsible Official: Dr. Joseph H. King, Code 633
Last Revised: 11 June 1999 [LAP]