This issue of the NASA/Science Office of Standards and Technology (NOST) News focuses on metadata registries, and in particular on the recent Joint Workshop on Metadata Registries held at the Lawrence Berkeley National Laboratory (U.S. Department of Energy) in Berkeley, California, on July 8-11, 1997. This article offers some perspectives on the evolving concepts of metadata registries and their growing importance to information management and processing, and highlights their potential value to the NASA science enterprises.
While the phrase "metadata registry" appears relatively new, there are at least three prime examples of existing metadata registries within the NASA environment. One is the Control Authority office (operated by NOST in accordance with Consultative Committee for Space Data Systems [CCSDS]/International Organization for Standardization [ISO] standards) at the National Space Science Data Center (NSSDC), together with its sibling offices at the Jet Propulsion Laboratory (JPL) and within the Upper Atmosphere Research Satellite (UARS) project, which register descriptions of types of data products and objects to improve their understanding and long-term usage. Another is the NASA Master Directory, which registers summary attributes of data products to improve their accessibility to diverse communities. The third is the Planetary Data System (PDS) Data Dictionary, which registers definitions of data elements to support a consistent set of data product semantics for increased automation and long-term understanding. Each of these has a different scope and a different degree of formalism in the procedures that govern registry operations.
Given this background, the announcement of a workshop devoted to metadata registries was of considerable interest to NOST personnel. Staff viewed it primarily as an opportunity to find out what others outside the agency were planning and implementing, and they were not disappointed.
Briefly, the Metadata Registries Workshop was precipitated by the desire of ISO to improve the reuse of data elements used in the definition of various standards. It was recognized that many standards groups were reinventing such data elements in unnecessarily unique ways, and that this was making it difficult to relate the standards and improve interoperability of implementations. However, the workshop sponsors (U.S. Environmental Protection Agency, Online Computer Library Center, and the Metadata Coalition) recognized that metadata registration had a much broader constituency, and so the workshop announcement solicited and obtained broad participation. Not only were many ISO and American National Standards Institute (ANSI) committees represented, including ISO JTC1, TC 20, X3L8, X3T2, Standard for the Exchange of Product Model Data (STEP)/Product Data Exchange Using STEP (PDES), and Computer Aided Software and Systems Engineering (CASE) Data Interchange Format (CDIF), but there were also representatives from digital libraries, academia, tool vendors, and Internet Web standards development. A number of demonstrations were given, and there were special presentations on related topics such as the Common Object Request Broker Architecture (CORBA) and both Microsoft's and Unisys's repository technologies.
A complete description of workshop activities and topics discussed is beyond the scope of this article. However, more information can be obtained starting from the workshop home page. In addition, there is a draft Workshop Report available. The remainder of this article will provide some highlights from the report, offer some perspectives for the future, and highlight the potential value to the science enterprises.
The Metadata Registries Workshop Report provides a number of general findings, principles, resolutions, and advice.
A metadata registry is said to be a "formal system that records the semantics, structure, and interchange formats of any type of data" (e.g., data found in data bases, messages, documents, and other applications). Also, one "essential characteristic is the existence of a formal Authority agency that manages development and evolution of the registry, and is responsible for policies pertaining to contents and operation of the registry."
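The definition above can be made concrete with a small sketch. The following Python fragment is only an illustration and is not taken from the Workshop Report: the class names, the field names, and the rule that only the Authority may add entries are all assumptions chosen to mirror the quoted wording (semantics, structure, interchange formats, and a managing Authority).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataEntry:
    """One registered description of a type of data (hypothetical fields)."""
    name: str
    semantics: str           # what the data means
    structure: str           # how the data is organized
    interchange_format: str  # how the data is exchanged


@dataclass
class MetadataRegistry:
    """A toy registry managed by a single named Authority (an assumption)."""
    authority: str
    entries: dict = field(default_factory=dict)

    def register(self, entry: MetadataEntry, submitted_by: str) -> None:
        # Illustrative policy only: the Authority alone may add entries,
        # and each name may be registered once.
        if submitted_by != self.authority:
            raise PermissionError("only the registry Authority may register entries")
        if entry.name in self.entries:
            raise ValueError(f"{entry.name!r} is already registered")
        self.entries[entry.name] = entry

    def lookup(self, name: str) -> MetadataEntry:
        # Raises KeyError if nothing is registered under this name.
        return self.entries[name]
```

A real registry would add versioning, review procedures, and persistent storage; the sketch shows only that a registry pairs recorded descriptions with an accountable Authority.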
A stated principle is that "it is desirable that metadata elements sets be registered in a formal metadata registry according to explicit publicly available standards for specifying metadata elements." A specific example of such a standard, much in evidence at the workshop, is ISO 11179. More will be said about this below in the perspectives for the future.
As additional principles it is stated that an active working relationship should be established among the groups that are currently standardizing metadata exchange; that registry standards should be aligned with Web standards such as the emerging Resource Description Framework; that interfaces to registries should be developed that accommodate multiple purposes; that metadata registries need shared, extensible categorization schemes for their metadata; and that all means possible should be used to encourage the active participation and integration of all efforts addressing metadata registries.
The report also recommends that there be further standardization in the operations of metadata registries, that a group should address the identification and analysis of existing categorization schemes for metadata, and that interested parties should work together to establish a Web site at which information on metadata frameworks and efforts may be collected and presented.
Perspectives on the Future
It seems clear that metadata registries will continue to grow in importance and serve a variety of functions that are all grounded in promoting reuse of the metadata they contain. Further, the evolving Web services will be a primary vehicle for metadata finding and exchange and therefore will play a significant role in shaping many metadata-related standards.
One type of metadata registry that NOST believes can significantly reduce data design and data handling costs, while increasing the science return from NASA missions, is the registration of data elements. Within the science disciplines, the PDS has been the clear leader in adopting data element definition and registration. Outside NASA, the medical community has been very active in developing a common Data Element Dictionary (DED). With the emergence of ISO 11179, which provides an internationally recognized framework for specifying the definitions of data elements, including allowed values and ranges, other agencies are beginning to require that their data suppliers conform to their data element definitions. Within the international space communities, CCSDS Panel 2 has been attempting to encourage broad DED usage by the science and engineering disciplines through its work on the draft Data Element Dictionary Specification Language (DEDSL).
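To illustrate what a data element definition with allowed values and ranges can buy a data system, here is a minimal Python sketch. It does not reproduce the ISO 11179 data model, the PDS Data Dictionary, or DEDSL; the DataElement class, its attributes, and the two example elements are hypothetical and exist only to show automated checking of values against a registered definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElement:
    """A hypothetical data element definition, not an ISO 11179 structure."""
    name: str
    definition: str
    allowed_values: Optional[frozenset] = None  # enumerated legal values, if any
    value_range: Optional[tuple] = None         # inclusive (minimum, maximum), if any

    def is_valid(self, value) -> bool:
        # A value must satisfy every constraint the definition declares.
        if self.allowed_values is not None and value not in self.allowed_values:
            return False
        if self.value_range is not None:
            low, high = self.value_range
            if not (low <= value <= high):
                return False
        return True


# Hypothetical elements, invented for this example.
latitude = DataElement(
    name="LATITUDE",
    definition="Latitude of the observation, in degrees",
    value_range=(-90.0, 90.0),
)
quality_flag = DataElement(
    name="QUALITY_FLAG",
    definition="Coarse quality rating assigned to a data product",
    allowed_values=frozenset({"GOOD", "SUSPECT", "BAD"}),
)

print(latitude.is_valid(45.0))           # within the declared range
print(latitude.is_valid(120.0))          # outside the declared range
print(quality_flag.is_valid("SUSPECT"))  # one of the enumerated values
```

Because the constraints live with the definition rather than in each program that reads the data, every supplier and consumer that shares the definition applies the same checks.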
Potential Value to the NASA Science Enterprises
NOST feels that the time is right for the NASA science enterprises to make a coordinated effort to embrace data element definition and registration. ISO 11179 should be evaluated, along with the CCSDS/ISO draft DEDSL and existing DED implementations such as the PDS DED and the Federal Geographic Data Committee content standard, to reach both cross-enterprise and within-enterprise recommendations on how to proceed with more effective DED-based approaches to data engineering and data handling.
Metadata registries are going to become significantly more important in the next few years. They will play a variety of roles and should be considered more seriously across the enterprises.
Readers who agree and have approaches that could aid this evolution, or who have concerns or disagreements with these views, should contact the staff at the NOST office.
NOST, GSFC Code 633.2, Greenbelt, MD 20771
Telephone: John Garrett, +1.301.286.3575
Telephone: Donald Sawyer, +1.301.286.2748
Erin D. Gardner, firstname.lastname@example.org, (301) 286-0163
Hughes STX, Code 633, NASA Goddard Space Flight Center
Greenbelt, MD 20771, U.S.A.