Archival Workshop on Ingest, Identification, and Certification Standards (AWIICS)
Draft Report
DATE: October 13-15, 1999
HOST: The National Archives and Records Administration, Archives II, 8601 Adelphi Road, College Park, MD 20740-6001
Executive Summary
The explosive growth of digital information and the need to archive these data successfully are receiving a great deal of attention from many organizations. Most of this effort has been directed toward accessing and retrieving archived data. However, successful and efficient retrieval of data from an archive requires that the data first be successfully and efficiently ingested and identified by the archive. It also requires that the archive follow appropriate policies and procedures to ensure that the information remains understandable and usable into the indefinite future. Appropriate standards can assist in meeting these objectives.

These issues, and others, have been addressed at the conceptual level in a reference model developed by a US ISO archiving group under ISO TC20/SC13 (Aircraft and space vehicles/Space data and information transfer systems) and the Consultative Committee for Space Data Systems (CCSDS). This model, called the Reference Model for an Open Archival Information System (OAIS), was undergoing formal ISO and CCSDS review at the time of AWIICS. An electronic version of the OAIS Reference Model can be found at:

The National Archives and Records Administration (NARA), the National Aeronautics and Space Administration (NASA), and ISO TC20/SC13 hosted the Archival Workshop on Ingest, Identification and Certification (AWIICS) on 13 and 14 October 1999 at the NARA Archives II facilities in College Park, MD. Based on input from the Digital Archive Directions (DADS) Workshop and further market interest surveys, the Workshop organizers suggested three primary topic areas for workshop focus and possible standardization efforts: ingest, identification, and certification.
To register your interest in supporting one or more of these efforts, please use the following forms:

Participation
The participants attending the workshop represented a wide variety of national and international organizations, including government agencies, contractors, archives, academic institutions, non-profit organizations, and vendors.
Organization
The workshop consisted of a review of papers submitted on needs for, or approaches to, standards in each of the identified areas. The primary focus was on identifying work plans for specific standards, and on gauging the level of likely participation in developing each standard.
Conclusions
A diverse community of science data centers, libraries, electronic records and traditional archives sees the benefit of, and is willing to participate in, developing standards in the following areas:
Relative to the OAIS Reference Model,
The proposed standardization efforts are nearly free of specific underlying technologies so that they are not subject to the rapid pace of technology change.
Recommendations
Pursue the execution of the standardization efforts identified. An international consortium is needed to coordinate digital preservation issues across a wide variety of organizations, including science data centers, libraries, electronic records management, traditional archives, and commercial organizations acting as both users and vendors of solutions.

Next Steps
Introduction

Purpose
The explosive growth of digital information and the need to archive these data successfully are receiving a great deal of attention from many organizations. Most of this effort has been directed toward accessing and retrieving archived data. However, successful and efficient retrieval of data from an archive requires that the data first be successfully and efficiently ingested and identified by the archive. It also requires that the archive follow appropriate policies and procedures to ensure that the information remains understandable and usable into the indefinite future. Appropriate standards can assist in meeting these objectives.

These issues, and others, have been addressed at the conceptual level in a reference model developed by a US ISO archiving group under ISO TC20/SC13 (Aircraft and space vehicles/Space data and information transfer systems) and the Consultative Committee for Space Data Systems (CCSDS). This model, called the Reference Model for an Open Archival Information System (OAIS), was undergoing formal ISO and CCSDS review at the time of AWIICS. An electronic version of the OAIS Reference Model can be found at:

The National Archives and Records Administration (NARA), the National Aeronautics and Space Administration (NASA), and ISO TC20/SC13 hosted the Archival Workshop on Ingest, Identification and Certification (AWIICS) on 13 and 14 October 1999 at the NARA Archives II facilities in College Park, MD. Based on input from the Digital Archive Directions (DADS) Workshop and further market interest surveys, the Workshop organizers suggested three primary topic areas for workshop focus and possible standardization efforts: ingest, identification, and certification.
Participation
The participants attending the workshop represented a wide variety of national and international organizations, including government agencies, contractors, archives, academic institutions, non-profit organizations, and vendors.
Organization
The workshop consisted of a review of papers submitted on needs for, or approaches to, standards in each of the identified areas. The primary focus was on identifying work plans for specific standards, and on gauging the level of likely participation in developing each standard. The agenda gives a close approximation to the actual sequence of events.
Opening Plenary
Don Sawyer provided an introduction and background for this workshop. He explained the context for this work within ISO/TC20/SC13 and CCSDS. He identified the group's major activities to date as the development of an Open Archival Information System (OAIS) Reference Model, the Digital Archive Directions (DADS) workshop held in June 1998, and now this AWIICS workshop.

He then reviewed the OAIS Reference Model, noting that it is a model as opposed to an implementation and, as such, can be used as a framework for the DADS and AWIICS activities. He itemized the functions an OAIS needs to provide and stated that descriptions of actual archives are included in annexes to the OAIS document. He then traced the operations of an OAIS from preparation of a Submission Information Package (SIP), through retention of the Archival Information Package (AIP), to the generation and distribution of a Dissemination Information Package (DIP). He commented that the Reference Model is being well received by many organizations and is being used as the basis for some of their activities.

He pointed out that the OAIS Reference Model document is currently out for international review, comment and acceptance by both the CCSDS and ISO/TC20/SC13. He indicated that the document was available on the Web at: http://wwwclassic.ccsds.org/RP9905/RP9905.html [Webmaster's Note: now at ../../wwwclassic/RP9905/RP9905.html] and asked that any review comments this group might care to make be sent to him at: donald.sawyer@gsfc.nasa.gov.

He reported that the next major effort had been the DADS workshop held in the summer of 1998, where the Reference Model had been exposed to a significant number of actual archive users, both nationally and internationally. He then itemized some of the major recommendations that came out of the DADS workshop. These included the need for an international consortium to coordinate this work, for a data ingest methodology, and for an archival accreditation method.

The opening plenary then continued with overviews by the three theme leaders and a paper that addressed items of general interest covering all three themes.
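The package flow traced in the opening plenary (SIP to AIP to DIP) can be pictured with a small sketch. The OAIS Reference Model is conceptual and prescribes no implementation, so every class, field, and method name below is a hypothetical chosen purely for illustration, and the in-memory dictionary stands in for whatever storage a real archive would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmissionPackage:
    """SIP: what a producer hands to the archive (fields are illustrative)."""
    producer: str
    content: bytes
    description: str


@dataclass(frozen=True)
class ArchivalPackage:
    """AIP: what the archive retains (fields are illustrative)."""
    aip_id: str
    producer: str
    content: bytes
    description: str


@dataclass(frozen=True)
class DisseminationPackage:
    """DIP: what the archive delivers to a consumer (fields are illustrative)."""
    aip_id: str
    content: bytes
    description: str


class Archive:
    """Toy archive holding AIPs in memory; not a real OAIS implementation."""

    def __init__(self) -> None:
        self._holdings: dict[str, ArchivalPackage] = {}
        self._counter = 0

    def ingest(self, sip: SubmissionPackage) -> str:
        # Assign an archive-local identifier and retain the content as an AIP.
        self._counter += 1
        aip_id = f"aip-{self._counter:04d}"
        self._holdings[aip_id] = ArchivalPackage(
            aip_id, sip.producer, sip.content, sip.description
        )
        return aip_id

    def disseminate(self, aip_id: str) -> DisseminationPackage:
        # Build a DIP from the retained AIP in response to a consumer request.
        aip = self._holdings[aip_id]
        return DisseminationPackage(aip.aip_id, aip.content, aip.description)


archive = Archive()
aip_id = archive.ingest(
    SubmissionPackage("Example Producer", b"sample data", "illustrative submission")
)
dip = archive.disseminate(aip_id)
print(dip.aip_id, dip.content == b"sample data")
```

The three package types are kept separate here only to mirror the distinction the model draws between what is submitted, what is retained, and what is delivered.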
IDENTIFICATION
Lou Reich (Identification Session Leader) gave a brief overview of the identification issues. He noted that an object may be identified by its attributes, its location, or its name. He also listed the types of identifiers in the OAIS Reference Model: Content Identifiers, Archive Information Package Identifiers, and Storage Identifiers. Next, he listed some organizations currently working on identification standards and named some of their projects, including the OMG CORBAmed document titled Person Identification Service (PIDS). In response to questions, he noted that:
Mike Martin mentioned "Standard Universal Product Codes (UPCs)" which are being developed for online marketing where products have to be uniquely identified across many suppliers.
CERTIFICATION
Bruce Ambacher (Certification Session Leader) gave a brief overview of the Certification issues. He defined Certification as making sure that what goes into an archive and what comes out are the same thing. In the past, the objects to be preserved were physical objects, and preserving their integrity was not a problem. However, Certification has evolved to include the qualifications for persons or institutions to operate an archive. More recently, the process and the data itself may be certified. He noted four approaches to Certification (Individual, Archival Program, Process, and Data) and then described each approach and its considerations in more detail.
Collectively the Certification process(es) ensure a high degree of confidence that the information an archives disseminates is the same as the information it ingested and preserved, with full documentation for all necessary modifications. The planning committee anticipated that the Certification track would focus on those standards and applications which would ensure that information is preserved over time. He concluded that a good procedures manual is needed to serve both as a mechanism to identify all these items and as a compliance check list.
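One narrow, mechanical part of this assurance, that the bits disseminated match the bits ingested, can be illustrated with a checksum comparison. This is a minimal sketch under stated assumptions: the workshop did not prescribe any particular technique, and the use of SHA-256 as well as the function names `record_fixity` and `verify_fixity` are illustrative choices, not part of the report or of the OAIS Reference Model.

```python
import hashlib


def record_fixity(content: bytes) -> str:
    """Compute a digest at ingest time (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def verify_fixity(content: bytes, recorded_digest: str) -> bool:
    """Recompute the digest at dissemination time and compare with the record."""
    return hashlib.sha256(content).hexdigest() == recorded_digest


ingested = b"example record"
digest = record_fixity(ingested)

print(verify_fixity(ingested, digest))           # True: content unchanged
print(verify_fixity(b"example recorD", digest))  # False: content altered
```

A matching digest covers only bit-level sameness. It says nothing about documented, intentional modifications such as format migration, or about the individual, program, and process approaches to Certification described above.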
INGEST
Don Sawyer (Ingest Session Leader) began his presentation by briefly outlining the functions assigned to Ingest within the OAIS Reference Model. The model places the Ingest function in context with the other functions to be performed and breaks Ingest down into a number of sub-functions. He felt the session should address the interactions between the archive and the producer, such as:
He provided a brief overview of several submitted papers relevant to the ingest session to show the attendees essentially what the session might be addressing.
Mike Martin's paper - The Archive Ingest Process
Don Sawyer stated that this paper addresses key functions associated with the Producer to Archive interface and looks at these primarily from a producer point of view. It offered six steps, expanded in detail based on experience with the Planetary Data System archives, as constituting the Producer view of preparing for, and interacting with, an OAIS archive. He stated rather emphatically that we CANNOT continue to simply accept what is sent to the archive. We need to levy some responsibilities on the producers themselves to facilitate the archival process.
David Holdsworth's paper - Ingest Standards (and others) in the OAIS model
Don Sawyer briefly reviewed this paper. It addressed whether the representation information is sufficient to represent the data. It also postulated the data preservation activity as having several steps:
Reagan Moore's paper - Persistent Archives for Data Collections
Don Sawyer briefly reviewed those aspects of this paper that particularly related to the ingest session. The full paper was presented in plenary by Reagan Moore because it was applicable to all the sessions. The focus was on the data and information models needed to manage and federate the collections and to migrate them forward in time. To have persistent archives he felt it was necessary to:
This latter point implies the separation of the access mechanisms from the collections. He felt ingest methodology and standards could be based on emerging digital library standards such as XML and DTDs. Proprietary formats need to be converted into open standards.
Parmesh Dwivedi's and William Callicott's paper - Archive Issues with the Evolution of Data and Information
Don Sawyer stated that this was a general interest paper addressing the history of archives, including the digital explosion that is forcing a radical change in how digital information is handled. It noted many issues and questioned whether we will ultimately be successful or will drown in a sea of bits and information. A companion presentation to this paper is Media Issues.

PLENARY PAPER PRESENTATION
Reagan Moore gave a presentation on his paper Persistent Archives for Data Collections. He noted three key items for archival strategy:
He supported the construction of a national data grid through the integration of local data caches, distributed data collections and distributed archives. His conclusions were:
Closing Plenary
This plenary was held to hear the conclusions and reports formed by the separate sessions. Volunteers, up to 2 or 3 from each working session, were identified to return the following morning to draft key elements of the workshop report. The Executive Summary conclusions, recommendations, and next steps were generated during that morning session.
Identification Session Report
Lou Reich gave the presentation on highlights from the Identification working group. Although this group did not identify a specific standard to be developed, it did identify the need to generate access scenarios and requirements that would clarify identification needs in an archival setting. This proposed effort was documented as an Identification Workpackage using the provided template. Much of the discussion in the Identification session centered on the definition of terms. It was agreed that an access focus would help draw out identification requirements. During this plenary report, one participant felt there should be globally unique names and offered to write a paper on naming rules for arriving at such unique identifiers.
Ingest Session Report
Don Sawyer gave the presentation on highlights from the Ingest working group. This group agreed that an Ingest Methodology standard would be very useful and 11 of 18 expressed an interest in supporting the effort. The group documented this proposed effort as an Ingest Workpackage using the provided template. Mike Martin's workshop paper was identified as the initial input document for this effort. The papers from David Holdsworth and Reagan Moore were also discussed with regard to their implications for the formats submitted to archives. The importance of understanding the representation information was acknowledged, although there was not sufficient time to explore the issues in more depth. Many of the archives are currently in the position of having to accept whatever format is provided. It is anticipated that the development of an ingest methodology standard would enable archives to be more proactive with Producers regarding acceptable data models for their submissions. In response to an Ingest session question about how an implementation of the OAIS concept of an Archival Information Unit might look, Don Sawyer's presentation also provided a view of such a unit that the National Space Science Data Center is currently adopting.
Certification Session Report
Bruce Ambacher gave the presentation on highlights from the Certification working group. This group agreed that a standard Certification Checklist for Archives should be developed and 7 of 20 expressed an interest in supporting the effort. Following the workshop, the session chair and his backup (Ben Kobler) documented this proposed effort as a Certification Workpackage using the provided template. The closing plenary report was identified as the initial input document for this effort. The group concluded that the checklist should address qualitative issues, quantitative approaches, and metrics. It could be used in peer review, by management seeking to allocate resources effectively, and by an external body in evaluating archives.
Working Groups
Submitted Papers
Participants
URL: http://ssdoo.gsfc.nasa.gov/nost/isoas/awiics/report.html
A service of NOST at NSSDC. Access statistics for this web are available. Comments and suggestions are always welcome.
Author: Archival Workshop Program Committee (archive_standards@nssdc.gsfc.nasa.gov) +1.301.286.3575
Curator: John Garrett (John.Garrett@gsfc.nasa.gov) +1.301.286.3575
Responsible Official: Code 633.2 / Don Sawyer (Donald.Sawyer@gsfc.nasa.gov) +1.301.286.2748
Last Revised: 1999-12-02, Don Sawyer (2004-04-20, John Garrett)