Important Concepts from the draft ISO standard "Reference Model for an Open Archival Information System (OAIS)"


The following figures and text provide a set of terms and concepts onto which fundamental, long-term archival activities may be mapped and thus compared. "Long-term" here means long enough to be concerned about the impacts of changing technologies on the management, preservation, and distribution of archived information. This material has been adapted from the draft ISO standard entitled "Reference Model for an Open Archival Information System (OAIS)." For clarification of this material, please see the full reference model document.



Figure 1: Functional component model for an OAIS


The functional areas of Figure 1 include both the systems and the people needed to support the OAIS operation. An OAIS is an archive that meets a set of responsibilities defined in the reference model document; meeting these responsibilities is what distinguishes an OAIS archive from other uses of the term 'archive.'

Information objects (which may be any type of data), together with attributes needed for efficient ingest, archival preservation, and searching, are received from data Producers by the INGEST function using a Submission Information Package (SIP). The INGEST function validates the submission, adds supporting information as needed, ensures that the information is understandable to the designated Consumer communities, and performs any transformations needed to put the information into archival storage forms. These transformations may include reorganizing and reformatting to meet archival storage and dissemination needs. The resulting information objects are sent to ARCHIVAL STORAGE using an Archival Information Package, and search information (e.g., catalog data) used to support Consumer selection of archived data is sent to DATA MANAGEMENT as Descriptive Information.
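
The INGEST flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, field names, the "example-archive" value, and the single validation rule are all hypothetical and are not taken from the reference model, which does not prescribe any implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionInformationPackage:
    """What a Producer hands to INGEST (hypothetical fields)."""
    identifier: str
    content: bytes
    attributes: dict = field(default_factory=dict)


@dataclass
class ArchivalInformationPackage:
    """What INGEST sends to ARCHIVAL STORAGE (hypothetical fields)."""
    identifier: str
    content: bytes
    supporting_info: dict


def ingest(sip):
    """Validate a SIP and return (AIP, Descriptive Information)."""
    # Validation: this sketch only rejects empty submissions.
    if not sip.content:
        raise ValueError(f"SIP {sip.identifier!r} has no content")

    # Supporting information added by the archive (placeholder value).
    supporting_info = {"ingested_by": "example-archive"}

    # Transformation into an archival storage form. The identity
    # transformation stands in for any reorganizing or reformatting.
    archival_content = sip.content

    aip = ArchivalInformationPackage(sip.identifier, archival_content, supporting_info)

    # Search information destined for DATA MANAGEMENT.
    descriptive_info = {"identifier": sip.identifier, **sip.attributes}
    return aip, descriptive_info


sip = SubmissionInformationPackage(
    "obs-001", b"raw data", {"title": "Example observation"}
)
aip, catalog_entry = ingest(sip)
print(catalog_entry)
```

The function returns two values because the text routes the results of ingest to two different places: the package goes to ARCHIVAL STORAGE and the search information goes to DATA MANAGEMENT.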


ARCHIVAL STORAGE accepts Archival Information Packages, stores and manages them, and provides them to ACCESS and DISSEMINATION in response to requests. It also handles migrations of Archival Information Packages to new media when specialized domain oversight is not required to perform the migration.


DATA MANAGEMENT is the repository for all information used to support search aids, and for all information (outside of ARCHIVAL STORAGE) used to support the general operation of the archive. It stores all the Descriptive Information (catalog information) used to support searching and ordering. It stores all the request information generated by Consumers and by the archive in responding to requests. It stores all the information about Consumers.


ADMINISTRATION is responsible for coordinating daily operations of the archive, and for addressing the implementation of policy issues that affect multiple archival functions. In contrast, Management is a higher level function that oversees archival operations as only one of its responsibilities. ADMINISTRATION makes sure that necessary hardware and software are purchased and maintained, that security is maintained, and that the archive is using cost-effective technology and standards. It also oversees negotiations with data Producers on what is to be submitted to the archive, and it ensures that Consumers are generally satisfied with its services. It ensures that the long-term preservation function is accomplished.


Consumers interact primarily with the ACCESS and DISSEMINATION function to find and receive information objects of interest. The finding aids used are supported by the catalog data (Descriptive Information) held by DATA MANAGEMENT. Requests to ARCHIVAL STORAGE yield Archival Information Packages (AIPs), which are processed as needed by ACCESS and DISSEMINATION to complete the order. Standing orders are processed automatically as the information becomes available and meets distribution requirements. Disseminations are provided as Dissemination Information Packages (DIPs) to the Consumer using some protocol (e.g., FTP, HTTP, or tape).
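
A Consumer order, as described above, involves a catalog search followed by retrieval and repackaging. The following minimal sketch models DATA MANAGEMENT and ARCHIVAL STORAGE as plain dictionaries. The function names, the "title" field, and the shape of the DIP are assumptions made for illustration only.

```python
def search(catalog, keyword):
    """Return identifiers whose catalog title contains the keyword."""
    needle = keyword.lower()
    return [
        identifier
        for identifier, entry in catalog.items()
        if needle in entry["title"].lower()
    ]


def disseminate(identifiers, storage):
    """Fetch the requested AIPs and wrap them in a DIP (a plain dict here)."""
    return {"items": {identifier: storage[identifier] for identifier in identifiers}}


# Descriptive Information held by DATA MANAGEMENT (hypothetical entries).
catalog = {
    "obs-001": {"title": "Solar wind observation"},
    "obs-002": {"title": "Ocean temperature survey"},
}

# AIP contents held by ARCHIVAL STORAGE (hypothetical entries).
storage = {
    "obs-001": b"solar wind bytes",
    "obs-002": b"ocean temperature bytes",
}

order = search(catalog, "solar")
dip = disseminate(order, storage)
print(sorted(dip["items"]))
```

A standing order could reuse the same two functions by re-running the search whenever new Descriptive Information arrives.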


Information Models


The other major dimension for the reference model is the modeling of information in the OAIS. A general model of archival information objects is shown in Figure 2. Much more extensive modeling is contained in the full document.


Figure 2: Archival Information Package Objects

The Archival Information Package (AIP) contains two primary information objects: Content Information (CI) and Preservation Description Information (PDI). The Content Information is the primary information submitted for preservation. The scope of what constitutes this information is agreed to between the archive and the Producer. To be complete, and preservable over the long term, this information must include the associated Representation Information (or format information) that turns the Content Information bits into meaningful information.

Once the Content Information has been determined, it is possible to ask what constitutes the Preservation Description Information for that particular Content Information. The PDI includes several types of additional information that are needed to help preserve the Content Information. These are:

o Reference: How Consumers can uniquely distinguish the Content Information from any other Content Information.

o Provenance: Who has had custody of the Content Information and what its source was. This includes the processing that generated it.

o Context: How the Content Information relates to other information objects, such as why it was created and how it may be used with other information objects.

o Fixity: Information and mechanisms used to protect the Content Information from accidental change.

The PDI is needed for long-term preservation, and its completeness is a key element in determining the quality of the archival function being performed.
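
The four PDI categories listed above can be pictured as a small record attached to the Content Information. In the sketch below, the class and function names are hypothetical, and the use of a SHA-256 digest for Fixity is an assumption: the text only calls for "information and mechanisms" that protect against accidental change, without naming one.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PreservationDescriptionInformation:
    reference: str     # identifier that distinguishes this Content Information
    provenance: tuple  # custody and processing history, oldest first
    context: str       # how the content relates to other information objects
    fixity: str        # SHA-256 hex digest of the content (assumed mechanism)


def compute_fixity(content):
    """Return a SHA-256 hex digest of the Content Information bits."""
    return hashlib.sha256(content).hexdigest()


def verify_fixity(content, pdi):
    """Return True if the content still matches the recorded digest."""
    return compute_fixity(content) == pdi.fixity


content = b"example content information"
pdi = PreservationDescriptionInformation(
    reference="obs-001",
    provenance=("received from Producer", "reformatted at ingest"),
    context="part of a hypothetical observation series",
    fixity=compute_fixity(content),
)

print(verify_fixity(content, pdi))            # unchanged content
print(verify_fixity(content + b"\x00", pdi))  # content with one extra byte
```

A digest detects accidental change but does not repair it, so an archive relying on this approach would also need redundant copies from which to recover.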

Within the archive, the Content Information and Preservation Description Information need to be tracked and associated. This is done using the Packaging Information. For example, this may consist of some directory and file names, and their underlying implementations, on some medium. Alternatively, it may consist of a tar file together with some information relating the Content Information bits, their Representation Information, and the Preservation Description Information.
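
The tar-file form of Packaging Information mentioned above might look like the following sketch, which uses Python's standard tarfile module and builds the archive in memory. The member names, the JSON encoding, and the "manifest.json" file that relates the parts to one another are all assumptions made for illustration. The reference model does not specify a layout.

```python
import io
import json
import tarfile


def add_member(archive, name, data):
    """Add an in-memory byte string to an open tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))


def package_aip(content, representation_info, pdi):
    """Return the bytes of a tar file holding one AIP (hypothetical layout)."""
    # The manifest plays the role of the information that relates the
    # Content Information bits to their Representation Information and PDI.
    manifest = {
        "content": "content/data.bin",
        "representation_information": "metadata/representation.json",
        "preservation_description_information": "metadata/pdi.json",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        add_member(archive, manifest["content"], content)
        add_member(
            archive,
            manifest["representation_information"],
            json.dumps(representation_info).encode("utf-8"),
        )
        add_member(
            archive,
            manifest["preservation_description_information"],
            json.dumps(pdi).encode("utf-8"),
        )
        add_member(archive, "manifest.json", json.dumps(manifest).encode("utf-8"))
    return buffer.getvalue()


tar_bytes = package_aip(
    b"\x01\x02", {"format": "raw bytes"}, {"reference": "obs-001"}
)
with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as archive:
    print(archive.getnames())
```

Keeping the relationships in a separate manifest means the package can be interpreted without relying on naming conventions alone.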

Also associated with the Archival Information Package is the Descriptive Information. This is the information that is used to populate finding aids and is typically thought of as the catalog information. It is this information that supports Consumer searches or that triggers the dissemination of information in response to a standing order.