Archival Workshop on Ingest, Identification, and Certification Standards (AWIICS)
DATE: October 13-15, 1999
HOST: The National Archives and Records Administration
8601 Adelphi Road
College Park, MD 20740-6001
Ingest Group Notes
Don Sawyer convened the Ingest Working Group of AWIICS at 1430 hours on Oct. 13. The members of the group were:
Mike Martin works on the Planetary Data System (PDS).
There are three data systems.
Cassini is an example of a large project which will generate large amounts of data. The developers of Cassini are not concerned with archiving issues, which is a common theme we all deal with.
It's a concern of PDS to have an interface with these producers of data. There are processes to streamline these interfaces with smaller producers.
This talk is about the methodology PDS uses to deal with the projects that will produce this data.
The methodology is taken from the PDS Data Preparation Workbook.
Steve Marley discussed data access patterns that are orthogonal with data ingestion patterns. There was considerable discussion about this problem and recognition that it may be necessary to have the data organized in multiple ways if performance is an issue.
Mike has the same problem when he needs to collate data from multiple instruments so users can see the data overlaid.
You can't expect the data producers to do much more than guarantee that the data gets to the archive; the archive is then responsible for transforming the data into a usable format. This appears to be the norm.
Steve Marley: At some level you need to realize that you are in a domain, and you need to optimize for that domain.
William Callicot: There can be a huge cost in unanticipated patterns of data usage.
Oya Rieger: At Cornell the promise is to collect a very rich image collection, and to convert 'on the fly' to other formats as needed.
If there are multiple orthogonal access patterns that are time critical, multiple copies must be stored.
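The point about multiple copies can be illustrated with a minimal sketch. It is not taken from PDS or from any system discussed at the workshop: the record layout, the sample values, and the `build_index` helper are all hypothetical. The same records are organized twice, once per access pattern, so that each kind of lookup avoids scanning the whole collection.

```python
from collections import defaultdict

# Hypothetical records: (observation time, instrument name, product id).
# These values are invented for illustration only.
records = [
    ("1999-10-13T14:30", "camera", "img-001"),
    ("1999-10-13T14:30", "spectrometer", "spec-001"),
    ("1999-10-14T09:00", "camera", "img-002"),
]

def build_index(records, key_position):
    """Group records by the field found at key_position."""
    index = defaultdict(list)
    for record in records:
        index[record[key_position]].append(record)
    return dict(index)

# One organization per access pattern: the price is extra storage,
# the benefit is a direct lookup for each pattern.
by_time = build_index(records, 0)
by_instrument = build_index(records, 1)
```

Here `by_time` serves users who want everything observed at one moment (the overlay case), while `by_instrument` serves users who follow a single instrument. Whether the duplication is worth it depends on how time critical each pattern is.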
Enforcing standards can drive away your data providers. A waiver should be possible but difficult to obtain.
Where is the integrity of the data ensured?
John Stegenga: In working with a small-time publisher, the rigor doesn't seem to be as necessary as in a large bureaucracy.
The more proactive we are when the concept for the data is being created, the easier it is to involve ourselves with archival concerns. From a commercial view, we are committed to capturing the data as submitted by the publisher. If we notice errors, we may point them out and ask that the publisher re-submit.
Don: What are your biggest problems with the ingest process?
John S.: Negotiation with the publisher. We need to be very wary of copyrights and creative integrity. We spend a long time negotiating and then an equally long time negotiating on a technical level.
Don: So then, formalizing these procedures may shorten the time spent in negotiation?
John S.: Yes.
Jane C.: What we receive is a deliverable on a product or project, and the contractor gets paid whether the SIP is good or not, as long as the missile works.
ISO 11179 was mentioned. This is a metadata standard for documenting data elements and for registering them.
Jane is more concerned with metadata aspects. Realistically not every archive has the clout to enforce standards.
A research library's interest is in longevity, whereas a publisher is interested in shorter-term concerns.
(Action Item) Mike Martin takes an action item to provide the PDS manual via URL.
You've got to show the data providers that you are providing a service that either saves them money, or helps them meet contractual obligations.
Don: Are you interested in a standard along the lines of what Mike described if it were properly generalized to accommodate the library services as well as small archives?
EPA online has a software package you can download to create a metadata registry.
What is the minimum set of metadata? Could this be standardized?
The web has impacted the relationship between the information provider and the archivist. There used to be a professional that mediated the transfer. Now there is a web service that provides the archivist function, except the web service doesn't completely fill the role.
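If a minimum set were agreed on, an archive could check submissions against it automatically at ingest. The sketch below only illustrates that idea: the group did not define a minimum set, so the field names in `REQUIRED_FIELDS` and the `missing_metadata` helper are hypothetical placeholders.

```python
# Hypothetical minimum set. The workshop left this question open,
# so these field names are placeholders, not an agreed standard.
REQUIRED_FIELDS = {"identifier", "title", "producer", "date_created", "format"}

def missing_metadata(submission):
    """Return the sorted names of required fields that are absent or empty."""
    return sorted(
        field for field in REQUIRED_FIELDS if not submission.get(field)
    )

# Example submission with only two of the placeholder fields filled in.
incomplete = {"identifier": "pkg-0001", "title": "Sample data set"}
problems = missing_metadata(incomplete)
```

An archive could return a list such as `problems` to the producer and ask for a re-submission, in the spirit of the practice described earlier in these notes.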
Is there a scenario where the producer is responsible to guarantee the longevity, and the 'archive' is really just a data store?
(Additional Notes are To Be Supplied)
A service of NOST at NSSDC. Access statistics for this web are available. Comments and suggestions are always welcome.
Author: Archival Workshop Program Committee (firstname.lastname@example.org) +1.301.286.3575
Curator: John Garrett (John.Garrett@gsfc.nasa.gov) +1.301.286.3575
Responsible Official: Code 633.2 / Don Sawyer (Donald.Sawyer@gsfc.nasa.gov) +1.301.286.2748
Last Revised: 31 October 1999, Don Sawyer