ISO Archiving Standards - First US Workshop - Minutes

University of Maryland Conference Center
College Park, Maryland

October 11-12, 1995


(NOTE: We invite all participants to critique these minutes and to offer updates on any significant points they feel are missing or inadequately reflected.)

Wednesday, 11 October, 1995

The workshop was called to order by the chair, Don Sawyer. Don welcomed the participants and expressed his pleasure at the enthusiastic participation (50+) given the short notice available. Many had to be turned away because of space limitations. He was unsure how far this effort could go, but given the initial response he felt it was off to a great start!

Don introduced himself as a member of the NASA Goddard Space Flight Center and as leading a standards effort called the NASA/Science Office of Standards and Technology (NOST) under the National Space Science Data Center (NSSDC). NSSDC is an organization responsible for the archiving and dissemination of information obtained from space-related scientific investigations covering a range of scientific disciplines. Don stated that his formal education was in cosmic ray physics, but that he moved into the development of data systems and over the last 10 years into standards that facilitate information interchange. He has been participating in the Consultative Committee for Space Data Systems (CCSDS) which was formed in 1982 to improve the ability of the space agencies to interoperate. He noted that CCSDS has developed a number of standards, particularly in the areas of space-ground communications under panel 1 and information interchange under panel 2. A few years ago, CCSDS established a formal relationship with ISO Technical Committee 20, and this led to the formation of Sub-Committee 13. Since then, a number of CCSDS standards have been advanced to Sub-Committee 13 and have then gone on to become ISO standards as well as CCSDS standards.

Recently, Sub-Committee 13 received an archiving standards task that was transferred from Sub-Committee 14. This was handed off to CCSDS Panel 2 and Don agreed to lead the CCSDS Archiving Work Package. It is his view that unless the archiving effort receives broad participation, the resulting ISO standards are unlikely to be very effective. With this in mind, it has been his objective to get broad US participation in this effort. He noted that this workshop is just one step in the process and despite the great response, there are surely many others who will want to get involved.

Don introduced the program committee which, in order to get moving quickly, consisted only of himself and two support contractors: Lou Reich, with the Computer Sciences Corporation, and John Garrett, with the Hughes STX Corporation. The participants then introduced themselves.

After addressing logistics matters, the agenda was reviewed and accepted as follows:

            1995 US Workshop on ISO Archiving Standards 

           University of Maryland Conference Center
                     College Park, Maryland

                      October 11-12, 1995

------------------------------------------------------------------

Wednesday October 11, 1995

8:30
     On-Site Registration
9:00
     Welcome and Meeting Logistics
     Don Sawyer/NASA
9:15
     Review and Possible Update of Proposed Agenda
     Don Sawyer/NASA
9:30
     Overview of ISO archiving standards effort including phased
     implementation approach and current schedule
     Don Sawyer/NASA
10:00-10:15
     Break
10:15
     "Draft Archiving Reference Model"
     Lou Reich/CSC
10:45
     Preliminary Discussions on Archiving Reference Model
11:00
     Donald Strebel/Versar, Inc.
     "Metadata Standards for Scientific Data Archiving and
      Publication"
11:15
     Steve Louis/Lawrence Livermore National Lab
     "ISO RM-ODP Standards"
11:30
     Steve Louis/Lawrence Livermore National Lab
     "Adding Policy to the Draft Archiving Reference Model"
11:45
     Blanche Meeson/NASA GSFC Code 902.2
     "The Publication Analogy: A Conceptual Framework for
     Scientific Information Systems"

12:00-1:00
     Lunch

1:00
     David Winfree/CIESIN
     "Distributed, Seamless Interface Access to On-line Data and
      Metadata Across Heterogeneous Archives"
1:15
     Dave Skinner/Fujitsu
     "Archives in Relation to the IEEE Open Storage Systems
      Interconnect (OSSI) Reference Model"
1:30
     Russell Davis/Boeing IS
     "Archive Integrity"
1:45
     Joel Williams/Mitre Corporation
     "Strawman Standard for Tape File Storage"
2:00
     Mike Martin/Jet Propulsion Laboratory
     "Planetary Data Systems Archive Standards"
2:15
     Ronald Burns or Tim Daniel/Central Imagery Office
     "CIO's Common Imagery Interoperability Facility"
2:30
     Break
2:45
     Thomas McGlynn/NASA GSFC Code 668.1
     "The HEASARC's Generic Archive/Catalog Protocols"
3:00
     Alex Szalay/Johns Hopkins University
     "The Archival System for the Sloan Digital Sky Survey"
3:15
     John Rainey/Teledyne Brown Engineering
     "Relating the BMDO Missile Defense Data Center to the
      Reference Model"
3:30
     Ted Meyer/NASA GSFC Code 505
     "The EOSDIS Archive Architecture"
3:45
     Break
4:00
     Identify 2 or 3 subgroups
     Determine subgroup logistics for following day
     Assign work items to subgroups

5:00
     Break for dinner
6:30
     Dinner    (no host at 94th Aerosquadron restaurant)

Thursday October 12, 1995

8:30
     Day II Kick-off
     logistics; announcements; etc.
     Don Sawyer/NASA
9:00
     Subgroup Breakout Sessions

12:00-1:00
     Lunch

1:00
     Continue Subgroup Breakout Sessions
3:00
     Reports Back From Subgroups
4:00
     Develop Input to International Meeting
     develop consensus on US contribution
5:00
     Call to Action
     Don Sawyer
     plans for future modes of communication;
     working group sign-ups;
     plans for subsequent meetings;
     commitments for follow-up activities.
(NOTE: The WEB version of these minutes will link the presentation titles to those presentations made available in electronic form. Workshop participants may receive hardcopy as well.)

The presentations were given approximately as scheduled and there were many comments to the effect that a lot of useful information was provided. However, it was also noted that the lack of a common framework for archives made it difficult to fully interrelate some of the presentations.

A few issues were raised by some of the presentations.

1. There was considerable discussion as to whether the "Publication Analogy", presented by Blanche Meeson, applied to all disciplines. Bob Hanisch thought it did not apply well to the way astronomers work.

2. A couple of the presentations included attempts to map their systems against the draft reference model. These presentations were "CIO's Common Imagery Interoperability Facility" by Ronald Burns and "Relating the BMDO Missile Defense Data Center to the Reference Model" by John Rainey. While there was considerable alignment, there were also inconsistencies. These need to be investigated further.

Following the presentations and an extended break, the participants reconvened to consider how to break into subgroups. It was proposed, and agreed, to break into the following groups:

A. Reference Model services, interfaces, distributed systems, policy (Volunteer facilitator: Lou Reich)

B. Vocabulary, data modeling, and policy/operations modeling (Volunteer facilitator: Don Sawyer)

C. Media, hardware, software (Volunteer facilitator: Joel Williams)

A show of hands indicated approximately equal numbers volunteering for each group. However, it was decided to wait until the morning plenary to firm up participation by sub-group.

Thursday, 12 October 1995

The plenary session was called to order by Don Sawyer. The general charges to each sub-group included the following:

o Enumerate the standards in use in your archives

o Everyone should have the opportunity to describe their key issues relating to this effort

o Be sure to address the assigned topics, but you are not limited to these topics only

o Ask if your group feels this archiving standards effort should continue

o Consider what should be the result of this effort after three years.

Given that the schedule in Don Sawyer's presentation called for a Reference Model within three years and the beginning of an effort to map standards to the Reference Model interfaces, address:

- Impact of the Reference Model

- Impact of mapping standards to the interfaces

- Others?

o What should be the scope and applicability of the Reference Model?

o Prepare sub-group reports that address:

- Responses to the general charges above

- Issues raised

- Agreements reached

Before breaking into the sub-groups, Don addressed the issue of the level of future participation. A form had been prepared on which the participants could indicate the level at which they, and their organizations, were most likely to participate over the next year. These forms were to be returned prior to participant departure.

Sub-Group Minutes

A. Reference Model Session

Participants

Bruce Ambacher - National Archives, Elise Blaese - Hughes STX, Ronald Burns - Central Imagery Office (CIO), Sherri Calvo - NASA/GSFC, Randal Davis - Univ. of Colorado, Russell Davis - Boeing, Mark Demulder - USGS, Joseph King - NASA/GSFC, Steve Louis - Lawrence Livermore National Laboratory, John Rainey - Teledyne Brown Engineering, Louis Reich - Computer Sciences Corporation, Tim Rykowski - NASA/GSFC, Paul Singley - Oak Ridge National Laboratory, Michael VanSteenburg - NASA/GSFC, David Winfree - CIESIN

What is an archive?

Reich: Some specific issues to consider:

o Does archiving include data that is changing (being added to or refined) or is it restricted to data that doesn't change except under special circumstances?

o Does an archive have to be open to all potential users or can access be restricted to a specific group of users?

o Does archiving encompass the routine generation of derived data products (as will be performed by the EOS DAACs) and the on-demand generation of specific data products to meet the needs of a particular user?

o What about non-digital data (for example film)? What about physical items like samples and specimens that don't meet the normal definition of data?

VanSteenburg: An archive should be considered to be distinct from the processing that goes into generating the information stored in the archive and shouldn't include the processing required to generate new products from archived products.

King: Agreed that value-added processing is not part of archiving. But on-demand creation of higher-level data products from lower-level products may be the most efficient way for an archive to meet users' needs (reflecting the trade-off between costs of data storage and processing), hence we shouldn't preclude it.

VanSteenburg: This raises the issue that the archivists may need to keep and maintain software as well as just data, which is very difficult to do. It's difficult to keep software running for long periods of time, since hardware platforms, operating systems, and languages will all continue to change significantly. It also means keeping sufficient staff expertise over long periods of time to generate new products (since some judgment about the correctness of those products must be made). In general, you can't archive expertise and archivists shouldn't need to be expert in a data set to preserve it.

Randal Davis: What's the difference between an archive and a digital library?

Ambacher: The archive is the original material, established for purposes of preservation. An archive is frequently associated administratively with the organization that was responsible for generating the data. A digital library concentrates on distribution: what it stores need not be the originals and it may be administratively separate from the organization that generated the data.

Reich: What about product history - will archivists and users need to know everything that has gone into making a dataset? How would we represent such a history? Where does product history fit into the reference model?

Louis: Product history doesn't require its own functional box in the model. Product history is a particular kind of metadata, and metadata is already represented in the reference model. The rules regarding how much processing history to keep, and how to keep it, can be governed by policies set up by the archive administrators.

Burns: There are two key aspects to an archive that are different from other data systems: data storage (in terms of its volume, longevity, etc.) and the need for metadata.

Rainey: Transaction management is not an archive problem - version management is.

Louis: A key aspect of an archive is permanence: the archiving process should be thought of as preserving data forever.

Winfree: There can be different levels of archives: national, project, etc. Some archives may be more permanent than others, so to say that an archive is forever is not always true nor should it be a necessary condition.

Reich: When defining the concept of archiving, we can say that the quality of data provided and the levels and kinds of services provided are more relevant than the longevity of the archive.

Louis: Who exactly is going to be required to conform to an internationally accepted standard on archives?

In the discussion that followed this question, it was generally agreed that sponsor organizations - governmental and private - that wanted to assure that the proper quality of products and services was being provided and that their money was being used wisely would want standards against which to measure their archives. Organizations and individuals producing relatively small collections of data to be available over the internet would probably find such standards to be burdensome and unnecessary.

Reich: The difference between a data warehouse - for example electronic business records kept by a commercial company for long periods of time - and an archive is openness: an archive is to serve a more general need and a larger community. We cannot predict exactly the uses for data in an archive; hence the role of metadata is key to an archive, while there may be only minimal metadata needed for a data warehouse.

King: An archive holds data that is correctly and independently usable by its intended user community when withdrawn. This implies the existence of metadata that can be used to interpret the form and content of the data.

Louis: Does access to an archive always mean access using the metadata?

In the discussion that followed this question, many panel members (though not all) believed the answer to this question was yes: users would always need to first use metadata to locate and retrieve the data of interest. Appropriate metadata would always be bundled with data withdrawn from an archive.

Burns: Many of the terms that we use in association with archiving - like catalog, directory and inventory - all refer to metadata. A catalog is a structured collection of metadata.

Louis: To make the archive more robust - for example, to allow for changes in the storage medium or location - the data storage component should be abstracted and hidden from users. Although some users may know how to go in and directly access some piece of data, users would almost always be directed to the data they want by first perusing the metadata.
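
The abstraction Louis describes can be illustrated with a minimal sketch. It is not drawn from any system presented at the workshop; the class name Archive, its methods, and the location labels such as "tape-07" are all hypothetical. Users search and retrieve by metadata only, so data can be moved to a new medium without changing what they see.

```python
class Archive:
    """Toy archive: users see metadata; storage locations stay internal."""

    def __init__(self):
        self._catalog = {}    # item id -> metadata dict (searchable by users)
        self._locations = {}  # item id -> current storage location (hidden)
        self._storage = {}    # storage location -> stored bytes

    def add(self, item_id, metadata, data, location):
        self._catalog[item_id] = dict(metadata)
        self._locations[item_id] = location
        self._storage[location] = data

    def search(self, **criteria):
        """Return ids of items whose metadata matches every criterion."""
        return sorted(
            item_id
            for item_id, meta in self._catalog.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        )

    def retrieve(self, item_id):
        """Return the data bundled with its metadata."""
        return dict(self._catalog[item_id]), self._storage[self._locations[item_id]]

    def migrate(self, item_id, new_location):
        """Move data to a new medium or location without changing the user view."""
        old_location = self._locations[item_id]
        self._storage[new_location] = self._storage.pop(old_location)
        self._locations[item_id] = new_location
```

In this sketch a call such as migrate("img-1", "cdrom-12") changes only the internal location table; search and retrieve return the same results before and after.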

Burns: Do we restrict an archive to digital data only?

King: There are three classes of data that we must account for: replicable digital data; replicable analog data (like film); and non-replicable material (samples, etc.).

Randal Davis: Archiving initially meant written material, or by analogy replicable materials; this leaves out non-replicable material like samples from the scope of an archive.

In the discussion that followed, it was generally agreed that our use of the term archive should be restricted to replicable digital data. Information in an archive could describe and point to other material, like samples. Usually digital material is replicated by an archive and then distributed. A library can provide a user with its only copy of data; an archive would not.

Reich: So what are our goals in developing archive standards and a reference model?

In the discussion that followed, the following were identified:

o Promote more and better use of archives through uniformity of interfaces.

o Help assure the quality of an archive.

Winfree: All the functions identified by the reference model aren't necessarily done in one place. For example, there may be multiple storage sites (with or without redundancy of data access) and metadata may not be colocated with data.

King: There are two distinct forms that a distributed system might take: a single archive composed of federated elements; and a federation of archives. An example of the latter is a Master Directory that might have metadata that points users to other archives.

In the ensuing discussion, it was generally agreed that just having metadata is not enough to make an archive: it is necessary to also have the data. An archive should have all of the components of the reference model, possibly distributed, but under a single management/stewardship domain. That tends to rule out a federation of archives if each archive is under separate management.

Break for lunch, followed by a plenary session to discuss the morning's work. The panel meeting continued after the plenary.

Reich: Is the reference model presented during the day before appropriate? How can it be made better?

There was general agreement that the original reference model, amended with Steve Louis's addition of a management/policy functional box, is a good start. The Louis variant of the model should be further refined by adding lines from the management/policy function box to all other function boxes.

In the discussion that followed, the following points were made:

o The ingest function can both accept and generate metadata.

o Do we need to include processing capability to generate products different from what's stored? Probably yes, but it's not quite clear how to reflect this in the model. It could go in the storage function or the distribution function.

It is possible to set management policies that do away with the need to include the generation of products within the model: when the administrators sense that customer requirements or standards have changed, they may choose to re-generate data products and the metadata describing them. This amounts to extracting data from the archive, performing some processing outside of the archive, and then re-ingesting the new results.

o The data model for ingest can be different from the model for dissemination.

Over time, the way in which users expect data may change (for example, a newer file format) and the archive may need to adapt to these changes by performing some processing on data it disseminates to put them into the proper schema or format expected by the user.

o We will need access models to help us understand and explain to others how users use the archive.

o Security is either in the management/policy box or it is distributed over several functional boxes (ingest, storage, distribution). The problem with the latter approach is that security needs to be "holistic" and well integrated across the entire system. The general feeling is that security policies are set by management and then enforced by an agent within the appropriate function (ingest, etc.).
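
The last point, security policies set by management and enforced by an agent within each function, might be sketched as follows. This is only an illustration under stated assumptions: the names SecurityPolicy and FunctionAgent, the two policy fields, and the rule that storage permits no direct user access are hypothetical and are not taken from the draft reference model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityPolicy:
    """Set once by the management/policy function (hypothetical fields)."""
    allowed_submitters: frozenset
    allowed_readers: frozenset


class FunctionAgent:
    """Enforcement agent inside one function box (ingest, storage, distribution)."""

    def __init__(self, function_name, policy):
        self.function_name = function_name
        self.policy = policy  # every agent refers to the same policy object

    def permits(self, user):
        if self.function_name == "ingest":
            return user in self.policy.allowed_submitters
        if self.function_name == "distribution":
            return user in self.policy.allowed_readers
        # Assumption of this sketch: storage is internal, with no direct user access.
        return False


policy = SecurityPolicy(
    allowed_submitters=frozenset({"producer"}),
    allowed_readers=frozenset({"producer", "scientist"}),
)
agents = {name: FunctionAgent(name, policy) for name in ("ingest", "storage", "distribution")}
```

Because all three agents consult one policy object, a change made by management applies across the whole system, which is one way to read the "holistic" requirement noted above.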

B. Vocabulary, Data Model, Policy/Operations Modeling Session

Participants

Jimmy Voyles - Pacific Northwest Labs., Tim Daniel - Central Imagery Office (CIO), Don Strebel - Versar, Inc., Sethanne Howard - NASA Hq., Arnold Rots - NASA/GSFC, John Turner - NASA/JSC, Betty Brinker - NASA/GSFC, Bill Taylor - Nichols Research, Don Sawyer - NASA/GSFC, Jim Thieman - NASA/GSFC, Tom McGlynn - NASA/GSFC, Ted Meyer - NASA/GSFC, Robert Hanisch - Space Telescope Science Institute

Discussion

The breakout group discussion began with a listing of issues of interest. These include:

o The definition of "archive".

o Linkage of reference model and existing standards

o The expense of conversion of databases between standards

There was a brief discussion about the last issue. The archive reference model is not expected to force databases to convert. The definition of archive will affect how many databases are expected to fit within the model. Should it include older data in the model? If it does, it still does not mean that the services are mandatory.

The reference model is intended to give a framework with common vocabulary for people to use and defined services at the interfaces. Tim mentioned that the CIO is redoing the databases of imagery within the DOD community. The databases are not interoperable. A broad reference model is needed in this case if it is to help in this problem.

The group next discussed the terms needing to be defined (the list was modified in later discussions to include the terms listed below):

Metadata                      Archive
Information                   Ingest
Granule                       Access
                              Dissemination 
                              Retrieval
                              Reference Model

Preservation                  Long-term/time scales
Maintenance                   Validation
Metadata Management           Catalog
Service                       Directory
Security                      Inventory
Data Modeling                 On-line
                              Off-line
                              Near-line
                              Products
                              Querying 
                              Browsing
                              Accept
                              Collect
                              Policy
                              Delete (data)
                              Retire (data)
                              Objects
                              Archive life cycle
                              Instance
                              Physical

Data was defined to be the representation form of information, or knowledge, that can be exchanged. This is an ISO definition.

Information is any type of knowledge that can be exchanged. This is an ISO definition.

Digital data - data represented by integer systems

There was a question of whether definitions already exist that could be adopted. One suggestion is the terminology in the U.S. document 44 USC - 36 CFR, sections 1228-1234.

We can define knowledge as a known fact.

Some examples of data are:

Digital data - obvious

Analog Data - Analog tapes, paper, film, specimens (e.g., biological, moon rocks)

The group believed the scope of the Reference Model should include analog data since many archives have that type of data and a single Reference Model for the archive, at some level, would be desirable.

It was agreed that Metadata would be defined as "data about other data."

It was agreed that an archive is a repository that preserves information for use by a designated community

Is an archive an information system?

There were questions of how much access is required - should be for a designated community

Archive definition, including long-term, may be problematic

- concerns about what is long-term for various communities were raised

Ingest - the incorporation of new information into an archive

Does this include storage and/or acquisition?

Does it include internal operations? (e.g. value-added creation and ingestion of that).

Are these value-added operations outside the archive? It was generally thought that they were.

If so, then no archived information is ever generated within an archive -

Is there an example of storage without ingest?

Going back to reference model paper

ingest included: accepting information; confirmation of receipt; validation; preparation of metadata

Maybe ingest box should instead be called "acceptance"
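
The four ingest steps listed from the reference model paper (accepting information, confirmation of receipt, validation, preparation of metadata) can be sketched as a single function. This is a minimal illustration, not the reference model's definition: the function name ingest, the IngestResult fields, and the use of a SHA-256 checksum as the validation test are all assumptions.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IngestResult:
    receipt: str                      # confirmation of receipt for the submitter
    valid: bool                       # outcome of the validation step
    metadata: dict = field(default_factory=dict)


def ingest(name: str, payload: bytes, expected_sha256: Optional[str] = None) -> IngestResult:
    """Accept a submission, confirm receipt, validate it, and prepare metadata."""
    # 1. Accept the information (here: the bytes handed over by the submitter).
    digest = hashlib.sha256(payload).hexdigest()

    # 2. Confirm receipt with details the submitter can check.
    receipt = f"received {name}: {len(payload)} bytes, sha256={digest}"

    # 3. Validate: non-empty, and matching the submitter's checksum if one was given.
    valid = len(payload) > 0 and (expected_sha256 is None or expected_sha256 == digest)

    # 4. Prepare metadata only for submissions that passed validation.
    metadata = {"name": name, "size_bytes": len(payload), "sha256": digest} if valid else {}
    return IngestResult(receipt=receipt, valid=valid, metadata=metadata)
```

Note that nothing in the sketch stores the data; that matches the open question above of whether ingest includes storage or is closer to "acceptance".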

Should we continue this archive standards effort?

Yes - carry forward and see if the effort progresses toward the goal

Data modeling -

Must a data model be in mind before archiving?

Is it really a metadata model we are discussing?

Some modeling has to be done to enable interoperability among archives

Scope should include archive-to-archive sharing

Needed to make user access effective

Required to convert data into usable information

Data models can be reflected in the metadata used and include relationships among the various elements

Users of archive need access to archive data model

Various levels of details in data models in archive and outside users may not be interested in all levels

Different data models are present

- Archive as a whole

- for data disseminated

Bottom line - data modeling is important to the overall archiving standards effort

Importance of data model depends on amount of interoperability needed

Might be useful to identify common elements of data models

- structures such as dictionary or thesaurus

Data model does not encompass reference model since it does not include process

What are generic elements of the data models?

There are some - categories such as "units," temporal, spatial

Following lunch, a brief, unscheduled, plenary session was held to assess the progress.

Reflection on plenary session

A goal of an archive is to preserve information from degradation and corruption. This must include digital and analog data. A tradeoff between access and degradation was mentioned, since it seems the best-preserved data is also protected from access.

Common services (photographs, copies, catalogs) can be identified among archives for interoperability and standardizing user interface. Common tools may result (having prototypes was mentioned). Guidelines can be provided, applicable to certain types of archives.

Concern about not giving out "original" data -

"what is original in digital data?"

Specimen data are given out, however, and they may not be returned in the same condition as given out. For example, moon rocks are released to a researcher, a small sample is removed and analyzed to obtain new information, and the remaining specimen is returned.

Some data (specimens, digital) are used to get or evolve to other data

Perhaps change our archive definition to:

a repository that intends to preserve information for use by a designated community

Also, change digital data definition to representation by digits

Concern about "preservation" because some data may be wrong and have to be deleted

General feelings that the definitions are close enough to go forward

What should be the result of this archive standards effort after 3 years?

Should be at least 2 instantiations of reference model by then

If model is broad enough it may already fit many archives

Some amount of interoperability is important - demonstration of this is important

Retrofitting of existing archives-

Definition of a public interface applied in a common way -

GILS may be example -

Sufficient framework that an interface could be written without intimate understanding of archive

o Ability to understand and evaluate all archives with regard to:

- use of new technology

- individual archive experience

- migration issues

o Assisting in education of new archivists

- resulting in ability to improve archive functions of new and existing archives

o Improve ability to cost the development, maintenance, and operation

o Ability to identify what is an archive and a clearer view of how to provide public interfaces to archives

o Better vendor support in addressing archive needs

Issues

o How to characterize archives by time scales - long-term vs. short-term

o Should we start with archive reference model as a box with external interfaces?

o Overall scope of reference model?

C. Media, Hardware, Software Session

Participants

Joel Williams - Mitre Corporation, Paul Grunberger - Applied Physics Laboratory, Alan Wood - NASA Life Sciences, John Garrett - Hughes STX, Russell Cryder - Naval Warfare, David Skinner - Fujitsu, Mike Martin - NASA/JPL, Larry Langdon - Census Bureau, Linda Kempster - IIT Research Institute, Patricia Carreon - NASA/GSFC

Paul Grunberger (APL Johns Hopkins): Concern about the problem of recovering data after the OS and other support software changes.

Alan Wood (Life Sciences): Concern about data dictionaries--need these to be able to communicate effectively.

John Garrett (Hughes STX): Major interest in database work. Would like to find out more information about the media and hardware as these things relate to archiving.

Russell Cryder (Naval Warfare--Telemetry): Concern about lost data, and lack of funding for a place to put tapes, or to put them on denser media. Many tapes lost because of non-existent environmental controls, and clerical error (went into the large object compactor along with an automobile).

David Skinner (Fujitsu): Need to define the interface to the black box which is the archive. That is the real work of the standards committee. Would like to see:

o a set of requirements on data systems and storage systems.

o define 'archive', 'archiving', and in general a set of standard terms

o understand legal requirements on archiving

o a standard architecture

o standard processes and policy, and a means of communicating the policy

o standard metrics

o end of process - 3-5 years out - standards for interoperable systems

Their current estimate for the cost of storage is $5-$7 per megabyte when items like labor are included. They estimate the cost to recreate a megabyte of lost data at $1250.
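
To put the two estimates side by side, the short calculation below divides the quoted recreation cost by the quoted storage cost. Only the dollar figures come from the discussion; the variable and function names are illustrative.

```python
STORAGE_COST_PER_MB = (5.0, 7.0)  # dollars per megabyte, low and high estimates quoted above
RECREATE_COST_PER_MB = 1250.0     # dollars per megabyte of lost data, as quoted above


def recreate_to_storage_ratio(storage_cost, recreate_cost=RECREATE_COST_PER_MB):
    """How many times more expensive recreating a megabyte is than storing it."""
    return recreate_cost / storage_cost


low, high = STORAGE_COST_PER_MB
ratio_at_low_storage_cost = recreate_to_storage_ratio(low)    # 1250 / 5
ratio_at_high_storage_cost = recreate_to_storage_ratio(high)  # 1250 / 7
```

On these figures, recreating lost data costs roughly 180 to 250 times as much per megabyte as storing it.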

Mike Martin (JPL): Is currently migrating from 7-track tapes to CDs. Wants to communicate with others about real-life experiences. Thinks it is important to test the media in a particular batch before using it.

Larry Langdon (Census Bureau): Problem of retaining institutional memory in a large and long-lived archive. Sees the need for metadata.

Linda Kempster (IIT Research Institute): Knows everything about the characteristics of every piece of removable media, and took a year off to write a book. Sold several of her books to members of the group.

General discussion:

Analogy of putting the compression algorithm with the compressed data. You should store the programs, metadata, calibration data, etc. which are used to interpret the data along with the data. But if it compiles now, will it compile later?
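
A minimal sketch of the analogy, assuming Python's standard zlib module as the compression method: the package records which algorithm was used next to the compressed payload, so a later reader knows how to interpret it. The names package, unpack, and DECOMPRESSORS are hypothetical.

```python
import zlib

# Registry of decompression routines the reader still has available.
DECOMPRESSORS = {"zlib": zlib.decompress}


def package(data: bytes, description: str) -> dict:
    """Bundle compressed data with the metadata needed to interpret it."""
    return {
        "metadata": {
            "algorithm": "zlib",          # which method produced the payload
            "original_size": len(data),   # lets the reader check the result
            "description": description,
        },
        "payload": zlib.compress(data),
    }


def unpack(pkg: dict) -> bytes:
    """Recover the original data using only what the package itself declares."""
    algorithm = pkg["metadata"]["algorithm"]
    data = DECOMPRESSORS[algorithm](pkg["payload"])
    if len(data) != pkg["metadata"]["original_size"]:
        raise ValueError("recovered size does not match recorded size")
    return data
```

The sketch stores only the name of the algorithm, not the program itself, so it still depends on a working zlib implementation being available later. That dependency is exactly the concern raised above.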

Additional perspectives from this group are contained in the plenary report viewgraphs below.

Plenary Reports from Sub-Groups

A. Reference Model Group Report

VG 1:

Characteristics of an Archive

o Archive holds data that are correctly and independently usable by the intended user community when withdrawn

o Metadata must be sufficient to be used by the intended user community

o Policy deals with deletion and migration

o Categories of data supported

- Replicable digital data interfaces are intended for standardization (interfaces to storage)

- Any type of data may be pointed to by an archive

o An archive has a single management/stewardship domain

Variable Factors (do not matter)

- Lifetime of an archive

- Value added services (level of user services)

- Level of openness

VG 2:

Extensions to the Reference Model

o Security - yes

o Policy / Operations - yes

o Distribution - not federation

o Transactions - no; version control

B. Vocabulary, Data Model, Policy/Operations Model Group Report

VG 1:

Definitions:

Data: Representation form of information

Information: Any type of knowledge that can be exchanged

Digital Data: Representation by digits

Type of data: analog tapes, paper, film, specimens, digital data

Metadata: Data about other data (one-way pointer implied)

Archive: A repository that intends to preserve information for use by a designated community

It was suggested that the long list of terms in the minutes of the subgroup needed to be defined in the context of the reference model.

VG 2:

Data Modeling

o Needed to convert data into useful information

o Needed to make user access effective

o Users of archive will need access to the archive data model

o Different data models are present

- archive as a whole

- for the data disseminated

o What are the generic elements of the data model?

- There are some - categories such as 'units', 'temporal', 'spatial'
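As an illustration only, the generic categories named above could be carried in a small record like the one sketched here. The class name `GenericDescription`, its field layout, and the coordinate and date conventions are assumptions made for this sketch, not anything the group agreed on.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class GenericDescription:
    """Hypothetical holder for the generic data-model categories."""
    units: str
    # Assumed convention: (start, end) as ISO 8601 date strings.
    temporal: Optional[Tuple[str, str]] = None
    # Assumed convention: (west, south, east, north) in degrees.
    spatial: Optional[Tuple[float, float, float, float]] = None

    def categories_present(self) -> List[str]:
        """List which generic categories this description fills in."""
        names = ["units"]
        if self.temporal is not None:
            names.append("temporal")
        if self.spatial is not None:
            names.append("spatial")
        return names


desc = GenericDescription(units="K", temporal=("1995-10-11", "1995-10-12"))
print(desc.categories_present())
```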

VG 3:

Three Year Impacts

Reference Model

o Ability to understand and evaluate all archives with regard to:

- use of new technologies

- individual archive experience

- migration issues, etc.

o Assisting in education of new archivists

o Greater ability to improve archive functions of new and existing archives

o Improve the ability to cost the development, maintenance, and operations of archives

o Ability to identify what is an archive, and a clearer view of how to provide for public interfaces to archives

o Better vendor support in addressing archive needs

VG 4:

Issues

o How to characterize archives by time scales - short term, long term

o Should we start with an archive reference model as a box with external interfaces?

o What is the overall scope of the reference model?

o Who is the intended audience for this reference model?

C. Media, Hardware, Software Group Report

The purpose of the first viewgraph is to show the archiving activity surrounded by a collection of other activities for which there are either established practices or standards. Part of our activity in this group should be to identify the standards and practices in those "helper" disciplines that make the archiving activity easier and more secure.

VG 1: (transformed to text format)

    Security        Hierarchical Storage System

                                                    Data Base
 Backup and
  Recovery            ARCHIVAL SYSTEM       
                                                     File System
    Digital and
     Analog Data                                Metadata
                      Transaction                 Management
                       Processing

o Incorporate existing standards

o What are the appropriate standards?

VG 2:

Key Issues

- Migration (KEEP IT SIMPLE)

o Media

o Metadata

o Systems Software

o Archive Software

o Application Software

- Archive differs from other systems with lots of data

o Long Term

o Accessible by outsiders

o Perhaps infrequent use

o Document to non-specialists

VG 3:

Activities

o Look for existing standards

o Find out statistics on reliability of media

o Web page

o Broader attendance

- Oil-geophysical industry

- Vendors

VG 4:

Observations

o Broad - wide ranging activity

o Need to determine meeting schedule

o Security is an issue

Future Planning

A major objective for this workshop was to prepare material for the ISO/CCSDS workshop taking place 24-26 October. It was proposed that the results of the subgroup deliberations, even though not 100% consistent, be taken to ISO. In addition, Lou Reich volunteered to update the draft paper on the Reference Model by adding a Policy function and by casting it into an object-oriented framework before taking it to the ISO/CCSDS workshop.

Don identified the following planned/proposed meeting schedule:

Meeting                             Dates            Location

International ISO/CCSDS Workshop    24-26 October    Oxford, England

US Core Group meeting               19-20 December   Washington area, hosted by Mitre

US Core Group meeting               March            Need host volunteer

International ISO/CCSDS Workshop    29-30 April      Pasadena, CA

It was proposed that the Core Group would consist of those who could devote 10% or more of their time to this effort, and that they would generate new material and attend meetings. Of those still in the room, the following individuals identified themselves as expecting to participate at this level:

Steve Louis, Mike Van Steenberg, Paul Grunberger, Linda Kempster, Joel Williams, John Turner, Russell Cryder

The participation forms turned in at the end of the meeting had not yet been reviewed. In addition, we know there are many others out there who could not attend, or have yet to hear of the activity.

Those who are not generating material will be able to participate by reviewing and commenting on the material, or by just tracking the effort. The material generated will be put on the Web site and, if you do not have Web access, will be provided by other means. You will also be able to register your desired level of participation through the Web registration form.




URL: http://ssdoo.gsfc.nasa.gov/nost/isoas/us01/minutes.html

A service of NOST at NSSDC. Comments and suggestions are always welcome.

Editor: Don Sawyer (sawyer@ncf.gsfc.nasa.gov) +1.301.286.2748
Curator: John Garrett (garrett@ncf.gsfc.nasa.gov) +1.301.441.4169
Responsible Official: Code 633.2 / Don Sawyer (sawyer@ncf.gsfc.nasa.gov) +1.301.286.2748
Last Revised: October 22, 1995, Don Sawyer (January 30, 1997, John Garrett)