ISO Archiving Standards - Fourth International Workshop - Minutes

DLR
Oberpfaffenhofen, Germany

3-8 November 1996


Present:
BNSC   David Giaretta     DG
       Christiane Nill    CN
CNES   Patrick Mazal      PM
       Denis Minguillon   DM
       Claude Huc*        CH
       Hubert Biondi      HB
DLR    Martin Pilgram     MP
ESA    Nestor Peccia      NP
       Gian Maria Pinna   GP
NASA   Don Sawyer         DS
       Lou Reich          LR
       John Garrett       JGG
       Bob Stephens       BS
       Mark Hei           MH
       Betty Brinker      BB
       David Townley      DT
       Stephen Voels*     SV
       Alan Wood*         AW
NASDA  Yoshio Inoue       YI


WP700 - Archiving

DS informed P2 that ISO does not require a conformance section.

The next OAIS version will be issued on July 1, prior to the next US workshop. Before the next P2 meeting, NASA will host two US workshops (July, September).

The WP700 Status Report is attached.

DS stressed the importance of P2 following and meeting the ISO schedule.

DS provided an agenda for the DA meeting. He noted that not everyone had had a chance to review the latest version of the RM as well as the other papers. It was noted that some time might have to be devoted to the reading of the document since NASA was not planning any specific presentation on this latest version.

DS indicated there would be a review of CH and CN comments.

The fall IWS will be at ESRIN and the first two (plus) days, Oct 27 and 28, are again proposed for the Archiving Workshop.

Progress report re. recent DA activities

DS reported progress which included:

He stated that ISO does not need a conformance section in a Standard. Therefore DS believes we should return to the original plan, i.e. the RM should be a Standard as opposed to being a Report.

The DIS (or CCSDS RB) should be given to ISO as opposed to waiting until a CCSDS document becomes Blue. This will be discussed by MC. Currently, ISO just puts its cover sheet on each CCSDS BB and distributes them for review. This change in procedure would reduce the number of review cycles needed by one and save about a year in the review period.

NP suggested we try this with each P2 document.


Review Status of action items (DS)

The status of the archive activity action items is as follows:

P/9611/27 Closed.
P/9611/28 Closed
P/9611/29 Presentation in this meeting
P/9611/30 Closed (N12)
P/9611/31 Open
P/9611/32 Open
P/9611/33 Closed
P/9611/34 Closed
P/9611/35 Open
P/9611/37 Closed
P/9611/38 Closed
P/9611/39 Closed
P/9611/40 Closed
P/9611/41 Closed


OAIS Promotion Report

DS gave a presentation of the OAIS RM promotion activities (slides attached).

DS talked about four promotional activities re. the RM in which he has participated:

All are interested in the RM.

A major concern from NSSDC was how this model would work with other federated systems which are underway.

DS felt the Control Authority is a confederated system in itself.

We need to focus more on applications of the RM.

DS noted that the RM is getting increased visibility and encourages other agencies to look for promotion opportunities.

NP reported that DARA (Germany) has developed a common archiving system (Oracle-based).

Action Item: NP to provide to DS more detailed information on this system by June 4.
NP/9705/31


CEOS archiving session

CH reported on the first CEOS Archiving Task Team meeting (slides attached). The OAIS RM was not presented at that meeting.

He described the objectives and benefits as well as the program content and plans. There were two parts to the meeting:

He noted two points in the program:

For telemetry data, digital data needs to be resynchronized.

He indicated where and how he felt CCSDS could assist as:

CH posed and answered the question re. "What could CCSDS suggest?"

He discussed future plans for the activity which would include consideration of the model.

DS: CH's proposed activities would be useful things for CEOS to do. CEOS looking at our vocabulary would be a KEY activity.
GMP: CEOS had considered a common vocabulary and has almost given up on the activity, feeling it would be nearly impossible.
DS: the glossary cannot be done successfully until there is a common model and understanding; only then can terms be resolved. CEOS should look at the RM as a basis for deriving a glossary.

Re. the system view, it was noted we can use the RM as an aid in this area. It was reported that there is no real "CEOS archive" outside of existing archives. CEOS' intent is to facilitate access to a distributed system; it was noted that this is also the OAIS objective.

In the next CEOS meeting, a CCSDS P2 representative should give a presentation on the OAIS RM.
LR volunteered to be present if the PTT and ATT meetings are co-ordinated.

If nobody is able to attend, then Wyn Cudlip, who is already CEOS liaison to CCSDS P2, should give a presentation.

Action Item: DS to take action to ensure that CCSDS P2 is presented at the next CEOS ATT meeting.
DS/9705/32

NP pointed out that the title "archive" might easily lead to confusion, since "archive" is defined differently in diverse projects, committees and organisations. GMP stated that CEOS might be concerned by the title "archive", since they will understand it to mean storage. He proposed that a different title be chosen for presentations to CEOS.

LR proposes to make a presentation to the PTT, which is at the same time as the WGISS meeting.

CH stated that the next two meetings will be critical to achieve compatibility between CEOS/CCSDS archiving efforts. He recommended a strong CCSDS presence at these meetings.

LR: ICS (interoperable catalog system) exists within the PTT.
DS: if one just talks about storage and then the surrounding system, one will end up with a system similar to the OAIS. We need to raise the issue with CEOS that there is no incompatibility between the two activities. We also must avoid the inference that the RM is an ARCHITECTURE. CH's recommendation 3 should be done, but by whom? CCSDS is not in a position to do this. We can only suggest topics since these are CEOS tasks.

It was suggested that we wait until the task team's reaction to the RM has been ascertained.

LR: Want our model to interface with IEEE model down to the storage level

Action Item: All agency representatives to brief their CEOS counterparts about OAIS activities, by June 20.
All/9705/33

Although ESA currently does not have a CEOS representative, GMP indicated his interest in continuing to work with CEOS.

There was some discussion of real implementation cases.

GMP reported that a specific commercial system (AMASS) had been selected for their archive management system, but that ESRIN wanted to be able to keep the original format. The solution was that EMASS (the company that develops AMASS) agreed to provide a specialised component to support the ESRIN foreign format.

NP stated that in astrophysics the archiving format and distribution format are identical.


CNES comments on OAIS (N10)

Chapter 1: no problems with comments.

Chapter 2:

Discussion points for OAIS RM White Paper Review:

Chapter 3:

3.2.1 re. Logical Model


BNSC comments on OAIS (N11)

Section 1:

Section 2:

Section 3


Reference Model schedule

ISO DIS CCSDS RB May 1998
ISO IS CCSDS BB Nov 1998


Review of OAIS White Book, Version 1

Section 1.1 Purpose and Scope comments:
AW:

Discussion:

GMP: mainly focused on preservation and not on access, dissemination, etc. Exploitation of data is essential as well.

CH: two goals: first to preserve the data and second to allow usability of the data in the long-term (access, dissemination).

DG: questioned the meaning of "expands consensus on the requirements"

data OR the consensus on the techniques for archiving data.

P2 review of US Workshop comments to purpose and scope:

Updates to White Book - Section 1.1 Purpose and Scope:

Action Item: DS/LR to provide a new scope/purpose by June 14 according to the decision above. DS/LR/9705/34

It was agreed that the White Book should include an abstract in the Foreword.

Action Item: LR to write an abstract for the next version by September 30. LR/9705/35

Section 1.2 Applicability comments:
DG: clarify the point that an organisation might be temporary, but might be responsible for long term preservation.

Section 1.3 Rationale comments:
AW/GMP: does not need to be there, redundant with 2.1

Section 1.4 Road-Map comments:
GMP: bullet 3: is the user a human or can it be a machine? Also would like to add an extra bullet for Archive Digital data Identification (e.g. file-naming convention).

Updates to White Book- Sections 1.2 to 1.5:

Action Item: DS to update White Book according to these agreements by July 1.
DS/9705/36

Section 1.6 comments:
GMP: introduce client and then define consumer using this term. Is there a requirement for an OAIS archive to minimise redundancy? Not defined currently.

CH: discuss definition section, later, after all definitions have been clarified.

Updates to White Book Section 1.6 Definition:

Action Item: DS to update White Book Section 1.6 according to these agreements by July 1
DS/9705/37

Comments to Section 2:
NP: section 2 includes too much detail and page 2-4 is difficult to understand.

DG pointed out that Access Control is not sufficiently clear.

Updates to White Book Section 2:

Action Item: DS to update Section 2 according to these agreements by July 1.
DS/9705/38

The discussion raised the question of what the minimum set of requirements for an OAIS is.

Archival Information Package (AIP):

DS explained, as editor, his understanding of AIP:

CH:

CN agreed with CH's statement.

CH: when you change the format of an image you have to change completely the representation information, but the date remains unchanged

DS: the model needs to be applicable to all implementations

CH: in the same file the image and all the parameters (time, calibration parameters) are contained. They are all content information. Another point relates to the example in the White Book of calibration as representation information. In some disciplines calibration is a complex process, so the data is preserved at a low level (engineering values).

LR: content information, as he understands CH to be referring to it, consists of data object, attributes and representation information. What is content information?

DS: content information is the information you try to preserve

NP gave the example of FITS files that include several calibration files. If calibration files exist, the current model of data object and representation information is not sufficient.

LR: to an archive, content can be anything but the encoding.

GMP:

DS: this could happen.

GMP: you would like to have full control over how the data is stored.

DS: the OAIS does not state how to map it into a real archive.

AW: confusion of preservation and representation information. Example: found tape and recognised what mission it came from. What is now representation/preservation information?

DS: the tape bits could be identified as the primary digital object for preservation. Thus the representation intended is that which is needed to understand these bits.

NP: it would be very useful to have at the end of the document several examples of representation information, showing how it maps to different disciplines.

Conclusions on this topic:

Action Item: DS will provide examples on representation information by September 30.
DS/9705/39

Action Item: GMP to provide, by June 15, an example of how the ESA EO data is stored, what the metadata is, who the designated user community is, how the data is intended to be used, and the representation information useful to exploit the data.
GMP/9705/40

AW: representation information in the glossary is defined to only refer to digital objects

LR: catalogue information is the preserved part of the object descriptive record

CH:

LR: the physical access can be attained via the access methods.

NP: preservation description information questions: not all data in the field is searchable.

Action Item: Include in Section 2 a brief overview of how users access the data, by July 1. DS/9705/41

Action Item: Move material in section 2.3 before 2.2 by July 1.
DS/9705/42

All panel members agreed that the minutes (this document) will reflect the outline of the next version (July 1) of the White Book document.

Action Item: Update the paragraph under Figure 2-5 (on page 2-8) and the first paragraph on page 2-9 by July 1.
DS/9705/43

LR: the federated concept needs to be explored further.

Action Item: LR to provide more details in the next White Book on levels of interoperability amongst archives and on the federated archives concept by September 30.
LR/9705/44

LR: any consumer can have different roles: ad-hoc and subscription

Action Item: Change the text in 2.3.2 to reflect more that they are activities rather than classes. Make bullets from text, by July 1.
LR/9705/45

DG: the designated community can change

LR: if designated community changes it becomes another archive

DG: preservation information depends on designated community

DS: what are the classifications for designated communities?

AW: for life sciences a new community has been added

LR: if long term preservation over a changing community is part of the archive's mandate, then this needs to be considered at ingest. This is a classification issue.

DS: not an issue for OAIS responsibilities

DG: this should be stated explicitly as responsibility

Action Item: Add to 2.4.2 sentence according to the following lines: "determining the designated community needs to include evolution perspectives" DS/9705/46

DG: responsibilities in terms of access control are missing; include a bullet in 2.4.

Action Item: Add to 2.4.6 some sentences about access control by July 1.
DS/9705/47

CN: collection is discussed in 2.4 without prior explanation of the concept.

Action Item: Keep the 2.4 bullet list in Section 2 and make the rest of 2.4 a completely new section that addresses CN's and DG's concerns.
LR/9705/48

Review of Section 3:

GMP: without an indication of what kind of flow they represent, the lines in Fig. 3-1 do not add any value.

GMP: it seems that each SIP needs to be reviewed by archival staff. (3.1.2)

AW:

SF: we have PIs who take data out of the archive, process it and return it to the archive.

LR: many of the functions are optional; these functions are the maximum set, the responsibilities are the minimum set.

CH: 'error checking' (p 3-4): statistically acceptable assurance is very vague.

LR: a checking function for degradation in storage is needed. Currently only transfer/migration error checking is done.

CH: report request is not especially clear

DG: support of negotiation during access on format

LR: you select from the options offered; you do not negotiate a new one.

Changes to Sections 3.1 - 3.1.7:

Action Item: LR to implement changes to Section 3.1 - 3.1.7 as agreed above by July 1
LR/9705/49

LR: 3.1.8 (matrix) will become an annex, since its main purpose is to find out if there are overlapping sub-functions between functional areas.

Action Item: Move 3.1.8 into an annex and provide more text by July 1.
LR/9705/50

The diagrams in 3.1.9 will be replaced by SIL/97/P2/N9. Any comments on 3.1.9 are welcome.

GMP: Different font sizes are confusing in 3.1.9

Action Item: clarify different font sizes in 3.1.9.
LR/9705/51

DS stated that fixity is a kind of 'check-sum' information.
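
As a rough illustration of fixity as check-sum information (a sketch only, not part of the RM): a digest is recorded when an object is ingested and recomputed later to detect change. The choice of SHA-256, the function names and the sample byte strings are assumptions made for this example.

```python
import hashlib


def fixity_digest(data: bytes, algorithm: str = "sha256") -> str:
    """Return a hex digest that can serve as fixity information for an object."""
    return hashlib.new(algorithm, data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str, algorithm: str = "sha256") -> bool:
    """Recompute the digest and compare it with the value recorded at ingest."""
    return fixity_digest(data, algorithm) == recorded_digest


# Hypothetical stored object and a copy whose content has changed.
original = b"telemetry frame 0001"
recorded = fixity_digest(original)      # kept alongside the object at ingest
altered = b"telemetry frame 0002"       # simulated change while in storage

unchanged_ok = verify_fixity(original, recorded)
altered_ok = verify_fixity(altered, recorded)
```

Running the same comparison periodically over stored objects, and not only after a transfer or migration, is one way to obtain the storage check that LR asked for in the Section 3 review.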

AW: catalogue information is not always extracted; it can also be created.

LR:

type 1:

type 2:

LR: raised the question if this is a design issue or if this should be included in the RM.

CN: stated that this distinction is important, but might be better addressed in a different book, e.g. an associated standard to RM.

CH: we need new terminology to describe these collections. We have defined a graph to the data, and a part of the metadata is made of existing data.

AW: in life science: We archive by missions, they share common hardware and data, but we also create collections.

GMP: at implementation level, one needs to make a decision about the information granularity.

DS: the RM is a checklist for whether all information is provided.

CN: pointed out that the time and budget for the definition of the OAIS RM are limited and that it would be better to give a reference to a future standard still to be defined.

The conclusion of the panel was that collection types are an important issue that could be addressed in an associated RM standard, but that the level currently contained in the RM is sufficient.

CH: propose to delete catalogue information, since it causes confusion with object descriptive records

A discussion took place about sub-setting and where it is located.

DS: in the RM, the dissemination function currently supports the processing.

GMP:

DS: we need to make clear that we do not exclude sub-setting (extraction) of data

DG: expressed his concern that we are getting close to designing a system.

NP: where are algorithms (e.g. data mining) stored? Without these algorithms no access to the archive can be made.

Action Item: LR to investigate where algorithms are contained in the model, by September 30. LR/9705/52

CH: it is not clear what the catalogue information is or why it is needed.

Action Item: LR to either clarify catalogue information or delete it from the RM by September 30.
LR/9705/53

Section 3.2 updates:

Action Item: LR to update section 3.2.1 according to P2 agreements above by July 1.
LR/9705/54

Discussion of SIL/97/P2/N8:

DS: everything has representation information, but currently the model shows it only for the data object. A general object, the 'information object', is introduced.

DG: in Fig. 1, showing that Representation Information can be an aggregation of one or more Representation Information objects is superfluous, since this is already expressed by a specialisation of the information object.
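
To make the 'information object' idea more concrete, here is a minimal Python sketch. It is not taken from SIL/97/P2/N8. It assumes that every information object carries some data plus a list of further information objects that act as its representation information, which is one possible reading of the recursion discussed here. The class name, the field names and the FITS/ASCII example objects are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InformationObject:
    """Data plus the representation information needed to interpret it."""
    name: str
    data: bytes
    # Representation information is itself made of information objects,
    # so no separate aggregation type is needed in this sketch.
    representation: List["InformationObject"] = field(default_factory=list)


def representation_chain(obj: InformationObject) -> List[str]:
    """List the names of all representation information reachable from obj."""
    names: List[str] = []
    for rep in obj.representation:
        names.append(rep.name)
        names.extend(representation_chain(rep))
    return names


# Hypothetical example: an image whose format description in turn relies
# on a character-set definition.
ascii_spec = InformationObject("ASCII definition", b"...")
fits_spec = InformationObject("FITS format description", b"...", [ascii_spec])
image = InformationObject("image file", b"\x00\x01", [fits_spec])

chain = representation_chain(image)
```

In this sketch the chain ends at an object with no further representation information; where such a chain should stop in a real archive is a question the sketch does not answer.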

LR: not comfortable with the idea that everything is a physical object. Only the content information should be a physical object. There is no problem with all objects having representation information.

CN: pointed out that the purpose agreed two days ago was only about the preservation of digital information, with digital metadata and not physical metadata

DS: primary (digital) objects and secondary objects (e.g. paper documents)

LR: presented OMT models, where only Representation Information and Data objects were also physical.

DS: in Figure 5/6 'Information Collection' could be changed into 'Information Unit'.

NP: pointed out that it will be important how the new material is presented and proposed an approach from 'general' to 'specific'. It was noted that 'Unit' seems more low-level than 'Package' and a new name should be found. It was agreed that AIP will be renamed AIU and AIU will be renamed AIP.

Action Item: LR to change throughout the document AIP into AIU and vice versa by July 1.
LR/9705/55

Updates to Figure 3-9:
use OMT methodology
make figures more readable
move access box to the side where the dissemination box is
no lines through access - should have defined data flow in and out
what are the data objects between consumer and access?
'catalogue metadata' should be changed into 'descriptors'

Action Item: LR to change Figure 3-9 according to above agreements, by July 1.
LR/9705/56

Action Item: Scan CNES and BNSC comments on OAIS Version 8 and address all comments in next version, by July 1.
DS/9705/57

Review of Section 4:

CH described three levels of migration:
level 1: move storage objects to other media
level 2: change relationship between AIP and object (e.g. compression/not compression)
level 3: change representation of storage object
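
The three levels can be written down as a small Python enumeration. This is only a sketch: the member names, the three flag parameters and the rule of reporting the highest level touched are assumptions made for illustration, not something agreed by the panel.

```python
from enum import IntEnum
from typing import Optional


class MigrationLevel(IntEnum):
    MEDIA = 1           # level 1: move storage objects to other media
    PACKAGING = 2       # level 2: change relationship between AIP and object
    REPRESENTATION = 3  # level 3: change representation of storage object


def classify_migration(media_changed: bool,
                       packaging_changed: bool,
                       representation_changed: bool) -> Optional[MigrationLevel]:
    """Return the highest migration level touched, or None if nothing changed."""
    if representation_changed:
        return MigrationLevel.REPRESENTATION
    if packaging_changed:
        return MigrationLevel.PACKAGING
    if media_changed:
        return MigrationLevel.MEDIA
    return None


# Hypothetical cases: copying a tape to a new medium, and compressing a
# stored object while also moving it to a new medium.
tape_copy = classify_migration(True, False, False)
compression = classify_migration(True, True, False)
```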

Panel 2 members agreed to only state the migration issues and not their solutions in the RM.

Review of Section 5:

DS: the number of classification criteria has been reduced.

NP: all archives to his knowledge fit in the classification criteria.

LR: create a matrix of the current 9 classification criteria.

Action Item: Add text to section 5 to explain why these classification criteria matter by September 30.
DS/9705/58

LR: next generation of archives

GMP:
current archives are very static; new archives might be more user oriented
in Europe there will be enormous improvement of network capabilities
sub-setting would need to be implemented in the "Storage" box

Review of Section 6:

NP and MP stated that Fig. X-1 is not easily understandable by non-EO data archives.

LR: 3-3 is the global example and Section 6 gives specific examples. There are two types of scenarios: illustrative and archive scenarios.

Action Item: CH/GMP to write a scenario according to the scenario template (US workshop proposed edits) by September 23.
CH/GMP/9705/59

There will be document control on the White Book; all action items raised during this meeting will be officially responded to and traced in the next versions.


Additional Archiving Standards

NP: Submission and Dissemination packages are areas for standardisation

DG: very short standards with respect to time and size

DS:
other container standards would need to be investigated for SIP and DIP standards
start new standards by Red Book phase of OAIS (after 5-6 months)

LR: discipline standards

Action Item: DS to draft work package for ISO accreditation, by May 22. DS/9705/60

Action Item: DG to submit White Book to ISO as draft standard by May 27
DG/9705/61


Archive Workshop Schedules

The following schedule for the OAIS RM was agreed:
White Book (Version 2) July 1
Comments by agencies August 1
WB (Version 3) September 30
Comments October 15
Review IWS
WB (Version 4) February 1

Concept paper for archiving standards:
Version 1 July 1
Comments July 31
Version 2 August 15
Review at US Workshop September 15
Version 3 October 1




URL: http://ssdoo.gsfc.nasa.gov/nost/isoas/int04/minutes.html

A service of NOST at NSSDC.

Editor: Nestor Peccia, Bob Stephens, Christiane Nill, David Giaretta
Curator: John Garrett (garrett@ncf.gsfc.nasa.gov) +1.301.441.4169
Responsible Official: Code 633.2 / Don Sawyer (Donald.Sawyer@gsfc.nasa.gov) +1.301.286.2748
Last Revised: (20 June 1997, John Garrett)