ISO Archiving Standards - Second International Workshop - Minutes
NASA/JPL
Pasadena Hilton
Pasadena, CA, USA
29-30 April 1996
(NOTE: We invite all participants to critique these minutes and
to offer updates on any significant points they feel are missing or
inadequately reflected.)
The Second International Workshop on ISO Archiving Standards was hosted
by NASA/JPL at the Pasadena Hilton in Pasadena, CA, USA,
on 29-30 April 1996.
Table of Contents
Participants
Status to Date
French Archiving Workshop
Towards a Metadata Model
Archive Information Services Reference Model
Catalogue Interoperability Protocol
Discussion Points on the AISRM
Purpose and Scope of the AISRM
Table of Contents of the AISRM
Discussion on the Definition of an "Archive"
Conclusions
Planning of 1996 Meetings
CEOS Liaison Report
WP700 Status Report
List of Registered Documents
Action Item List
Don Sawyer distributed a meeting agenda [Item 1], available at
http://www.nasa.gov/nost/isoas/int02/agenda.html,
which was accepted and followed during the meeting.
The list of attendees is available at:
http://www.nasa.gov/nost/isoas/int02/participants.html
The following were present:
| Affiliation | Name | Initials |
|---|---|---|
| BNSC | David Giaretta | DG |
| BNSC | Steve Fisher | SF |
| BNSC | Wyn Cudlip | WC |
| CNES | Patrick Mazal | PM |
| CNES | Claude Huc | CH |
| ESA | Nestor Peccia | NP |
| ESA | Christiane Nill | CN |
| ESA | Steve Smith | SS |
| DLR | Martin Pilgram | MP |
| DLR | Manfred Drexler | MD |
| NASA | Don Sawyer | DS |
| NASA | Lou Reich | LR |
| NASA | John Garrett | JGG |
| NASA | Mike Martin | MM |
| NASA | Randy Davis | RD |
| NASA | Alan Wood | AW |
| Carleton University | Ekow Otoo | EO |
| Mitsubishi | Junichi Oshima | JO |
| NWAD | Russ Cryder | RC |
| NSPO | Jun-ji Lee | JL |
| SAC/CSIR | Michele Le Saux | MLS |
Status to Date (DS)
DS presented the results of the archiving task to date. This included:
- the definition of terminology
- an initial archiving reference model
- a definition of the scope (which was clarified to include all domains, not
just space data)
- a list of scenarios (not all of which have been produced)
The major activities to date were listed, including the various workshops at
each agency and the dates of issue of the Reference Model paper and related
scenarios.
French Archiving Workshop (PM)
PM presented a summary of this workshop (21/22 March 1996), but most of the
papers were in French.
The aims of the workshop were:
- to detect the real need in the domain of data archiving (generic vs.
specific needs)
- to evaluate how the archive problem is perceived
- to learn from others' experience
- to enroll new people in the ISO effort
52 people attended, including national organisations and industry.
Representatives from the library services, electricity services and the defense
industry were also present.
23 presentations were organised in four sessions as follows:
- Requirements (4) where four domains were identified:
- preservation of scientific data and associated info (metadata) needed in all research organisations
- preservation of technical data (drawings, CAD, specific documents) around standards like STEP, SGML
- national heritage concerns (French National Library)
- geographical data preservation with operational access constraints specific to defense domains
- global archiving (8)
- storage and physical preservation of data (6)
- data structure and metadata (5)
The main results are as follows:
- interest in the IEEE Reference Model (autonomy between boxes)
- Common Storage Approach (preservation of digital data on various media)
i.e.,
- problem of changing media with time
- general interest to manage the physical medium independently of the data
- The boundary between data and metadata is not the same for everybody
- The POSC (Petrotechnical Open Software Corporation) has produced a model called
EPICENTRE using EXPRESS
- The National Storage Laboratory is using the following on two projects:
- NSL-Unitree: HSM software
- HPSS (High Performance Storage System)
- Some participants considered it too ambitious to study the standardisation of
all archiving activities, as specialists are required in every domain.
- There was no commitment to an ISO activity, but most participants are ready to
participate in another workshop to exchange experiences again in a year or
two.
Towards a Metadata Model (CH)
New problems associated with the use of digital data:
- Disappearance of analogue information
- Rapidly increasing volumes of information
- Incompatibility of the lifetime of the media and the need for long term
preservation
The information must be disassociated from the physical medium:
- Essential difference between analogue and digital information
- The digital information can be represented in an abstract manner
The concepts of collections, digital objects and groups were presented. Groups
consist only of collection(s), collections consist only of digital object(s)
and digital objects relate (or point) to storage object(s).
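The containment rules just described can be sketched in code. This is only an illustrative reading of the minutes, not the model CH presented: the class names follow the terms used above, while the fields (`medium`, `location`, `name`) and the `migrate` and `media_in_use` helpers are assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StorageObject:
    """Where the bits physically live; both fields are illustrative assumptions."""
    medium: str      # e.g. "tape" or "optical disk"
    location: str    # e.g. a volume label


@dataclass
class DigitalObject:
    """Abstract digital information; relates (points) to storage objects."""
    name: str
    storage: List[StorageObject] = field(default_factory=list)


@dataclass
class Collection:
    """A collection consists only of digital objects."""
    name: str
    objects: List[DigitalObject] = field(default_factory=list)


@dataclass
class Group:
    """A group consists only of collections."""
    name: str
    collections: List[Collection] = field(default_factory=list)


def migrate(obj: DigitalObject, new_medium: str, new_location: str) -> None:
    """Hypothetical helper: move an object to new media, keeping its identity."""
    obj.storage = [StorageObject(new_medium, new_location)]


def media_in_use(group: Group) -> List[str]:
    """List the media referenced anywhere below a group, in traversal order."""
    return [s.medium
            for c in group.collections
            for o in c.objects
            for s in o.storage]


scan = DigitalObject("magnetogram-1996-04", [StorageObject("tape", "VOL001")])
group = Group("space-physics", [Collection("magnetometer", [scan])])
migrate(scan, "optical disk", "OD-17")
print(group.collections[0].objects[0].name, media_in_use(group))
```

Because the group and collection refer only to the digital object, replacing its storage objects leaves the logical hierarchy untouched, which is one way to picture information being dissociated from the physical medium.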
The search process:
- the search process is actually a navigation within the metadata graph
- the objects described at the graph nodes are the metadata associated with
the node
- a general, domain-independent classification can perhaps be defined:
- active/passive metadata
- the definition of a new abstract class of objects
A number of object diagrams were presented that modeled the concepts.
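The idea of a search as navigation within a metadata graph can be illustrated with a small sketch. Everything here is an assumption made for the example: the node identifiers, the metadata keys (`kind`, `instrument`, `month`), the edge layout and the `navigate` function are hypothetical and are not taken from the papers presented.

```python
from collections import deque
from typing import Callable, Dict, List

# Hypothetical metadata graph: each node id maps to the metadata describing it.
METADATA: Dict[str, Dict[str, str]] = {
    "group:space-physics": {"kind": "group"},
    "coll:magnetometer": {"kind": "collection", "instrument": "magnetometer"},
    "coll:plasma": {"kind": "collection", "instrument": "plasma"},
    "obj:mag-1996-04": {"kind": "digital_object", "month": "1996-04"},
    "obj:mag-1996-05": {"kind": "digital_object", "month": "1996-05"},
    "obj:plasma-1996-04": {"kind": "digital_object", "month": "1996-04"},
}

# Directed edges: group -> collections -> digital objects.
EDGES: Dict[str, List[str]] = {
    "group:space-physics": ["coll:magnetometer", "coll:plasma"],
    "coll:magnetometer": ["obj:mag-1996-04", "obj:mag-1996-05"],
    "coll:plasma": ["obj:plasma-1996-04"],
}


def navigate(start: str, matches: Callable[[Dict[str, str]], bool]) -> List[str]:
    """Breadth-first walk from start; return ids whose metadata matches."""
    seen = {start}
    queue = deque([start])
    found: List[str] = []
    while queue:
        node = queue.popleft()
        if matches(METADATA[node]):
            found.append(node)
        for child in EDGES.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return found


april = navigate("group:space-physics", lambda m: m.get("month") == "1996-04")
print(april)
```

The search never touches the stored data itself; it only follows edges and inspects the metadata attached to each node, which is the sense in which searching reduces to graph navigation.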
Archive Information Services Reference Model (DS/LR)
DS and LR presented the current version (0.4) of the AISRM.
Catalogue Interoperability Protocol (SS)
SS presented the current version (1.2) of the CIP.
Discussion Points on the AISRM
Each person around the room gave their individual comments on the document:
- CH
- Section 2: Information Representation is needed but at present
is not very clear
- Figure 4.2: Data Administration to Dissemination link needed
- Look more closely at functions in each entity before settling on
all the links
- Section 5: Physical model is not clear; data store;
- Find common terminology on main topics - a priority
- PM
- No comments beyond those of CH
- NP
- Clear definition of the scope is required
- Applicability: who is the intended audience and users of the paper
- Terminology: all terms should be clearly defined
- There is a discontinuity between the Functional Entity and Data Model
- Scenarios: should reflect the disciplines under applicability
- There should be scenarios showing ground systems/archives/processing levels/etc.
- SS
- Purpose and Scope: unclear
- modeling archive?
- defining services?
- both?
- Section 2:
- Early is good but too long and wanders; needs to be more focused
- Information representation material is confusing
- Archive Information Management is confusing - what is the point?
- Description of the archive needs more figures to make it more comprehensible
- Section 5: seems to interrupt the flow from section 3-4-6
- The need for the Data Model (as it is currently presented) is questioned
- Section 6: better but depends on the scope and needs more explanatory text
and introductory material
- CN
- Agreed with NP and SS comments and didn't want to repeat them
- Scope/terminology: needs clarifying
- Move section 2 to later in
the document (dependent on scope, TOC of ### documents should be reused)
- Provide references: not only in terms of a list of references but also so that
we look at and reuse similar work being done in other areas and groups
- Point out the domain specific views as opposed to the current global model
that is presented and the features that may have to change/be added for a
particular domain
- EO
- Terminology: Document vs. Observation Data not clear, perhaps raw
vs. value added would be better
- On-line vs. active vs. long-term archives classification is required
- Access and dissemination is not clear, perhaps they should be together
- Advertising service seems to suggest "broadcasting"
- Figure 4-2: are we excluding a DBMS from storage? Could it support both
storage and data management?
- MM
- Section 3.3: management is very short
- Version 0.4 is much more readable - primary importance for large
customer community
- Section 4.0: needs more text, bullets are rather terse, may require
pictures. Balance with section 6.
- RD
- Scope: felt we should limit our scope to the digital domain, and
make it clear that we preserve digital pointers to analogue objects
which are not reproducible
- Document vs. observation data is very unclear
- Section 2: much more formal presentation of data, metadata, representation,
migration, etc. is required
- Section 4: make this layered such as:
- services
- administration
- data management
- ingest, processing (thinks that this should be in), dissemination, storage
- Section 5: may be too specific, i.e., Z39.50 models, DOs, DIs, etc.
- MLS
- Section 6: maybe a top level context diagram is required at the
environment level
- Dissemination: pull out processing into a separate entity
- Discuss how other activities in data centres may relate to AISRM
- DG
- Top level external services view is required
- Do we need to go to the lower levels? Yes
- Archives talk to each other differently
- control data
- mass transfer of data
- Top down breakdown should provide a more readable/understandable document
- needs to be motivated by specific needs
- If other things come up, e.g., data model and metrics, we need good
justification for putting them in
- WC
- The spectrum of going from documents to observational data should be
clearer and explored further
- We should use the current Panel 2 terminology where appropriate but
not be restricted by it where it is not applicable
- Processing is important and needs
to be explored further even if it doesn't end up being a separate box
- We need to explore the similarities between a ground segment model
and an archive model (perhaps it should be in an annex)
- The context diagrams use a formal methodology (similar to Yourdon),
but the methodology is not described
- AW
- Scope and purpose needs to be fleshed out
- We don't yet know what an archive is. We need a page or two which describes what an archive is and what it
is not, i.e., the fact that it is long or indefinite term
- Uncomfortable with the "reproducible" only data scope - this does not fit for life sciences
- Sections 7 and 8: it looks like 80% is there but sections 7 and 8 need
completing. Feels that the range from Document to Observation Data rings true and
the completion of sections 7 and 8 may help to show this
- They need a reference model for the
archive for the International Space Station
- RC
- Scope: should be digital only, but considering pointers to physical objects
- Support from vendors means we need a broad domain
- Need to recognise the broad domain as it impacts the terminology
- Need to work on different model views
- LR
- Agrees with a lot of what was said on the purpose and scope
- We need to agree on the following in this meeting:
- Purpose and Scope
- Agree on which comments received should be addressed
- Look at DCP, CIP and CH model and come to a common model
- Need to add to section 7 or 8 a classification of archives
- DS
- Agrees with a lot of the comments; in particular, the following must be
in the document:
- We need to support both reproducible and non-reproducible data
- Concerned about breaking out too many boxes in the functional model
- MD
- Scope is a key issue, and the context of the document within
ISO/CCSDS needs to be clear
- Particularly has a problem with section 2. Not sure if it should be in the
document at all
- Thinks that processing should not be included in the model as
it is not manageable. To MD, an archive is a stable, static entity
Additional comments received from elsewhere were discussed:
- Joe King
- definitions up front are too abstract - not friendly to the reader -
suggests adding pointers to examples for clarity
- Section 2: needs examples; map to more familiar terminology
- Asks, in effect, whether the emphasis on understanding representations goes
as far as standards for documentation by which low level data can be turned
into higher level data
- Asks whether the emphasis on archive understanding allows a scenario in which an
external group certifies that high level representations are understandable and
the archive only has to understand lower level representations
- Jim Thieman
- Would like to see scaling and metrics soon - interesting
- Wants to see more examples in Section 2 material
- Section 2.3: on documenting representations used when they are no longer
supported by common systems, he asks "Is documentation sufficient for
understanding?"
- Section 3.0: on environmental view, he asks if people (staff) are in the
archive.
Purpose and Scope of the AISRM
After discussion, the purpose and scope were defined to be:
"The purpose of this document is to define the ISO Reference Model for an
Archival Information System (AIS) which provides a framework for understanding
archival concepts. It defines archival concepts, common terminology,
functions, services, external and internal interfaces. This allows the
architectures and operations of existing and future archival systems to be
described and compared.
The AIS model provides a conceptual and functional framework within which
independent teams of experts may proceed with detailed services and
architecture definition.
The concepts and operations described in this model are intended to apply to
digital information. However non-digital information, e.g. physical samples,
may also be archived within the framework defined by this model."
Table of Contents of the AISRM
After much discussion, the following was agreed as the table of contents for
the next version of the AISRM:
- Introduction
- Terminology
- Other CCSDS boilerplate
- Archive Roles and Concepts
- What is an Archive - 1 page (new)
- Archive, Roles and Concepts (old section 2.1)
- Characteristics (old section 2.4)
- Functional model
- 3.1 Archival Information System External View
- Text and diagram from (old section 3)
- Context diagram (new)
- 3.2 Decomposition of the Archive Information System
- (old section 4) + context diagram per functional entity (old section 6)
- Information Model
- 4.1 Logical Model of an Archival Information (old section 5.1(maybe!))
- Identify the pieces of information managed by an archival information system
- Model the relationships between these identified pieces of information
- 4.2 Information Transformations
- 4.3 Representation of Information (??? title ???)
- old section 2.2 and 2.3 plus consider "Logical Model" paper
- Clearly define the purpose of including this section
- Hypothetical Archival Scenario
- New material to present common archive functions using the functions, services, and data views described
- Archive Classification, Scaling and Metrics
- New material on types of archives
- Metrics of archive performance
- Mission critical - don't lose any of the data
- Mission success - usability from the providers' and end users' points of view
- Annex A: Current system scenarios
- Appendix B: Long/Detailed Representation Example (Figure 2-X)
Discussion on the Definition of an "Archive"
Three different figures were presented to show what an archive was in
the opinions of their authors. These were:
- From CN: [Figure]
- From DS: [Figure]
- From LR: [Figure]
Conclusions
It was agreed that LR should be the official editor of the AISRM. The proposed
time scale for document production is:
| Document | Date | Responsibility |
|---|---|---|
| Version 0.5 of the AISRM | 7 July 1996 | LR |
| Comments on version 0.5 | 15 August 1996, As resources permit | All |
| Version 0.6 of the AISRM | 1 September 1996 | LR |
| Comments on version 0.6 | 1 October 1996 | All |
| Concept paper on Formal Notation Representation Issues | 7 July 1996 | RD |
| Concept paper on "Document" vs. "Observable Data" | 20 August 1996 | DS |
| Concept paper on "Information Representation Model" | 20 August 1996 | DS |
Planning of 1996 Meetings
The next Panel 2 workshop will be hosted by DLR, Munich, Germany on the dates
of 4-5 November 1996 (Monday and Tuesday).
CEOS Liaison Report (WC)
WC presented the activities within the CEOS body. This included the recent
reorganisation of CEOS and its main achievements.
There was extensive discussion on whether the CIP could be brought under the
CCSDS Panel 2 so that it may be formally standardised and eventually made an
ISO standard. Everyone agreed that this was a good idea and should be
investigated further. DS was concerned that the CIP might be perceived as being only
for Earth observation (EO) data, whereas the concepts are in fact generic to many
disciplines, with only the attribute and element sets making it specialised for the EO domain.
DS suggested that an appropriate cover letter could solve this problem. The
panel generally agreed that it would be a good idea to make the CIP a CCSDS
standard, and that this should be done if the reviewers accept it.
DG asked if the ADID could be used to register the attribute sets. SS said
that the ADID could contain the ISO Object Identifier (OID) which is where the
actual attribute set is registered.
It was agreed that CN, as chairman of the CEOS Protocol Task Team, will produce
a cover letter that could accompany the CIP if it were reviewed outside the
panel. This cover letter would be circulated to the panel to see if it put the
CIP in the correct context of Panel 2 and the general space data field.
(CN/960515)
WP700 Status Report
The following status report was prepared for CCSDS Panel 2 prior to this
workshop.
ARCHIVING PROGRESS REPORT
1996-04-29
The active work package in Archiving is WP710.
WP710 Archiving Reference Model
A. Progress
NASA has held two US archiving workshops since the first ISO/CCSDS archiving
session at RAL. Results of the RAL workshop and US workshops have been put on
the Web. CNES held a French archiving workshop and will report on this in
Pasadena. Three sets of comments on version 3 of the Archiving Reference
model concept paper have been received.
The fourth version of the Archiving Reference model concept paper has been
produced and is a major topic for discussion. Two metadata papers were
generated by Claude Huc and they will also be a subject for comment in
Pasadena.
B. Changes
No significant changes in the definition of the activity have been identified.
C. Problems
No particular problems are noted. Other agencies are urged to broaden their
participation if possible.
D. Forecast
The effort appears to be progressing well. It is hoped to get agreement on a
White Book outline and assign an editor (Lou Reich) in Pasadena.
E. Milestone Table
None Scheduled.
_____________________________________________________________________________
Milestone Table for WP700
_____________________________________________________________________________
                                         Management Plan            Projected
                                         Completion                 Completion
WP #   Description                       Date             Status    Date
_____________________________________________________________________________
710.2  Draft Archiving Reference Model   96.03.31         Closed    96.04.24
_____________________________________________________________________________
List of Registered Documents
MATERIALS DISTRIBUTED/REFERENCED
(Item/Author/Distributed By)
- Draft Agenda / Sawyer / Sawyer
- Reference Model for Archival Information Standards, Version 4 / Reich, Sawyer / Reich
ABSTRACT: This is version 4 of this concept paper. Its coherence is
significantly improved from version 3, but still needs much work. There are
many new concepts that have not been reviewed. Sections 6-8 are old material
or TBD and should not be reviewed in detail. There is much new material in
sections 1-5.
- Catalogue Interoperability Protocol (CIP) Specification - Release A, / Smith for CEOS Protocol Task Team / Smith
ABSTRACT: The CIP is a Z39.50 based protocol that is designed for the search,
access and retrieval of Earth observation data, although it is applicable to
many archive retrieval scenarios. The concepts and specification in this
document are considered very relevant to the archiving task, especially in the
area of archive access and retrieval.
Copy available from:
ftp://styx.esrin.esa.it/pub/od/CIP/cip_release_a/cip_release_a_1.2/cip-a12.pdf,.ps,.ps.gz
- Preliminary classification of metadata - Proposal January 1996 / Huc / Huc
ABSTRACT: Critical analysis of existing terminology and concepts followed by a
way of modeling metadata
- Towards a metadata model - April 1996 / Huc / Huc
ABSTRACT: (Claude HUC January 1996) , of the comments made by Don Sawyer on it
and elements in Version 3 of the Reference Model.
- Report on French Archiving Workshop / Mazal / Mazal
Action Items
____________________________________________________________________________________
ACTIONS OPEN FROM PREVIOUS MEETINGS
____________________________________________________________________________________
AI #       WP #  Description                                    Act   Date    Status
____________________________________________________________________________________
P/9510/45  700   Generate scenarios for several types of
                 organisations that satisfy, and do not
                 satisfy, the "archive" definition:
                 Nestor Peccia    Science Data Center           NP    960630  Closed
                 Grant Denkinson  GENIE                         BNSC  960630  Open
                 Lou Reich        DAACS/ECS                     LR    960630  Open
                 Matthew Wild     Cluster                       BNSC  960630  Open
                 Chunky Lepine    NERC ACSOE Archive            BNSC  960630  Open
                 Don Sawyer       NSSDC                         DS    960630  Open
                 Matthew Wild     WDC's                         BNSC  960630  Open
                 Wyn Cudlip       IDN/Global Change Master      BNSC  960630  Open
                                  Directory
                 John Turner      Life Science Data Archive     BNSC  960630  Open
                 David Giaretta   Rutherford Atlas Data Store   DG    960630  Open
                 Yasunori Iwana   NASDA archives TACC/EOC       YI    960630  Closed
                 Claude Huc       CNES archive (Metadata only)  CH    960630  Closed
____________________________________________________________________________________
____________________________________________________________________________________
ACTIONS OPENED AT LAST PLENARY MEETING (or later)
____________________________________________________________________________________
P/9604/28  700   Issue 0.5 of the AISRM                         LR    960707  Open
P/9604/29  700   Provide comments on version 0.5 of
                 the AISRM                                      All   960815  Open
P/9604/30  700   Issue version 0.6 of the AISRM                 LR    960901  Open
P/9604/31  700   Provide comments on version 0.6 of
                 the AISRM                                      All   961001  Open
P/9604/32  700   Issue Concept paper on Formal Notation
                 Representation Issues                          RD    960707  Open
P/9604/33  700   Issue concept paper on "Document" vs.
                 "Observable Data"                              DS    960820  Open
P/9604/34  700   Issue concept paper on "Information
                 Representation Model"                          DS    960820  Open
P/9604/35  700   Produce cover letter to accompany CIP spec
                 with panel/external review                     CN    960515  Open
URL: http://ssdoo.gsfc.nasa.gov/nost/isoas/int02/minutes.html
A service of NOST at NSSDC.
Editor: Steve Smith (stsmith@esoc.esa.de) +49.6151.902.816
Curator: John Garrett (garrett@ncf.gsfc.nasa.gov) +1.301.441.4169
Responsible Official: Code 633.2 / Don Sawyer (sawyer@ncf.gsfc.nasa.gov) +1.301.286.2748
Last Revised: 5 May 1996, Steve Smith (14 April 1998, John Garrett)