Digital Archive Directions (DADs) Workshop

(A part of the ISO Archiving Workshop Series)
 
 
  

     

Position Paper


Digital Archive Directions (DADs) Workshop

DATE: June 22-26, 1998

HOST: The National Archives and Records Administration
Archives II
8601 Adelphi Road
College Park, MD 20740-6001

 


 

1. Identification of Proposed Topic [Required]

We have a new project in the UK funded by the Joint Information Systems Committee of the Higher Education Funding Council. The project is funded for 3 years for a total of approximately $500,000, with a further $150,000 contributed by the participating institutions.

1.1 Title

CEDARS: A multi-site UK project to create exemplars in Digital Archiving

1.2 Contributor(s)

The Consortium of University Research Libraries

10 Portugal Street,
London WC2A 2HD, UK
Tel: +44-171-955 6314
The project manager is:
Kelly Russell <K.L.Russell@leeds.ac.uk>
Edward Boyle Library
Leeds University
LEEDS, LS2 9JT, UK
phone: +44-113-233-6386

The project director is:
Clare Jenkins <c.jenkins@lse.ac.uk>
London School of Economics
Houghton Street, London WC2A 2AE, UK
+44-171-955-6314
 
The author of this paper and POINT OF CONTACT is:

David Holdsworth <D.Holdsworth@leeds.ac.uk>
Computing Service
Leeds University
LEEDS, LS2 9JT, UK
phone: +44-113-233-5402

1.3 Description of Proposed Project

The project's start date was 1st April 1998. The three lead sites each have digital archive systems installed and running. The project is very much a collaboration between the user community (i.e. the world of libraries) and the providers (i.e. the computer centres).

The project aims to generate demonstrator projects at each of the three sites dealing with different aspects of digital preservation. However, the expectation is that this will have an element of integration so that the abstract view is of a single archive in which information happens to be stored in different locations. This will include scope for duplication of material at more than one site.
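
As a rough illustration of this abstract view, the following Python sketch presents several sites as one logical archive and keeps copies of an item at more than one site. The names Site and FederatedArchive, the use of a SHA-256 digest as the item identifier, and the in-memory dictionaries are all assumptions made for illustration; none of them is part of the CEDARS design.

```python
import hashlib


class Site:
    """One physical store (a hypothetical stand-in for a site's archive system)."""

    def __init__(self, name):
        self.name = name
        self._objects = {}  # object_id -> bytes

    def put(self, object_id, data):
        self._objects[object_id] = data

    def get(self, object_id):
        return self._objects.get(object_id)


class FederatedArchive:
    """Presents several sites as a single logical archive."""

    def __init__(self, sites):
        self.sites = {site.name: site for site in sites}
        self.locations = {}  # object_id -> names of the sites holding a copy

    def ingest(self, data, site_names):
        # Assumption: the identifier is the SHA-256 digest of the content.
        object_id = hashlib.sha256(data).hexdigest()
        for name in site_names:
            self.sites[name].put(object_id, data)
        self.locations[object_id] = list(site_names)
        return object_id

    def retrieve(self, object_id):
        # Try each recorded location in turn; accept the first intact copy.
        for name in self.locations.get(object_id, []):
            data = self.sites[name].get(object_id)
            if data is not None and hashlib.sha256(data).hexdigest() == object_id:
                return data
        raise KeyError(object_id)
```

A caller would ingest an item with, for example, archive.ingest(data, ["Leeds", "Oxford"]) and later call archive.retrieve(object_id) without needing to know which site supplies the copy.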

At this stage we are very much a melting pot of ideas. An early part of the work will involve generation of an abstract model. It is here that we desire maximum alignment with emerging standards (see 1.6 below). In broad terms the work divides:

  • collections management focussed on Cambridge,
  • meta-data focussed on Oxford, and
  • storage issues focussed at Leeds.

We expect to have an initial version of our model to present at the workshop, and also to be able to give an overview of the project.

    The Providers

    The universities of Cambridge, Leeds and Oxford run different hierarchical storage management (HSM) systems; see section 2.2 below.

    The User Community

    The Consortium of University Research Libraries (CURL) includes all the major research libraries in UK academia. In addition to the project's lead sites it includes the British Library and the Arts and Humanities Data Service (http://www.ahds.ac.uk/). All CURL libraries are members of RLG.

    1.4 Justification

    The project arises out of a recognition by the UK's academic funding bodies and legal-deposit libraries that the indefinite storage of digital material is very haphazard in the UK, and falls far short of the standards that have been normal practice for conventional material for decades. The desire for secure storage over an indefinitely long time poses problems of shifting formats and shifting storage media technology.

    1.5 Definitions of Concepts and Special Terms

    Legal Deposit Libraries in the UK have a requirement to store all printed material published in the United Kingdom, and the publishers have a legal obligation to provide the libraries with copies. There are 3 such libraries - the British Library and the libraries of the universities of Oxford and Cambridge.

    The Joint Information Systems Committee is a part of the UK's university funding mechanism which has a special responsibility for the IT infrastructure of UK academia.

    1.6 Expected Relationship with OAIS Reference Model

    It is a condition of the project's sponsor that maximum notice is taken of activity elsewhere on a global scale. The OAIS model is seen as embodying concepts necessary to our work.

    The CEDARS project expects to produce demonstrators which largely give a vertical slice through the OAIS model, demonstrating the process through Ingest, Storage and Dissemination. It is likely that some aspects, such as collection management and access control, will only be covered by recommendations. The participants already have experience of long-term data storage, as exemplified by my paper to the 1996 Goddard MSS conference [Holdsworth 1996, http://esdis.gsfc.nasa.gov/msst/conf1996/A6_07Holdsworth.html].

    It is intended that the outcome can be summarised as a strategy for long-term digital storage for the UK, including significant demonstrators which may be expandable to full-scale operation.

    We expect that we will be addressing the following elements of your list:

  • the interfaces between OAIS type archives
  • the submission (ingest) of digital data sources to the archive
  • the delivery of digital sources from the archive
  • the submission of digital metadata, i.e., data about digital or physical data sources, to the archive
  • the identification of digital data within the archive
  • protocol standard(s) to search and retrieve archive metadata information
  • the migration of information across media and representations
  • recommended archival practices and accreditation of archives

    2. Scope of Proposed Standard [Desired]

    2.1 Recommended Scope of Standard

    The full project proposal is available on the project Web site at:
    http://www.curl.ac.uk/cedarsinfo.shtml

    2.2 Existing Practice in Area of Proposed Standard

    The Leeds University site has been using some form of hierarchical storage management since 1968. The current practice involves an in-house system, the LEEDS Archive (Low-cost Everlasting Extensible Data Store); see:
    http://www.leeds.ac.uk/ucs/systems/archive/

    This system and the developments which led to it are described in [Holdsworth 1996]. The design explicitly includes migration to new media types, and is based on version 4 of the IEEE-CS MSS model (it predates version 5).

    The Cambridge system has an in-house access system via FTP, and uses the EPOCH HSM as the data store.

    The Oxford system is IBM's ADSM.

    The Arts and Humanities Data Service already has experience in dealing with digital data in the humanities, and brings a wealth of expertise to the project.

    2.3 Expected Stability of Proposed Standard with Respect to Current and Potential Technological Advances

    The system at Leeds University was developed in-house (starting in early 1992) because we were not confident in the long-term capabilities of the commercial products available at the time. We believe that the situation today is little different, and we have 5 (and maybe soon 6) customers for our system, which may indicate that others agree with us. The system's architecture envisages the smooth integration of new-technology storage hardware, and the graceful decommissioning of older equipment, all as part of normal operations. We have a rule of thumb that we look for an improvement in capacity of at least a factor of 10 when adopting new storage media.
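
The rule of thumb and the decommissioning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MediaPool, worth_adopting and migrate are hypothetical names, the capacities are made-up figures, and the in-memory dictionary stands in for real storage. It does not describe how the Leeds system is actually implemented.

```python
from dataclasses import dataclass, field

MIN_CAPACITY_GAIN = 10  # rule of thumb: at least a factor of 10


@dataclass
class MediaPool:
    """A pool of one media type (hypothetical model, not the Leeds design)."""
    name: str
    capacity_per_unit_gb: float
    objects: dict = field(default_factory=dict)  # object_id -> bytes


def worth_adopting(current, candidate, min_gain=MIN_CAPACITY_GAIN):
    """Apply the factor-of-10 rule to a candidate media type."""
    return candidate.capacity_per_unit_gb >= min_gain * current.capacity_per_unit_gb


def migrate(old, new):
    """Copy every object to the new pool, verify it, then release the old copy.

    Returns the number of objects moved. The old pool ends up empty and can
    be decommissioned.
    """
    moved = 0
    for object_id in list(old.objects):
        data = old.objects[object_id]
        new.objects[object_id] = data
        # A real system would re-read the new medium here; this in-memory
        # comparison only marks where that verification belongs.
        if new.objects[object_id] != data:
            raise IOError(f"verification failed for {object_id}")
        del old.objects[object_id]
        moved += 1
    return moved
```

The old copy is deleted only after the new copy has been checked, so an interrupted migration leaves every object readable from at least one pool.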


    URL: http://ssdoo.gsfc.nasa.gov/nost/isoas/dads/DADSbase.html

    Author: David Holdsworth (D.Holdsworth@leeds.ac.uk) +44-113-233-5402
    Curator: John Garrett (John.Garrett@gsfc.nasa.gov) +1.301.286.3575
    Responsible Official: Code 633.2 / Don Sawyer (Donald.Sawyer@gsfc.nasa.gov) +1.301.286.2748
    Last Revised: May , 1998, David Holdsworth (May 26, 1998, John Garrett)