NSSDC's Mass Storage System Evolves

Volume 11, Number 1, March 1995
By Jeanne Behnke and Joe King
For a little over three years, NSSDC has been loading newly arriving data and selected data from its archives onto its nearline NSSDC Data Archive and Distribution System (NDADS). NDADS consists of two Cygnet 12-inch WORM optical disk jukeboxes with a combined capacity of 1.2 terabytes (TB), hosted by a network-accessible VAX/VMS computer cluster. NDADS has made NSSDC data far more accessible and has greatly increased their use.

Over the past several months, it has become clear that NSSDC will soon have to expand its mass storage capacity to meet the research community's growing expectation of network access to increasing data volumes. Projected inflow rates for astrophysics and space physics data alone are in the range of 1-2 TB/year for the next few years.

To meet this growing need, NSSDC has just procured a Digital Linear Tape (DLT) jukebox with a total capacity of 2.5 TB (5 TB of user data with hardware compression activated). The jukebox is hosted by a Silicon Graphics (SGI) Indigo2 XL computer running IRIX, SGI's version of UNIX. NSSDC is now bringing the system to an operational state and is evaluating file management software to operate the DLT; the principal system being evaluated is a UNITREE product. NSSDC also has a custom information system, initially developed on a Digital Equipment Corporation (DEC) VMS-based platform, that tracks the location of files in its mass storage systems. The DLT jukebox will be integrated into the NSSDC Storage System (NSS) software over the next few weeks. NSSDC also plans to migrate the current VMS-based application to the SGI IRIX environment so that mass storage devices procured in the future can be integrated more easily.
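The capacity and inflow figures quoted above can be combined into a rough estimate of how long the new jukebox would last. The Python sketch below is only a back-of-the-envelope illustration: it assumes that all projected inflow lands on the DLT jukebox, that the inflow rate is constant, and that the 5 TB compressed figure is actually achieved. None of these assumptions comes from NSSDC, and the function and constant names are hypothetical.

```python
def years_until_full(capacity_tb: float, inflow_tb_per_year: float) -> float:
    """Years until a store of the given capacity fills at a constant inflow rate."""
    if inflow_tb_per_year <= 0:
        raise ValueError("inflow rate must be positive")
    return capacity_tb / inflow_tb_per_year


# Figures quoted in the article.
UNCOMPRESSED_TB = 2.5      # DLT jukebox capacity without compression
COMPRESSED_TB = 5.0        # user data with hardware compression activated
INFLOW_RATES = (1.0, 2.0)  # projected inflow range, TB/year


def headroom_table():
    """Return (label, inflow rate, years) for each capacity/rate combination."""
    rows = []
    for label, capacity in (("uncompressed", UNCOMPRESSED_TB),
                            ("compressed", COMPRESSED_TB)):
        for rate in INFLOW_RATES:
            rows.append((label, rate, years_until_full(capacity, rate)))
    return rows


if __name__ == "__main__":
    for label, rate, years in headroom_table():
        print(f"{label:>12} capacity at {rate:.0f} TB/year: {years:.2f} years")
```

Under these assumptions the jukebox would fill in roughly 1.25 to 5 years, depending on the inflow rate and on how well the data compress.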

For an extended period, NSSDC will operate a heterogeneous mass storage system involving both VMS and UNIX/IRIX hosts and both 12-inch WORM optical disks and DLT tape. This heterogeneous system as a whole will be called NDADS, the name to be applied to NSSDC's entire nearline mass storage environment. For most data access paths provided by NSSDC, users will not need to know whether the data they are seeking reside on 12-inch WORM disks or on DLT tape.
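One common way to give users this kind of location transparency is to keep a catalog that maps each file to the device holding it and to route every request through that catalog. The Python sketch below illustrates the general idea only; it is not the NSS design, and every name in it (FileLocation, Catalog, fetch, the device labels, and the sample file identifiers) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileLocation:
    device: str  # hypothetical device label, e.g. "worm" or "dlt"
    volume: str  # hypothetical platter or cartridge label
    path: str    # path of the file on that volume


class Catalog:
    """Maps file identifiers to the storage location that holds them."""

    def __init__(self):
        self._locations = {}

    def register(self, file_id, location):
        self._locations[file_id] = location

    def locate(self, file_id):
        try:
            return self._locations[file_id]
        except KeyError:
            raise KeyError(f"no location recorded for {file_id!r}") from None


def fetch(catalog, file_id, readers):
    """Retrieve a file by identifier; the caller never names the device."""
    location = catalog.locate(file_id)
    return readers[location.device](location)


# Stub readers standing in for real device drivers (illustrative only).
readers = {
    "worm": lambda loc: f"read {loc.path} from WORM platter {loc.volume}",
    "dlt": lambda loc: f"read {loc.path} from DLT cartridge {loc.volume}",
}

catalog = Catalog()
catalog.register("dataset-a/file1", FileLocation("worm", "P017", "/a/file1"))
catalog.register("dataset-b/file9", FileLocation("dlt", "T003", "/b/file9"))
```

A request such as fetch(catalog, "dataset-b/file9", readers) is written the same way whichever medium holds the file, and a new device type only needs one more entry in the readers table.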

At the same time that it procured the DLT jukebox, NSSDC also procured a 200-GB rewritable magneto-optical (MO) disk jukebox. This Hewlett-Packard 200T jukebox holds 144 5.25-inch MO cartridges and is hosted on a DEC Alpha 3000/600 workstation running OSF/1. It is managed by file management software from Tracer Technologies. The system will be used to facilitate the ingestion of large data sets into NDADS when the data must be reorganized, reformatted, or otherwise processed as part of ingestion. Following this processing, the data will be ingested for long-term storage on either the DLT jukebox or the WORM jukeboxes. NSSDC is also considering ways to partition the MO jukebox so that some of its capacity can provide faster network access, through NDADS, to modest-sized, high-interest data sets than is possible from either of the NDADS components discussed above.

Although much of its time has been spent bringing the new systems online and integrating them into the computing environment, NSSDC has also been considering other aspects of nearline data storage, including backup techniques, distribution of large data volumes, and improvements to its ingest systems.

