By Don Sawyer and Rick Burley
The IMAGE project elected to use NSSDC's archival storage, preservation, and distribution facilities to support the science community's need for long-term access to IMAGE's science data products. The "Level-0.5" products are mostly organized as sets of daily, instrument-specific Universal Data Format (UDF) files bound together as daily tar files. A higher-level product in Common Data Format is also provided to NSSDC for CDAWeb access.
While IMAGE was preparing its ground processing system, NSSDC was in the process of upgrading its systems to work with Archival Information Packages (AIPs). This presented an opportunity to test the hypothesis that projects could, if given proper tools, significantly improve the ability of archives to ingest and manage project data products for long-term preservation and access, without adding a significant burden to the projects. As the IMAGE project was already planning to run a script that would push files into an NSSDC staging area on a daily basis, it was decided to provide the project with a set of software that would convert the "Level-0.5" data files into NSSDC-conforming AIPs. This required an upgrade to the Data Migrator Utility (DMU) software, with the upgraded version called the Package Generator Utility (PGU), and a port to DEC Alpha.
It also required that a number of attributes, previously generated in the NSSDC environment, be generated in the IMAGE Science and Mission Operations Center (SMOC) environment. This was accomplished by including a number of configuration files that set key values from which the NSSDC attribute object and packaging information could be derived. Since each AIP has a unique Archival Storage Identifier (ASID), under the control of NSSDC, it was necessary to develop a scheme to assign these identifiers remotely. This was accomplished by building each identifier from a project-specific prefix, set in a configuration file, followed by a sequence number maintained in another file. In this way unique ASIDs are generated automatically whenever the PGU software is executed. Key data flows are shown in Figure 1.
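The prefix-plus-sequence-number scheme can be illustrated with a short Python sketch. This is not the PGU code: the function name next_asid, the file layout (one value per file), and the zero-padded six-digit width are all assumptions made for illustration, since the article does not describe the actual ASID format.

```python
from pathlib import Path


def next_asid(prefix_file: Path, sequence_file: Path, width: int = 6) -> str:
    """Return the next identifier and persist the incremented sequence number.

    Hypothetical layout: prefix_file holds the project-specific prefix,
    sequence_file holds the last sequence number issued.
    """
    prefix = prefix_file.read_text().strip()
    # A missing sequence file is treated as "no identifiers issued yet".
    last = int(sequence_file.read_text().strip()) if sequence_file.exists() else 0
    current = last + 1
    # Write to a temporary file and rename it over the old one, so an
    # interrupted run cannot leave a half-written counter behind.
    tmp = sequence_file.with_name(sequence_file.name + ".tmp")
    tmp.write_text(f"{current}\n")
    tmp.replace(sequence_file)
    # The zero-padded width is an assumption, not the NSSDC format.
    return f"{prefix}{current:0{width}d}"
```

Because the counter lives in a file on the project side, identifiers stay unique across runs without any round trip to NSSDC, provided only one run updates the file at a time.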
The IMAGE SMOC runs a master script that generates science data products, copies certain products into an input staging area (nssdcin) on an Alpha UNIX system, and then invokes the Package Generator Utility (PGU). Input to the PGU is a pointer to this staging area, plus the same inputs as with the DMU. The valid source filenames are used to generate an input listfile, which is in turn used to write the resulting AIPs into a target staging area (nssdcout) on the Alpha. Then, if all source files have been processed, the PGU removes them from the input staging area, leaving the AIPs and accompanying log files in the target staging area.
In more detail, the PGU accomplishes this with five processes named packgen, makelist, makepack, fileget, and cleanup. The main program, packgen, is started via Telnet or a shell script and manages the entire AIP creation process. It first starts the makelist process, which generates a listfile containing the information needed to continue AIP creation. The makepack process then collects the information necessary to generate AIPs; in turn, makepack starts the fileget process, which accesses UNIX file and record information. Finally, the cleanup process reviews the resulting AIPs and status information to determine whether all AIPs were created and, if so, removes the source files from nssdcin.
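The control flow described above (list the valid sources, package each one, and remove the sources only if everything succeeded) can be sketched in Python. The function names echo the PGU process names but are hypothetical stand-ins; the ".tar" filename filter, the ".aip" suffix, and the two-line attribute file are assumptions, and a plain tar file is used in place of the real NSSDC AIP format, which the article does not specify.

```python
import io
import tarfile
from pathlib import Path


def make_list(staging_in: Path, suffix: str = ".tar") -> list:
    """Return the valid source files in a reproducible order."""
    return sorted(p for p in staging_in.iterdir()
                  if p.is_file() and p.suffix == suffix)


def make_pack(source: Path, staging_out: Path) -> Path:
    """Bundle one source file with a small attribute file (stand-in for an AIP)."""
    package = staging_out / (source.stem + ".aip")
    attrs = (f"source_name={source.name}\n"
             f"size_bytes={source.stat().st_size}\n").encode()
    with tarfile.open(package, "w") as tar:
        info = tarfile.TarInfo("attributes.txt")
        info.size = len(attrs)
        tar.addfile(info, io.BytesIO(attrs))
        tar.add(source, arcname=source.name)
    return package


def run_pgu(staging_in: Path, staging_out: Path) -> bool:
    """Package every listed source; delete the sources only if all succeeded."""
    sources = make_list(staging_in)
    created = {}
    for src in sources:
        try:
            created[src] = make_pack(src, staging_out)
        except OSError as exc:
            print(f"failed to package {src.name}: {exc}")
    # Cleanup step: leave every source in place unless all were packaged.
    all_done = len(created) == len(sources)
    if all_done:
        for src in sources:
            src.unlink()
    return all_done
```

Deferring deletion until every package exists means a failed run leaves the input staging area intact, so the calling script can tell success from failure by checking that directory.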
When the IMAGE script detects that the input source file directory is empty, it proceeds by creating a manifest of all the files in the target staging area. It then pushes the AIPs, output log files, diagnostic file, and manifest file to an NSSDC input staging area.
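The "empty input directory, then manifest" step can be sketched as follows. The article does not give the manifest layout, so the tab-separated columns (name, size in bytes, SHA-256 digest), the default file name MANIFEST.txt, and the function name write_manifest are all assumptions for illustration.

```python
import hashlib
from pathlib import Path


def write_manifest(staging_in: Path, staging_out: Path,
                   manifest_name: str = "MANIFEST.txt"):
    """Write a manifest of staging_out if staging_in is empty; else return None."""
    # Any file still present in the input area means packaging did not finish.
    if any(staging_in.iterdir()):
        return None
    lines = []
    for path in sorted(staging_out.iterdir()):
        # Skip subdirectories and any manifest left over from an earlier run.
        if path.is_file() and path.name != manifest_name:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            lines.append(f"{path.name}\t{path.stat().st_size}\t{digest}")
    manifest = staging_out / manifest_name
    manifest.write_text("\n".join(lines) + "\n")
    return manifest
```

Recording sizes and digests goes beyond a bare file list, but it would let the receiving archive confirm that each pushed file arrived complete.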
Periodically, NSSDC checks its input staging area and moves the newly arrived files to a processing area. Currently, NSSDC operations staff run the NOST-provided Package Splitter Utility to put the attribute objects and tar files into appropriate directories. When the DIOnAS system upgrade is completed, as shown in Figure 1, it will run the NOST-developed Extractor Utility to inventory the AIPs (which might be a subset of those actually created in the IMAGE run) and to generate a table of values used by the DIOnAS database. The AIPs will be stored on DLTs and split automatically.
Users can access the NSSDC anonymous FTP site to obtain the UDF tar files and associated attribute objects as required. In the future, they will be able to access instrument-specific UDFs with software provided by C. Gurgiolo of the IMAGE team, which will run on our lewes machine.