Science Archives in the 21st Century
Topic: Meeting User Needs
We describe some of the challenges associated with providing access to diverse planetary data sets, and present a solution now in development at the Rings Node of the Planetary Data System.
The Planetary Data System (PDS) archives data from numerous spacecraft, telescopes, and instruments. We handle many types of data: images, spectra, cubes, time series, tables, etc. We support diverse users, who might wish to access the same data products for entirely different purposes. The PDS was designed as a distributed archive so that each user community has a primary point of contact, enabling users to access their discipline's key data sets via the most relevant selection criteria.
Each data set is delivered to the PDS with its own unique collection of descriptive information or "metadata". Some concepts are common to all data products (e.g., observation time), some are common to specific data types (e.g., filter names for images), and some are unique to one instrument. Additional types of metadata (e.g., viewing and lighting geometry) are different for each discipline.
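One way to picture these layers of metadata is as a single record with a small set of common fields plus optional, more specific groups. The sketch below is illustrative only: the class name, field names, and sample values are assumptions made for this example and do not reflect the actual PDS or Rings Node schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Fields assumed to exist for every product (hypothetical names).
COMMON_FIELDS = ("product_id", "observation_time", "data_type")


@dataclass
class ProductMetadata:
    # Common to all data products.
    product_id: str
    observation_time: str
    data_type: str
    # Common to one data type, e.g. filter names for images.
    type_specific: Dict[str, Any] = field(default_factory=dict)
    # Unique to one instrument.
    instrument_specific: Dict[str, Any] = field(default_factory=dict)
    # Discipline-dependent values such as viewing and lighting geometry.
    geometry: Dict[str, Any] = field(default_factory=dict)

    def lookup(self, key: str) -> Any:
        """Return the value for key, checking common fields first, then each layer."""
        if key in COMMON_FIELDS:
            return getattr(self, key)
        for layer in (self.instrument_specific, self.type_specific, self.geometry):
            if key in layer:
                return layer[key]
        return None  # The concept is not defined for this product.


# Invented sample record for illustration.
example = ProductMetadata(
    product_id="EXAMPLE-0001",
    observation_time="2005-01-01T00:00:00",
    data_type="image",
    type_specific={"filter_name": "CL1"},
    instrument_specific={"gain_mode": "high"},
    geometry={"phase_angle_deg": 12.0},
)
```

A search interface built on such records can treat a missing key as "not applicable" rather than as an error, which is what lets one query span data sets with different metadata.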
Users weaned on modern Internet search engines have high (often unrealistic) expectations of what NASA's archives can deliver. Nevertheless, as NASA's data sets grow, users legitimately face the "needle-in-a-haystack" problem of zeroing in on the finite set of data products most relevant to a particular scientific question. They expect to be able to query the archive using a rich set of reliable metadata, without necessarily knowing in advance which data set(s) might hold the answers, or even whether an answer can be found within the archive's holdings.
The Rings Node is developing a web-based search engine that appears to be capable of meeting many of these challenges. The user can start a query at a very high level (e.g., images of Saturn) and "drill down" through all the available holdings to reach the desired data products (e.g., infrared images of Saturn's F ring at low phase angle and resolution finer than 5 km per pixel). The interface responds immediately to each new constraint entered by the user, removing irrelevant options and adding new ones as appropriate. For example, if it is determined that only Cassini images fit the constraints entered so far, then a page of Cassini-related constraints becomes available but options for occultation profiles and Hubble data are hidden. At each step, the engine displays a live tally of how many available products match the user's constraints. Using Ajax technology, caching of previous results, and a highly optimized database, the system is quick and responsive in spite of the millions of database records that must be searched.
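The drill-down behavior described above can be sketched as a small faceted search with cached results. This is a toy illustration under stated assumptions, not the Rings Node implementation: the catalog is an invented in-memory tuple rather than a database, and the names `PRODUCTS`, `search`, and `facet_options` are hypothetical.

```python
import operator
from functools import lru_cache

# Comparison operators a constraint may use.
OPS = {"==": operator.eq, "<": operator.lt, ">": operator.gt}

# Invented sample catalog; a real archive would hold millions of database rows.
PRODUCTS = (
    {"id": "C1", "planet": "Saturn", "mission": "Cassini", "kind": "image",
     "phase_deg": 12.0, "res_km": 3.2},
    {"id": "C2", "planet": "Saturn", "mission": "Cassini", "kind": "image",
     "phase_deg": 95.0, "res_km": 8.0},
    {"id": "H1", "planet": "Saturn", "mission": "Hubble", "kind": "image",
     "phase_deg": 4.0, "res_km": 300.0},
    {"id": "V1", "planet": "Saturn", "mission": "Voyager", "kind": "occultation",
     "phase_deg": None, "res_km": 1.0},
    {"id": "H2", "planet": "Uranus", "mission": "Hubble", "kind": "image",
     "phase_deg": 2.0, "res_km": 900.0},
)


@lru_cache(maxsize=1024)
def search(constraints):
    """Return the products satisfying every (field, op, value) triple.

    constraints is a frozenset, so it is hashable and order-independent,
    which lets lru_cache reuse the result of any previously seen query.
    """
    hits = []
    for product in PRODUCTS:
        ok = True
        for field, op, value in constraints:
            actual = product.get(field)
            # A product lacking the field cannot satisfy a constraint on it.
            if actual is None or not OPS[op](actual, value):
                ok = False
                break
        if ok:
            hits.append(product)
    return tuple(hits)


def facet_options(constraints, field):
    """List the values of field that still occur among the matching products."""
    return sorted({p[field] for p in search(constraints) if p[field] is not None})


# Start broad, then drill down by adding one constraint at a time.
broad = frozenset({("planet", "==", "Saturn"), ("kind", "==", "image")})
narrow = broad | {("res_km", "<", 5.0)}

tally_broad = len(search(broad))            # live count for the broad query
missions_narrow = facet_options(narrow, "mission")  # options left to show
```

In this sketch, once `facet_options` reports a single remaining mission, the interface could reveal that mission's specific constraints and hide the rest; the length of the `search` result serves as the live tally.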
However, we are only beginning to address a critical component of this system: generating the geometric parameters that are a fundamental part of most searches. Unfortunately, we have found that data providers rarely include this critical information in their submission information packages. We have therefore identified the need to generate all such metadata in-house, via SPICE tools now in development.
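As an illustration of the kind of geometric parameter involved, the phase angle is the angle at the target between the direction to the Sun and the direction to the observer. The minimal sketch below computes it from two position vectors. It assumes the vectors have already been obtained, for example from ephemeris data; the function name is hypothetical and no SPICE calls are shown.

```python
import math


def phase_angle_deg(target_to_sun, target_to_observer):
    """Angle in degrees between the target-to-Sun and target-to-observer vectors."""
    dot = sum(a * b for a, b in zip(target_to_sun, target_to_observer))
    norm_sun = math.sqrt(sum(a * a for a in target_to_sun))
    norm_obs = math.sqrt(sum(b * b for b in target_to_observer))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm_sun * norm_obs)))
    return math.degrees(math.acos(cos_angle))


# Made-up vectors: Sun and observer in perpendicular directions from the target.
quarter_phase = phase_angle_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Storing values like this alongside each product is what would allow a constraint such as "low phase angle" to be evaluated as an ordinary numeric comparison at search time.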