NCSU Libraries Focus Online
Volume 22 number 2 - Winter 2002
New Approaches to Cross-Collection Searching
By Carolyn D. Argentati, Public Services, and Steven P. Morris, Research
and Information Services
The NCSU Libraries is acquiring and producing more and more materials in electronic
format, resulting in the evolution of a rich "digital library" collection.
Elements of the digital library include electronic books and journals, image
and multimedia databases, spatial and numeric data for use with Geographic
Information System (GIS) software, governmental and other World Wide Web sites,
numerous journal and newspaper indexes, and library catalogs. Some of these
digital resources are owned or licensed by the Libraries for its users, while
others reside on remote servers.
At the same time, most library and Internet search tools have not evolved
as rapidly as the information itself has expanded and diversified. Existing
navigation tools frequently require users to access each individual type of
collection separately. This could involve going to five, ten, or more different
Web sites or databases, requiring a significant amount of time and effort to
do a comprehensive search.
Knowledge management systems have begun to provide more powerful forms of
integrated access to the growing digital library universe. They serve as bridges
between a user's information needs and the many kinds of materials in a large
research collection. With one query, knowledge management software simultaneously
searches multiple information resources and formats (for example, books, maps,
and images) and provides the user with one consolidated list of results, a
list that can even be customized according to individual preferences. This
technique is often referred to as cross-collection searching.
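As a rough illustration only, the idea of one query producing one consolidated list can be sketched in a few lines of Python. The sketch assumes three tiny in-memory collections and a simple substring match. The collection names, sample titles, and the function cross_collection_search are hypothetical and do not represent MetaStar or any particular product.

```python
# Hypothetical in-memory collections; a real system would query
# separate databases, possibly on remote servers.
COLLECTIONS = {
    "books": ["Soil Science Basics", "Coastal Geology of North Carolina"],
    "maps": ["Wake County Soil Survey Map", "Coastal Plain Topographic Map"],
    "images": ["Aerial Photo of Coastal Dunes"],
}


def cross_collection_search(query, collections=COLLECTIONS, preferred_order=None):
    """Run one query against every collection and return one merged list.

    Each result is a (collection_name, title) pair. preferred_order is an
    optional list of collection names that stands in for a user's
    customization of the result list; by default collections are
    searched in alphabetical order.
    """
    order = preferred_order or sorted(collections)
    needle = query.lower()
    results = []
    for name in order:
        for title in collections.get(name, []):
            if needle in title.lower():
                results.append((name, title))
    return results


if __name__ == "__main__":
    hits = cross_collection_search("coastal", preferred_order=["maps", "images", "books"])
    for name, title in hits:
        print(f"{name}: {title}")
```

The user issues a single query and receives book, map, and image results together, instead of visiting each collection in turn.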
The NCSU Libraries has joined other North Carolina agencies in implementing
a new generation of technologies for integrated knowledge management. Like
those agencies, the Libraries has selected the MetaStar product suite from Blue
Angel Technologies. The project will involve preparing descriptive data and
indexing schemes for the collections; capturing, organizing, and mapping data
across multiple formats; and developing search entry portals and protocols
for presenting search results. The knowledge management system will both complement
and incorporate the traditional library catalog.
Accommodating Different Methods of Resource Description
One advantage of knowledge management systems over traditional library catalogs
is their ability to accommodate different methods and levels of description.
Information resources often have different description needs. For example,
GIS data users require descriptive data to find the answers to such questions
as: What source materials were data derived from? What processing steps were
undertaken in preparing the data? What is the spatial extent of the data? What
format is the data in? This descriptive data, or metadata, may be collected,
organized, and stored within the knowledge management system according to a
standard created by the Federal Geographic Data Committee specifically for
geospatial data.
Other types of digital information may have different descriptive requirements.
For example, the social sciences data community makes use of the Data Documentation
Initiative standard for documenting numeric data resources used in the social
sciences. The Data Documentation Initiative standard incorporates content not
found in other metadata standards, including, in the case of survey data, detailed
information about the survey questions from which the data originates. Other
metadata standards used by the Libraries include Dublin Core, for general description
of a broad range of information resources, and Encoded Archival Description,
a standard for encoding archival finding aids.
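One common way to search across records described under different standards is a "crosswalk" that maps each standard's fields onto a small shared set. The Python sketch below is a minimal illustration under stated assumptions: the field labels are simplified, flattened stand-ins and not the actual element names of the FGDC, Data Documentation Initiative, or Dublin Core standards, and the function to_common_record is hypothetical.

```python
# Illustrative crosswalk from simplified, flattened field labels to a
# small set of shared search fields. The labels are assumptions made
# for this sketch, not the standards' real element names or structure.
CROSSWALK = {
    "fgdc": {"title": "title", "originator": "creator", "bounding_box": "coverage"},
    "ddi": {"study_title": "title", "principal_investigator": "creator"},
    "dublin_core": {"title": "title", "creator": "creator", "coverage": "coverage"},
}


def to_common_record(schema, record):
    """Map a schema-specific record onto the shared fields.

    Fields with no shared equivalent are kept under "extras" so that
    schema-specific detail (for example, survey question text) is not lost.
    """
    mapping = CROSSWALK[schema]
    common = {"schema": schema}
    extras = {}
    for field, value in record.items():
        if field in mapping:
            common[mapping[field]] = value
        else:
            extras[field] = value
    common["extras"] = extras
    return common


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    survey = {
        "study_title": "Sample Survey",
        "principal_investigator": "A. Researcher",
        "question_text": "Q1",
    }
    print(to_common_record("ddi", survey))
```

Keeping the unmapped fields alongside the shared ones reflects the point made above: a common search layer does not have to discard the richer description that a specialized standard provides.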
Creating Different "Information Spaces"
Maintaining separate databases for different types of information resources
also facilitates the creation of distinctive information spaces, allowing users
to formulate targeted searches against specific information resource collections.
Users can search a number of databases and then, based on scanning and comparing
the results, focus on one or more databases for further searching. Dividing
descriptive data into separate spaces also makes it possible to provide descriptions
based on the component parts of any given resource. For example, a collection
of digital aerial photos, in which there could be as many as 2,000 images,
might be represented by a single entry in the library catalog. At the same
time, in a specialized database of geospatial data resources, there might be
catalog records for each individual image to accommodate retrieval of images
by location, place, or other user-defined criteria.
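Retrieval of individual images by location could work along the following lines. This Python sketch assumes that each item-level record stores a bounding box in decimal degrees; the ImageRecord fields, the sample coordinates, and the function names are hypothetical and are not drawn from any actual catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Item-level record for one aerial photo (hypothetical fields)."""
    image_id: str
    west: float
    south: float
    east: float
    north: float


def overlaps(rec, west, south, east, north):
    """Return True if the record's bounding box intersects the query box."""
    return not (
        rec.east < west or rec.west > east or rec.north < south or rec.south > north
    )


def find_images(records, west, south, east, north):
    """Return the IDs of all images whose footprint intersects the query box."""
    return [r.image_id for r in records if overlaps(r, west, south, east, north)]


if __name__ == "__main__":
    # Made-up footprints, for illustration only.
    photos = [
        ImageRecord("photo-0001", -78.8, 35.7, -78.7, 35.8),
        ImageRecord("photo-0002", -78.6, 35.7, -78.5, 35.8),
    ]
    print(find_images(photos, -78.75, 35.75, -78.72, 35.78))
```

A single collection-level catalog entry could not answer this kind of query, because it carries no footprint for the individual photos.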
Creating Gateways to Databases
MetaStar's gateway component permits the creation of a single interface to
a database collection, the composition of which may be customized to suit the
needs of a particular user population. For example, an agriculture gateway
might allow users to select from and search against a set of agriculture-related
databases. MetaStar's gateway capability, combined with its harvesting function
(extracting metadata from Web-based documents), also makes it possible to integrate
and draw information from relevant databases available from other institutions
such as universities and state agencies.
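Conceptually, a gateway can be thought of as a named selection from a registry of available databases. The short Python sketch below is only an illustration of that idea; the registry entries, subject tags, and the function build_gateway are hypothetical and do not describe MetaStar's actual configuration.

```python
# Hypothetical registry of searchable databases, some local and some
# hosted by other institutions.
DATABASES = {
    "library_catalog": {"host": "local", "subjects": {"general"}},
    "soil_surveys": {"host": "local", "subjects": {"agriculture", "geospatial"}},
    "crop_statistics": {"host": "remote", "subjects": {"agriculture"}},
    "aerial_photos": {"host": "local", "subjects": {"geospatial"}},
}


def build_gateway(subject, databases=DATABASES):
    """Return the sorted names of the databases tagged with the given subject."""
    return sorted(
        name for name, info in databases.items() if subject in info["subjects"]
    )


if __name__ == "__main__":
    print(build_gateway("agriculture"))
```

An agriculture gateway built this way would offer users only the agriculture-related databases, regardless of where each one is hosted.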
Next Steps
Staff members at the NCSU Libraries are currently engaged in testing and prototype
development with the Blue Angel MetaStar software. Several prototype systems
involving spatial and numeric data and cross-collection searching in various
subject areas are expected to be in place by early 2002. For further information,
please send an electronic-mail message to Steve Morris (Data Services) at steven_morris@ncsu.edu.
For more information about Blue Angel Technologies and MetaStar, visit the
company's Web site at http://www.blueangeltech.com/.