NCSU Libraries Focus Online

Volume 22 number 2 - Winter 2002

First Temple of the Atom Electronic Text Project: Increasing Access to Special Collections in the Digital Age

By Russell S. Koonts, Special Collections Department, and James Jackson Sanborn, Research and Information Services and Digital Library Initiatives

". . . at 59 minutes past midnight in the early morning hours of September 5, 1953, the Raleigh Research Reactor breathed with nuclear life for the first time. . . . For 51 months--four years and 12 weeks--the world's first college-owned nuclear reactor was in the making, evolving from a dream through negotiations, design, and construction to initial operation. . . . The N.C. State nuclear reactor was (1) the first to be used entirely for peacetime training and research, (2) the first to be operated on any college campus as a non-AEC reactor, (3) the first to be open for public inspection with visitors welcomed. . . ."

--from First Temple of the Atom, NC State School of Engineering, ca. late-1950s

In July 2000 the NCSU Libraries' Special Collections Department established an electronic texts program to provide increased access to unique and interesting items from its collections. The first priority was to choose a project small enough to complete successfully, but varied enough to allow experimentation with tools and procedures at each stage of the process and to establish program standards. Work on a Web site detailing the history of the first Raleigh Research Reactor at North Carolina State College (operational from 1953 to 1955) had been conducted earlier in the year. In light of this, the department decided that the first electronic text project should focus on key historical and archival documents relating to the establishment of the reactor program. The intent was to add value to the Web project and to increase awareness of the unique materials housed in Special Collections.

Throughout the summer of 1999, staff members from Special Collections and Digital Library Initiatives reviewed boxes of historical documents from the College of Engineering, Office of the Dean, Department of Nuclear Engineering, and Engineering Communication record groups. The survey identified for potential inclusion nearly 250 documents that trace the history of the reactor from its initial conception, through its creation, to the nuclear accident that caused it to be decommissioned.

Among the papers identified for possible inclusion were many that related to NC State faculty members who founded the reactor project: Clifford A. Beck, A. C. Menius, Newton Underwood, Arthur Waltner, and Raymond Murray. Murray further assisted library staff by providing context for the reactor program and the importance of key documents. Additionally, he had donated his "Reactor Notebook" to the University Archives several years earlier. The ninety-four-item notebook--consisting of memoranda, reports, notes, and experimental results--covers the formative days of the reactor planning and became the cornerstone of the reactor electronic text project.

In July 2000 Russell S. Koonts, James M. Jackson Sanborn, and Maryjo George determined which documents would be included in the reactor project. The team reviewed documents listed in the 1999 survey and selected documents for scanning based on subject matter, perceived difficulty of digitization, and interest to the staff member processing them. Clifford Beck's 1950 "Proposal of a Nuclear Reactor at North Carolina State College" was the first document chosen because of its considerable length, its use of images, and its significance in establishing the reactor program at North Carolina State College. Each document selected then required intensive work to prepare it for the Web.

Special Collections unveiled the "First Temple of the Atom Electronic Text" Web site in December 2000 (http://www.lib.ncsu.edu/archives/etext/engineering/reactor), which coincided with the Department of Nuclear Engineering's fiftieth anniversary celebration. Additionally, a site devoted to Murray's notebook is accessible at http://www.lib.ncsu.edu/archives/etext/engineering/reactor/murray/index.html.

When Special Collections began the project, it hoped to make documents available to a wide range of individuals who might not otherwise be able to examine them. Since its public announcement, the site has been used in presentations, classwork (by students and professors), and historical and genealogical research. The site has been indexed by Yahoo and Google and is included in the North Carolina Exploring Cultural Heritage On-Line (NCECHO) Web site <http://www.ncecho.org/>, which provides access to the special collections of North Carolina's libraries, archives, and museums.

First Temple of the Atom Document Digitization Process

The documents digitized for the reactor electronic text project went through a five-step process to make them accessible via the World Wide Web. Following is a brief description of the five steps.

Step 1: Digitization

The first step in converting traditional paper documents into electronic texts is to scan them. The initial high-resolution scan, or master image, is captured in color and saved as a TIFF file roughly forty-four megabytes in size. To produce smaller files appropriate for viewing on the Web, the TIFF images are then converted and compressed into full-size and thumbnail JPG files. The thumbnail is displayed alongside the document text when a user views the digitized item, and it links to the larger image of the original document.
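
To put the forty-four-megabyte figure in context, the following Python sketch estimates the uncompressed size of a color scan and the dimensions of a proportionally scaled thumbnail. The page size (8.5 by 11 inches), the 400 dpi resolution, the 24-bit color depth, and the 200-pixel thumbnail are illustrative assumptions; the article does not state the actual scanner settings. Under those assumptions a single page comes to about 44.9 million bytes, which is in line with the size quoted above.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Estimate the raw size of a scan, ignoring TIFF header overhead.

    bytes_per_pixel=3 corresponds to 24-bit color (an assumption).
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


def thumbnail_size(width_px, height_px, max_side):
    """Scale dimensions so the longer side equals max_side, keeping proportions."""
    scale = max_side / max(width_px, height_px)
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


if __name__ == "__main__":
    # Hypothetical settings: a letter-size page scanned at 400 dpi in color.
    size = uncompressed_scan_bytes(8.5, 11, 400)
    print(f"Estimated master image: {size / 1_000_000:.1f} million bytes")
    print("Thumbnail dimensions:", thumbnail_size(3400, 4400, 200))
```

The same functions can be rerun with other resolutions to see how quickly master files grow: doubling the dpi quadruples the file size.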

Step 2: Transcribing/Converting Text

After a document is scanned, the text must be either transcribed or converted into editable text. If the item is only a few pages long, handwritten, or set in a poor-quality typeface, library staff transcribe the document. If the document is longer, typed, and has a clear typeface, the library uses TextBridge 9.0, an optical character recognition (OCR) program, to capture the text for later encoding. TextBridge imports the TIFF file and converts the image to a text-based file. After conversion, the text is compared to the original for accuracy. Conversion using OCR can reach a 98 to 99 percent accuracy rate.
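
One way to picture the accuracy check is as a character-level comparison between a proofread reference and the OCR output. The Python sketch below uses the standard library's difflib module for this. It is an illustration only: the article does not say how the library measured accuracy, and the function names are hypothetical. The second function shows what a 98 to 99 percent rate means in practice: on a page of 2,000 characters, roughly 20 to 40 characters would still need correction.

```python
from difflib import SequenceMatcher


def char_accuracy(reference, ocr_text):
    """Return the fraction of reference characters that the OCR text matched."""
    if not reference:
        return 1.0 if not ocr_text else 0.0
    # autojunk=False keeps difflib from ignoring frequent characters in long texts.
    matcher = SequenceMatcher(None, reference, ocr_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)


def expected_errors(char_count, accuracy):
    """Estimate how many characters need correction at a given accuracy rate."""
    return round(char_count * (1 - accuracy))


if __name__ == "__main__":
    # "reaetor" simulates a typical OCR confusion of "c" and "e".
    print(f"{char_accuracy('reactor', 'reaetor'):.3f}")
    print(expected_errors(2000, 0.98), expected_errors(2000, 0.99))
```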

Step 3: Encoding

The third step uses an encoding language to convert the text-based file into a format viewable on the Web. Special Collections projects use teixlite, a tag subset of the Text Encoding Initiative (TEI) guidelines expressed in eXtensible Markup Language (XML). The TEI is an international project that is developing guidelines for the preparation and interchange of electronic texts for scholarly research. The teixlite subset, which consists of the most widely used tags from the TEI standard, allows the library to identify people, places, dates, and other content within the documents by selecting the appropriate tag and attributes.
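
To illustrate what identifying content with tags and attributes looks like, the Python sketch below builds one encoded sentence with the standard library's ElementTree module. The sentence, the choice of the name and date elements, and the attribute values are assumptions made for illustration; they are not taken from the project's actual files.

```python
import xml.etree.ElementTree as ET


def encode_sentence():
    """Build a paragraph in which a person and a date are explicitly tagged."""
    paragraph = ET.Element("p")

    # Tag the person so the name can later be found as a name, not just a word.
    name = ET.SubElement(paragraph, "name", {"type": "person"})
    name.text = "Clifford Beck"
    name.tail = " wrote the proposal in "

    # The attribute records a normalized form of the date (illustrative).
    date = ET.SubElement(paragraph, "date", {"value": "1950"})
    date.text = "1950"
    date.tail = "."

    return ET.tostring(paragraph, encoding="unicode")


if __name__ == "__main__":
    print(encode_sentence())
```

The output is ordinary text with tags wrapped around the name and the date, which is what lets later steps treat those pieces as data rather than as undifferentiated words.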

Step 4: Validating/Parsing/Viewing

In XML, tagging and encoding are controlled by industry-wide standards. Tags must be opened and closed in a precise order and must adhere to strict guidelines relating to usage and placement. These rules are defined in a "document type definition," or DTD. To ensure that library documents conform to teixlite, Special Collections runs a program, known as a parser, that checks the tags against the rules set forth in the DTD. The parser reports any errors it encounters as it checks the document's tags and their locations. When an error is reported, the encoder locates and fixes it.
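
The Python sketch below shows the simplest part of such a check: confirming that tags are opened and closed in the right order and reporting where the first problem occurs. It is a partial stand-in, not the library's tool. Python's standard library tests only whether a document is well formed; it does not validate against a DTD, so the usage and placement rules described above would require a validating parser. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def find_parse_error(xml_text):
    """Return None if the text is well-formed XML, else a message with the location.

    This checks tag nesting only; it does not apply DTD rules.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


if __name__ == "__main__":
    good = "<p><name>Murray</name></p>"
    bad = "<p><name>Murray</p></name>"  # tags closed in the wrong order
    print(find_parse_error(good))
    print(find_parse_error(bad))
```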

Step 5: Providing Access

Most of today's Internet browsers cannot display XML-based documents without translation. To make the XML files available to the public, the library translates them into HTML documents using the eXtensible Stylesheet Language (XSL). For example, an XSL stylesheet converts the XML tags <title render="italic">…</title> into the HTML tags <i>…</i> for the sake of display. However, the original file retains the XML tags to provide increased searching capabilities. Search engines can then be programmed to search only for words within specific tags instead of performing a generic keyword search (e.g., one can find all documents with the date of 1952 and Raymond Murray as the author). Presently, Internet Explorer 5.5 and 6 are the only commercially available browsers that support XML/XSL-translated documents. Because of browser limitations, Special Collections provides both XML and HTML versions of the reactor documents.
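
Both ideas in this step, translating tags for display and searching within specific tags, can be sketched in a few lines of Python. The sketch uses the standard library's ElementTree module rather than XSL, since Python has no built-in XSL processor, so it imitates the effect of a stylesheet rule instead of running one. The author and date element names, the value attribute, and the sample document are assumptions for illustration, not the project's actual markup.

```python
import xml.etree.ElementTree as ET


def italic_titles_to_html(xml_text):
    """Rewrite <title render="italic"> elements as <i>, as a stylesheet rule would."""
    root = ET.fromstring(xml_text)
    for title in root.iter("title"):
        if title.get("render") == "italic":
            title.tag = "i"
            del title.attrib["render"]
    return ET.tostring(root, encoding="unicode")


def matches(xml_text, author, year):
    """Return True only if the tagged author and the tagged date year both match."""
    root = ET.fromstring(xml_text)
    authors = {(el.text or "").strip() for el in root.iter("author")}
    years = {el.get("value", "")[:4] for el in root.iter("date")}
    return author in authors and year in years


if __name__ == "__main__":
    sample = (
        "<doc><author>Raymond Murray</author>"
        '<date value="1952-03-01">March 1, 1952</date>'
        "<p>The reactor first operated in 1953.</p></doc>"
    )
    print(matches(sample, "Raymond Murray", "1952"))
    print(matches(sample, "Raymond Murray", "1953"))
```

In the sample, a generic keyword search for "1953" would return the document because the year appears in the body text. The tag-aware search does not, because only the tagged date is consulted.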

 
