+ Page 1 +
----------------------------------------------------------------
The Public-Access Computer Systems Review
Volume 1, Number 3 (1990) ISSN 1048-6542
Editor-In-Chief: Charles W. Bailey, Jr.
University of Houston
Associate Editors: Leslie Pearse, OCLC
Mike Ridley, McMaster University
Editorial Board: Walt Crawford, Research Libraries Group
Nancy Evans, Library and Information
Technology Association
David R. McDonald, Tufts University
R. Bruce Miller, University of California,
San Diego
Paul Evan Peters, Coalition for Networked
Information
Peter Stone, University of Sussex
Published three times a year (Winter, Summer, and Fall) by
the University Libraries, University of Houston. Technical
support is provided by the Information Technology Division,
University of Houston. Circulation: 1,883.
----------------------------------------------------------------
Editor's Address: Charles W. Bailey, Jr.
University Libraries
University of Houston
Houston, TX 77204-2091
(713) 749-4241
LIB3@UHUPVM1
Articles are stored as files at LISTSERV@UHUPVM1. To retrieve a
file, send the e-mail message given after the article abstract to
LISTSERV@UHUPVM1. The file will be sent to your account.
Back issues are also stored at LISTSERV@UHUPVM1. To obtain a
list of all available files, send the following message to
LISTSERV@UHUPVM1: INDEX PACS-L. The name of each issue's table
of contents file begins with the word "CONTENTS."
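The retrieval procedure above amounts to mailing a one-line command
to the list server. As a rough sketch only, the request could be
assembled as follows; the sender address is a placeholder, and how
the message actually reaches BITNET depends on your local mail
system, which is not shown here.

```python
from email.message import EmailMessage

LISTSERV_ADDRESS = "LISTSERV@UHUPVM1"  # BITNET address given in the text


def build_request(command: str, sender: str) -> EmailMessage:
    """Build a one-line LISTSERV request; the command is the whole body."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = LISTSERV_ADDRESS
    message.set_content(command)
    return message


# "reader@example.edu" is a placeholder sender, not a real account.
get_article = build_request("GET CISLER PRV1N3", "reader@example.edu")
list_files = build_request("INDEX PACS-L", "reader@example.edu")
print(get_article)
```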
+ Page 2 +
CONTENTS
COMMUNICATIONS
Library Information System II: Progress Report and Technical Plan
Denise A. Troll (pp. 4-29)
To retrieve this file: GET TROLL PRV1N3
The University of Guelph Library's SearchMe Public-Access
Catalogue
George Loney (pp. 30-43)
To retrieve this file: GET LONEY PRV1N3
SPECIAL SECTION ON THE SPIRES SYSTEM
An Overview of SPIRES and the SPIRES Consortium
Bo Parker (pp. 44-50)
To retrieve this file: GET PARKER PRV1N3
Mounting Commercial Databases Using the SPIRES DBMS
Slavko Manjlovich (pp. 51-57)
To retrieve this file: GET MANJLOVI PRV1N3
LITMSS: Princeton's SPIRES Manuscripts Database
John Delaney (pp. 58-76)
To retrieve this file: GET DELANEY PRV1N3
The Libraries at Rensselaer Implement Access to Information
Beyond Their Walls
Pat Molholt (pp. 77-82)
To retrieve this file: GET MOLHOLT PRV1N3
+ Page 3 +
Mounting a Full-Text Database Using SPIRES
Walter Piovesan (pp. 83-88)
To retrieve this file: GET PIOVESAN PRV1N3
The WatMedia Project
Mark Ritchie (pp. 89-95)
To retrieve this file: GET RITCHIE PRV1N3
DEPARTMENTS
Public-Access Provocations: An Informal Column
"Future User Interfaces and the Common Command Language"
Walt Crawford (pp. 96-99)
To retrieve this file: GET CRAWFORD PRV1N3
Recursive Reviews
"Hypermedia, Interactive Multimedia, and Virtual Realities"
Martin Halbert (pp. 100-108)
To retrieve this file: GET HALBERT PRV1N3
Review
"MediaTracks"
Steve Cisler (pp. 109-115)
To retrieve this file: GET CISLER PRV1N3
----------------------------------------------------------------
The Public-Access Computer Systems Review is Copyright (C) 1990
by the University Libraries, University of Houston. All rights
reserved.
Copying is permitted for noncommercial use by computerized
bulletin board/conference systems, individual scholars, and
libraries. Libraries are authorized to add the journal to their
collection at no cost. This message must appear on copied
material. All commercial use requires permission.
----------------------------------------------------------------
+ Page 109 +
----------------------------------------------------------------
The Public-Access Computer Systems Review 1, No. 3 (1990):
109-115.
----------------------------------------------------------------
----------------------------------------------------------------
Review
----------------------------------------------------------------
"MediaTracks"
By Steve Cisler
The first piece of advanced library technology that I used was in
1950. The branch librarian handed me a shoe box full of
photographs, showed me how to insert one in the Stereopticon, and
went on to the next person in less than a minute. The only time
I needed her help after that initial session was for the storage
and retrieval of the shoe box. Then libraries began using
electricity for more than lighting and telephones, and the game
changed completely.
The first time I used a computer was in 1984. The California
State Library administered an LSCA grant to provide public access
computers in dozens of public libraries around the state.
Training the staff to use them was one of the first phases of the
project. Choosing hardware and software for purchase was another
phase, and making it accessible to the public was the longest and
most difficult phase. Having worked in a branch that had been
showered with a very rich selection of audio-visual equipment in
the early seventies, I was well aware of the time it would take
to train staff and public to use any one piece of equipment,
whether it was an 8-mm film loop player, a videotape recorder, or
a computer.
Our staff was willing but felt they were overworked, even before
the 128 KB Macintosh arrived with a drawing program, MacWrite,
and a spreadsheet. I re-wrote the manuals, digesting the basics
into eight-page pamphlets aimed at certain tasks that we expected
most people to tackle. Each staff member was able to instruct a
novice and have them pecking away at a word processing document
after about fifteen minutes of one-on-one instruction. However,
the Macintosh was being used over 100 hours a month, and many of
the people were first-time users. The fifteen-minute sessions
began to add up quickly, and some of the staff began to tire of
explaining over and over how a mouse worked, how to open a
document, and how to save (or trash) a file. Answering the same
repetitious questions affects some staff more than others, and
most of us can use some assistance in the form of instructional
aids.
+ Page 110 +
People learn in many different ways. Sitting in a lecture hall,
taking notes, and then digesting and applying them to an exercise
is a classical method. Perhaps the most effective is to be
tutored by an interested, sensitive teacher or friend, but for
some, reading a manual (computer, unassembled toy, or software)
and then struggling alone is the most productive way to master a
machine or program. Self-paced tutorials can be very effective
for introductions to new technology or for specific tasks such as
using an interactive videodisc or logging on to a multiuser
database.
MediaTracks
Making these tutorials has been very complex and time-consuming,
whether they are on paper or are in electronic format, but a new
product from Farallon Computing has changed this. It puts the
production of library-specific tutorials into librarians' hands.
As with many Macintosh programs, you don't spend time fiddling
with the interface or learning new commands. Your time is
devoted to the tasks which the computer is supposed to
facilitate, not to struggling with the computer.
MediaTracks consists of several parts: a Screen Recorder
(which appeared as a separate program over a year ago) that makes
a virtual tape of real time events on your Macintosh screen, an
editing program for sound and graphics, and several playback
options. Screen Recorder is a Desk Accessory which runs while
other applications are operating. After you name the tape file,
recording begins, and a small control panel is displayed at the
bottom of the screen to record, pause, stop, load, or play the
tape. All of the actions you perform will be recorded in black
and white at the original speed. This demo tape can be edited,
integrated into a HyperCard presentation, or turned into a stand-
alone application that can be distributed without paying Farallon
any license fees.
+ Page 111 +
Many activities lend themselves to Screen Recorder. I have used
it to record activities that involved a complex equipment setup
such as a network of CD-ROMs, data news feeds from a satellite
receiver, an online session with a high speed modem, a LAN e-mail
system, or a workstation with a variety of multimedia tools. I
can play the tape at a conference, workshop, or other library
without hauling all the gear needed for the original. Besides
eliminating a lot of equipment for demonstrations, you have the
chance to make the demo work before showing it to others! Even
if I am doing a live presentation, I will carry a Screen Recorder
tape as a backup.
Editing
Until July, 1990, Screen Recorder tapes could not be edited. Now
that Screen Recorder is included in MediaTracks, anyone who knows
how to use a Macintosh can modify a tape in a number of ways.
Once you boot MediaTracks and choose a tape to edit, a window
appears with a single frame at the left of the screen and
sprocket holes stretching to the right.
Below the frames are five icons for playing, recording sound,
actions, drawing, and changing the view of the tape on the
editing board. At the right are two indicators that show the
starting time and duration of each frame. If you have made the
Screen Recorder tape, you may have an idea of how you want to
edit the session. If not, click on the play icon and think about
the natural breaks in the tape where you might want to highlight
important events and add sounds to explain a complex action.
After watching the tape once or twice you can press the "M" key
in order to divide it into clips or sections for further editing
or annotation. These marks may be removed if you decide they
were incorrectly placed, or if you wish to combine two clips.
+ Page 112 +
Marks are generally used to divide a demonstration or tutorial
into logical parts. If you are showing someone the basics of an
online service, you would have an intro, the login sequence, the
help screens, a simple search for information, perhaps a more
complex search, and a logoff sequence. Another use for marks is
to cut out dead time and mistakes. If your system is slow to
respond you can shorten the demo by cutting seconds from each
clip by marking and deleting periods of inactivity. If you
entered a wrong command or typo and then corrected yourself
during the initial Screen Recorder session, use marks to clean up
that part. If you speed up a session, let the user know the
actual session may be much slower.
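In MediaTracks this trimming is done by hand with marks. The idea
itself is easy to model, though: treat the tape as a list of
timestamped events and cap every pause at some maximum length. The
sketch below is only an illustration of that idea; the event list,
the three-second limit, and the function name are assumptions, and
it says nothing about how MediaTracks stores a tape internally.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (seconds from start of tape, what happened)


def trim_idle(events: List[Event], max_gap: float) -> List[Event]:
    """Shift events earlier so no pause between them exceeds max_gap."""
    trimmed: List[Event] = []
    removed = 0.0      # total seconds of dead time cut so far
    previous = None    # original timestamp of the preceding event
    for time, label in events:
        if previous is not None and time - previous > max_gap:
            removed += (time - previous) - max_gap
        trimmed.append((time - removed, label))
        previous = time
    return trimmed


# Hypothetical login session with a twelve-second wait for the host.
session = [
    (0.0, "type login"),
    (2.0, "press Return"),
    (14.0, "menu appears"),
    (15.0, "type search"),
]
shortened = trim_idle(session, max_gap=3.0)
for time, label in shortened:
    print(f"{time:5.1f}  {label}")
```

The twelve-second wait shrinks to three seconds and everything
after it moves up accordingly, which is exactly why the user should
be warned that a live session may run slower.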
Adding Graphics
First you can insert a title clip at the beginning of the tape.
Double-clicking on the clip opens up a screen and palette with
drawing tools for annotation. You can paste in graphics in color
or black and white, whether a diagram of the library, a
network map, a scanned photograph of the reference staff, or a
list of choices for the user to pursue, e.g., logon, search,
or any part of the ensuing demo. Close the window after you
finish adding text or graphics. Because this is interactive, the
user may not want to watch the whole sequence but jump to new or
difficult parts of your tutorial. Another title clip can be
inserted elsewhere in the sequence. This can be useful if you
want the user to branch to a variety of choices later in the
demo.
Proceed to the next clip, double click on it, and use the arrows
and text boxes to highlight important parts of the screen
activity. Don't overwhelm the subject matter by using 36 point
type pointing to 9 point type on the screen.
+ Page 113 +
Incorporating Sound Using MacRecorder
Besides the graphics, the main addition is sound. Aside from a
librarian explaining how a device works, or how to navigate
through some information space, sound can be used to reinforce an
action or correct a mistake. Farallon makes a package called
MacRecorder that works with MediaTracks. The hardware is a bit
larger than the Macintosh mouse, plugs into a serial port, and
can record voice, input from a tape, VCR, CD, record, or radio in
digitized format. The length depends on the amount of RAM you
have, so be sure and remember who is going to play this tape.
The default setting is for a ten second recording using 256 KB of
RAM and sampled at 22 kHz (roughly AM-quality sound). If you
have a 5 MB Mac IIcx, and all the tapes are going to run on 1 MB
Mac Pluses in the public area, you will have to keep your sound
files short or compressed. Spoken word suffers if it is compressed
too much, but 30 seconds is not too long for a clip.
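The memory figures are easy to check with a little arithmetic. The
sketch below assumes one byte per sample (8-bit mono) and exactly
22,000 samples per second; both are assumptions made for the
illustration, not figures from the MacRecorder documentation, and
compression is ignored.

```python
KB = 1024


def recording_bytes(seconds: float, sample_rate_hz: int = 22_000,
                    bytes_per_sample: int = 1) -> int:
    """Uncompressed size of a recording under the stated assumptions."""
    return int(seconds * sample_rate_hz * bytes_per_sample)


def seconds_that_fit(buffer_bytes: int, sample_rate_hz: int = 22_000,
                     bytes_per_sample: int = 1) -> float:
    """How long a recording fits in a buffer of the given size."""
    return buffer_bytes / (sample_rate_hz * bytes_per_sample)


ten_seconds = recording_bytes(10)
thirty_seconds = recording_bytes(30)
default_buffer = seconds_that_fit(256 * KB)

print(f"10 s uncompressed: {ten_seconds / KB:.0f} KB")
print(f"30 s uncompressed: {thirty_seconds / KB:.0f} KB")
print(f"256 KB holds about {default_buffer:.1f} s")
```

Under these assumptions a ten-second clip fits comfortably in the
default 256 KB, while an uncompressed thirty-second clip takes well
over half a megabyte, a large share of the RAM on a 1 MB Mac Plus.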
Explaining what is happening is the most common use of sound;
most libraries are not going to add theme music from Wheel of
Fortune while you wait for the results of a complex Boolean
search, though it might be fun to try. Be sure and have someone
with a lively voice do the recording. Don't put the user to
sleep. Prepare a script and storyboard once you have divided the
Screen Recorder tape into clips. For each clip write the
commentary, but make it brief. This can be on the Mac or on
paper. You may have to make several takes and listen to each one
until it sounds right.
Buttons and HyperCard
A MediaTracks file can be linked to HyperCard using buttons
generated in the graphics palette when you edit individual
screens. Each button can contain a HyperTalk script of 256
characters or less, so you can start a MediaTracks tape from
HyperCard and then control HyperCard from within a tape. Many
people have already used HyperCard for their tutorials, and will
use selected tape segments from MediaTracks to augment an
existing work.
+ Page 114 +
The file can also be played with MediaTracks Player, an 87 KB
application that may be distributed freely with the tapes that
you make. It has a control panel that is similar to a VCR.
Icons allow you to pause, stop, repeat, speed up, slow down,
rewind, skip forward/backward, fast forward, step frame by frame,
or hide the panel.
Finally, the file may be saved as a stand-alone tape with the player
functions built-in. Double-clicking on the icon begins the tape.
Wrapping it up
All of these elements (sound, graphics, clips, and text) can be
cut, copied and pasted between parts of the file you are
constructing, from existing MediaTracks files and from other
Macintosh applications. If you have a special sound in a
HyperCard tour, it can be copied into a sound clip very easily.
If you have an opening title clip from one tape, it can be used
in another one. This makes it very easy to share and customize
library instruction done for another library. One of the
drawbacks of distribution by floppy disk is the size of the
final files. Uncompressed sound chunks at 256 KB each quickly
push the file size over the capacity of an existing floppy
disk. You can break up your tapes into pieces that will fit on
an 800 KB or 1.4 MB floppy. If you are going to transfer files
by hard disk or tape backup, you have no limitations on size.
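Deciding where to break a tape can be worked out ahead of time. The
sketch below fills each disk in playing order and starts a new one
when the next clip will not fit. The clip sizes are made up for the
example, and the capacity is the nominal 800 KB; a real disk holds
somewhat less once the directory and the Player are accounted for.

```python
from typing import List


def split_for_floppies(clip_sizes_kb: List[int],
                       capacity_kb: int) -> List[List[int]]:
    """Group clips, in order, into disks that each stay within capacity."""
    disks: List[List[int]] = []
    current: List[int] = []
    used = 0
    for size in clip_sizes_kb:
        if size > capacity_kb:
            raise ValueError(f"a {size} KB clip cannot fit on one disk")
        if used + size > capacity_kb:
            disks.append(current)   # this disk is full; start another
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        disks.append(current)
    return disks


# Hypothetical tape: four 256 KB sound clips plus two smaller clips.
clips = [256, 256, 120, 256, 256, 60]
layout = split_for_floppies(clips, capacity_kb=800)
for number, disk in enumerate(layout, start=1):
    print(f"disk {number}: {disk} = {sum(disk)} KB")
```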
For all of you DOS users: by running a program called PC-Soft,
you can make tapes of actual DOS program sessions and then use
the Mac to teach new users programs for either operating system!
The manual is well written and includes a bibliography for
further reading. For advanced users there are sections that help
you set up menus, multilevel presentations, and quiz clips which
can take the user back to elements of your demo for
reinforcement. The Apple Library Users Group (10381 Bandley Dr.
MS: 8C, Cupertino, CA 95014) has a template exchange for database
management and HyperCard templates. With MediaTracks we expect
to be exchanging tapes of common library activities: searching
CD-ROMs, using Internet and BITNET resources, and demonstrations
of OPACs. While there are none yet, perhaps this review will
help you decide to share your own efforts.
+ Page 115 +
Product Information
MediaTracks
Farallon Computing, Inc.
2000 Powell Street, Suite 600
Emeryville, CA 94608
(415) 596-9000
Prices (U.S. Dollars):
MM100 MediaTracks--295.00 (if you already own MacRecorder)
MM110 MediaTracks Multimedia Pack--495.00 (includes MacRecorder)
MM111 MediaTracks Multimedia Pack - CD ROM Version--495.00
(includes many sample MediaTracks demos)
MR200 MacRecorder Sound System 2.0--249.00
(Includes HyperSound, HyperSound Toolkit, and SoundEdit)
About the Author
Steve Cisler
Senior Scientist
10381 Bandley Drive
Cupertino, CA 95014
(408) 974-3258
SAC@APPLE.COM
-----------------------------------------------------------------
The Public-Access Computer Systems Review is an electronic
journal. It is sent free of charge to participants of the
Public-Access Computer Systems Forum (PACS-L), a computer
conference on BITNET. To join PACS-L, send an electronic mail
message to LISTSERV@UHUPVM1 that says: SUBSCRIBE PACS-L First
Name Last Name.
This article is Copyright (C) 1990 by Steve Cisler. All
Rights Reserved.
The Public-Access Computer Systems Review is Copyright (C) 1990
by the University Libraries, University of Houston. All Rights
Reserved.
Copying is permitted for noncommercial use by computer
conferences, individual scholars, and libraries. Libraries are
authorized to add the journal to their collection, in electronic
or printed form, at no charge. This message must appear on all
copied material. All commercial use requires permission.
----------------------------------------------------------------
+ Page 96 +
----------------------------------------------------------------
The Public-Access Computer Systems Review 1, No. 3 (1990):
96-99.
----------------------------------------------------------------
-----------------------------------------------------------------
Public-Access Provocations: An Informal Column
-----------------------------------------------------------------
"Future User Interfaces and the Common Command Language"
by Walt Crawford
With any luck at all, 1991 will finally see adoption of ANSI/NISO
Z39.58, Common Command Language (CCL). It's been a long,
difficult process to nail down a standard that can provide a
common means of access across many different catalogs and online
systems. But according to some people, it's too late: command
languages will be irrelevant for the online catalogs of the
future. These brave new catalogs will use Graphic User
Interfaces (GUIs) or WIMPs (Windows, Icons, Mice, and Pull-down
Menus); patrons will thus be guided painlessly and intuitively to
the material they need.
Well, maybe. I'd love to see the icon for "books about Japanese
baseball, published since 1980 in English." Or, more simply, the
icon that will tell me whether the library has Norman Mailer's
book with a title something like "Fire on the Moon" without
plowing through dozens of authors and titles. (The title is "Of
a Fire on the Moon," so an alphabetic browse just might take a
while.) Painless? Intuitive? Plausible on a dial-up line from
home at 2,400 bps (if you're really lucky)?
No, this isn't going to be a jeremiad against GUIs or an
assertion that commands are the only good way to use a catalog.
But I will assert that access to a command line continues to
offer the fastest and most powerful way to perform complex
searches (where "complex" can be defined as anything other than a
one-index phrase search), and that access to direct command entry
would improve the usefulness of non-command-driven catalogs for
frequent users and dial-up/network users.
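To make the distinction concrete, here is a toy sketch of the two
kinds of search, using the Mailer title from above. It is not CCL
syntax and not any real catalog's implementation; the second
catalog entry, the index layout, and the function names are all
invented for the illustration. The point is only that a search
combining an author index with title keywords finds the book, while
a left-anchored browse on the half-remembered title does not.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (author, title); the second entry is a made-up placeholder record.
CATALOG: List[Tuple[str, str]] = [
    ("Mailer, Norman", "Of a Fire on the Moon"),
    ("Example, Author", "Fire Safety Handbook"),
]


def build_keyword_index(values: List[str]) -> Dict[str, Set[int]]:
    """Map each lowercased word to the record numbers that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for record_id, value in enumerate(values):
        for word in value.lower().replace(",", " ").split():
            index[word].add(record_id)
    return index


AUTHOR_INDEX = build_keyword_index([author for author, _ in CATALOG])
TITLE_INDEX = build_keyword_index([title for _, title in CATALOG])


def find(author_word: str, title_words: List[str]) -> List[str]:
    """Boolean AND across two indexes: one author word, all title words."""
    hits = set(AUTHOR_INDEX.get(author_word.lower(), set()))
    for word in title_words:
        hits &= TITLE_INDEX.get(word.lower(), set())
    return [CATALOG[i][1] for i in sorted(hits)]


def browse(title_start: str) -> List[str]:
    """One-index phrase search: titles beginning with the given text."""
    start = title_start.lower()
    return [title for _, title in CATALOG if title.lower().startswith(start)]


print(find("mailer", ["fire", "moon"]))
print(browse("Fire on the Moon"))
```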
+ Page 97 +
CCL as a Secondary Interface?
CCL, probably the most widely-implemented not-yet-adopted
standard in the history of NISO and Z39, could become the
universal secondary access technique, available to power users
and dial-up/network users as an alternative to the user-friendly,
bandwidth-intensive, hardware-dependent, slow for complex
searches, GUI interface that is so much fun to use the first time
around.
Probably not all of CCL; most of the set-manipulation
capabilities and macro-creation capabilities are useful for
professional online searchers but overkill for patrons. Instead,
I'd expect to see "secondary CCL" looking more like the partial
CCL implementations that have been around (in some cases) for a
decade or more: the West Coast Group--BALLOTS/RLIN (the
original), Melvyl, Orion, Carlyle, and the like.
Yes, you can implement the logic of CCL in a GUI with icons,
buttons and dialog boxes for the inevitable search text, and it
would make an interesting design; I'd love to try one out. But
it makes sense to have plain old CCL available from the keyboard
as well; why penalize library users who find text comfortable?
The Return of the Command Line
I find it interesting that one significant improvement in PC
Tools Deluxe 6 over PC Tools Deluxe 5.5 is that Version 6, which
uses a well-designed "graphic" user interface, includes a command
line within the interface window. You don't ever need to use
it--but when you want the speed and power of the DOS prompt, you
can mouse down to it and use it. Amiga users have noted for some
years that they have the best of both worlds: the Amiga user
interface is GUI in the extreme, but a command line is
immediately available for the times when it's the best, fastest
way to get the job done.
+ Page 98 +
Understand, I do use GUIs. I can't imagine using Ventura
Publisher as a pure command-driven system; ditto for any painting
or drawing program. When I'm revising text in Microsoft Word,
the mouse does come into play--and it certainly gets used in
Quattro Pro. And yes, I find PC Tools much easier and more
powerful at home (with a mouse and color screen) than at work
(without a mouse, and with a monochrome screen). I'm
text-oriented, but I'm no bigot.
Click on the Jar, then the Anteater, then the Piano. . .
Two or three years ago, two or three of us considered designing a
truly graphic online catalog interface as a joke (after you got
past the icons for indexes, you'd have twenty-six icons to narrow
the search: an Anteater, a Bell, a Cat, a Dog. . .on up to a
Xylophone, Yacht and Zebra). We never prepared the demo for two
reasons. For one thing, back then it would have been quite a bit
of work. More importantly, though, we realized that people would
take it seriously--after all, words are such a nuisance when
you're looking for a book!
Comments?
What do you think? Does the future really omit the command line,
or will mixed environments thrive? (Tried any good touch-screen
catalogs lately?)
Those aren't simply rhetorical questions. I'm gearing up for
another project on patron access, and your comments might help me
to broaden my narrow-minded perspectives. Please send brief
comments to my e-mail address and more lengthy ones to my regular
mail address.
Meanwhile, whether in its pure form or embedded within a rich
graphic interface, CCL offers the best chance for common entry
points to diverse online systems. I hope to see it popping up in
new offerings and revisions of current offerings, old-fashioned
as commands may be.
+ Page 99 +
About the Author
Walt Crawford
The Research Libraries Group, Inc.
1200 Villa Street
Mountain View, CA 94041-1100
BR.WCC@RLG.BITNET
-----------------------------------------------------------------
This article is Copyright (C) 1990 by Walt Crawford. All
Rights Reserved.
----------------------------------------------------------------
+ Page 58 +
-----------------------------------------------------------------
Delaney, John. "LITMSS: Princeton's SPIRES Manuscripts
Database." The Public-Access Computer Systems Review 1, no. 3
(1990): 58-76.
-----------------------------------------------------------------
1.0 Introduction
LITMSS is an online database of information about modern (post-
1500 A.D.) manuscript holdings of Princeton University Library's
Department of Rare Books and Special Collections. It emphasizes
eighteenth-, nineteenth-, and twentieth-century manuscripts whose
primary language is English, though significant Spanish and
French holdings are described as well. It includes, at the
collection level, all of the Department's administrative units
that house manuscripts--Manuscripts, Theatre, Western Americana,
and Mudd Library of Public Affairs Papers--except Archives (to be
added). In addition, it contains in-depth information about
folder- and item-level holdings of the Manuscripts Division and
several units' miscellaneous manuscripts files.
Over 15,500 individuals have been indexed in LITMSS--artists,
novelists, presidents, generals, scientists, educators, etc.--and
several thousand subjects (defined with Library of Congress
subject headings) have been identified/associated with the 1,000
collections described in the database. In all, over 55,000
records are searchable through both find (keyword) and browse
(phrase) indexes.
This article covers the evolution of the database, the scope and
contents of its records, the public "face" of the database in
FOLIO, searching and display capabilities, and its structure of
interrelated SPIRES subfiles.
2.0 Brief History of Automation Efforts
Princeton University Library's Department of Rare Books and
Special Collections consists of a group of subject-oriented
administrative units, each of which has its own curator, its own
staff, and its own physical location, including reading room and
reference area. Some are located within Firestone Library, on
different floors; others are housed in different campus
buildings. All possess manuscripts and/or "special collections"
of materials that are commonly part of manuscript collections,
such as photographs. It has taken a decade to achieve the kind
of centralized control over all of the Department's manuscript
collections that exists today in the form of LITMSS.
+ Page 59 +
With the goal of producing a guide to its literary holdings, the
Manuscripts Division began in 1980 to create machine-readable
records for its holdings.
2.1 ISIS
The first database employed a batch mode version of ISIS
(Integrated Set of Information Systems), a system developed by
the International Labour Office and later updated and maintained
by UNESCO. The Office of Population Research on the Princeton
campus installed ISIS in 1977, and staff from that office helped
the Manuscripts Division define the data elements it needed for a
database of its own. Because of its literary focus, it was called
LITMSS.
ISIS capabilities allowed sorting on any field and subfield, and
the system could combine sort fields for multilevel sorts. For
example, a primary sort could be performed on a combined set of
authors and corporate bodies, and secondary, tertiary, and other
sorts could be done on other elements. Searches could use
Boolean logic, and text in any field (ninety-eight fields had
been defined!) could be searched. In addition, a flexible
formatting language permitted one to format the output of queries
in virtually unlimited ways, using vertical and horizontal
spacing, conditional and unconditional literals, up to four
levels of headings, and columns. All of this computer "magic"--
quite rudimentary in hindsight--had a profound and positive
effect on departmental staff: members began to see tremendous
possibilities for automation in manuscripts work.
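The two capabilities described above, multilevel sorting and Boolean
searching over the text of any field, can be sketched in a few
lines of modern code. This is only an illustration of the
concepts, not ISIS itself: the field names and the three sample
records are invented for the example and are not drawn from the
LITMSS data definitions.

```python
from typing import Dict, List, Sequence

Record = Dict[str, str]

# Hypothetical records mixing personal authors and a corporate body.
RECORDS: List[Record] = [
    {"name": "Smith, A.", "kind": "author", "year": "1921", "form": "letter"},
    {"name": "Acme Press", "kind": "corporate body", "year": "1930",
     "form": "contract"},
    {"name": "Smith, A.", "kind": "author", "year": "1919", "form": "letter"},
]


def multilevel_sort(records: List[Record], keys: Sequence[str]) -> List[Record]:
    """Sort on the first key, then break ties with each later key."""
    return sorted(records, key=lambda r: tuple(r.get(k, "") for k in keys))


def search(records: List[Record], must: Sequence[str],
           must_not: Sequence[str] = ()) -> List[Record]:
    """Boolean AND / NOT over the combined text of every field."""
    def text(record: Record) -> str:
        return " ".join(record.values()).lower()

    return [
        r for r in records
        if all(term.lower() in text(r) for term in must)
        and not any(term.lower() in text(r) for term in must_not)
    ]


by_name_then_year = multilevel_sort(RECORDS, ["name", "year"])
early_smith_letters = search(RECORDS, must=["smith", "letter"],
                             must_not=["1921"])
print([(r["name"], r["year"]) for r in by_name_then_year])
print(early_smith_letters)
```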
2.2 SPIRES
By 1985, after several years of a Title II-C grant and three
years of funding from the National Endowment for the Humanities,
the database had grown to over 45,000 records. Support for ISIS
at the campus computer center continued to wane, however, as the
university introduced newer, more state-of-the-art database
management systems to its computer users. SPIRES, the Stanford
Public Information Retrieval System, was a powerful product
attracting a good deal of attention, and by 1984 the university
had already joined the consortium of sites that were using it.
Supported by the computer center and backed by a network of
diverse users, SPIRES offered the Department an attractive
alternative to ISIS. In the spring of 1986, the manuscripts
database was converted to SPIRES, beginning its online phase.
+ Page 60 +
The database was still not publicly available, but its printouts
were. Periodically, usually once each year or after every 5,000
records, the database was dumped, providing a multi-volume
printout of entries, sorted by author. At a glance, the user
could find the locations and descriptions of all manuscripts
pertaining to a particular person that had been indexed to date.
The manuscripts curator often photocopied pages of the reference
work to help her answer mail queries. Printed indexes,
identifying collections of manuscripts by subjects and forms of
material, were also available. In addition, printouts could be
customized by request.
It became clear during these years that the publication of a
literary guide was too limited a goal, for it would only
partially represent the variety and significance of the
Department's holdings. As a result, a concerted effort began in
1986 to fully describe, at the collection level, all of the
manuscript collections in all of the Department's administrative
units. With the publication in July 1989 of A Guide to Modern
Manuscripts in the Princeton University Library (Boston: G.K.
Hall & Co.), that larger goal was accomplished.
2.3 Public Access to LITMSS
Once the size of the database had reached a "critical mass,"
public access to it made more sense. The printouts had always
been available--at least to readers that visited the Department--
but now the power and convenience of the computer, staff thought,
could and should be made available to anyone who had access to
the university's mainframe. During this period of gradual shift
in departmental emphasis, from needing to intellectually control
the Department's manuscript holdings to wanting to expand access
to, and promote use of, the material, the main library was
closing its card catalog and providing only online access to
post-1980 (January) cataloged acquisitions. In addition, local
network connections to the online library catalog were opened so
that access was available from personal computers anywhere on
campus.
+ Page 61 +
Into this rapidly developing environment of accessible
information, LITMSS made its first public appearance in the fall
of 1989 through a SPIRES interface called FOLIO, where the
database is simply called "Manuscripts." In FOLIO, data is
displayed line by line; hence, full-screen terminals are not
needed, thereby broadening its applicability. Only searching is
permitted, and only selected data elements may be seen. In
addition, searches can be logged so that database owners can see
how the database is being used and whether users are having any
problems. Since the campus computer center is mounting other
public databases (like GPO documents) in FOLIO, the Department
hopes that this shared interface will promote use of LITMSS even
more.
Local or remote network users can access the FOLIO database using
an anonymous logon capability. Some system capabilities (i.e.,
saving, printing, and mailing searches) are only available to
users with regular accounts on a Princeton mainframe.
3.0 Collections in LITMSS
As a departmental manuscripts database, LITMSS describes
manuscript holdings of the whole Department of Rare Books and
Special Collections, not just its Manuscripts Division. Other
administrative units of the Department maintain manuscript
collections that pertain to their particular subject
orientations, and these collections are represented in LITMSS.
Excluded, however, are manuscripts in non-Romance languages, such
as Persian and Arabic, medieval codices, papyri, and cuneiform
tablets. The emphasis is on post-1600 ("modern") manuscripts in
English, with lesser amounts in Spanish and French.
Below is a brief summary of each unit's covered "manuscript" [1]
holdings and the names of some representative collections.
+ Page 62 +
3.1 Manuscripts Division
The Manuscripts Division has over 650 collections, ranging in
size from one box of documents of the signers of the Declaration
of Independence to hundreds of boxes in the Archives of Charles
Scribner's Sons, the New York publisher. Its strengths are in
American and English history and literature. It includes the F.
Scott Fitzgerald Papers, the M. L. Parrish Collection of
Victorian Novelists, the records of Henry Holt & Co., the
archives of Story Magazine and Story Press, several Ernest
Hemingway collections, the Janet Camp Troxell Collection of
Rossetti Manuscripts, the Mario Vargas Llosa Papers, a Woodrow
Wilson collection of personal and family papers, and the Andre de
Coppet Collection of Americana, including manuscripts of all the
presidents from Washington to Truman.
3.2 Seeley G. Mudd Manuscript Library
The Seeley G. Mudd Manuscript Library has over 150 collections,
ranging in size from one box of documents relating to Adolf
Hitler to hundreds of boxes of the American Civil Liberties
Union. Its strengths are in twentieth-century statecraft and
public affairs papers. It includes the John Foster Dulles
Papers, the David E. Lilienthal Papers, the Albert Einstein
Duplicate Archive (photocopies), Fight For Freedom, Inc.,
Archives, Council on Books in Wartime Archives, and the James
Forrestal Papers.
3.3 Theatre Collection
The Theatre Collection has over 100 collections, ranging in size
from one box of material relating to calypso music to hundreds of
boxes in the Warner Bros. archive, which contains only business
records. Its strengths are in performing arts and
popular entertainment. It includes the William Seymour Family
Papers, the McCaddon Collection of the Barnum and Bailey Circus,
manuscripts of Woody Allen, the McCarter Theatre (Princeton)
Archives, and correspondence of Luigi Pirandello.
+ Page 63 +
3.4 Western Americana Division
The Western Americana Division has over 50 collections, ranging in
size from a portfolio of photographs of Eskimos to hundreds of
boxes of the Association on American Indian Affairs. Its
strengths are in overland narratives, Mormon material, indigenous
American languages, and twentieth-century American Indian
affairs. It includes the Philip Ashton Rollins Collection,
cattle ranch account books, the Herbert S. Auerbach Collection on
Mormons and Indians, and San Juan Pueblo records.
4.0 LITMSS Records
Of the more than 55,000 records in LITMSS, only about 1,000 are
collection records (for the 1,000 collections in the Department);
the rest are indexing records.
Each collection record describes a manuscript collection (as
defined before), and includes such elements as main entry (if
appropriate), collection name, range of dates of the material,
scope and contents, physical size (in cubic feet), arrangement
(the organization of the manuscripts and any series names),
subject/title/form headings appropriate to the material, and any
restrictions that may pertain to the collection.
Acquisition and other in-house information are present in the
collection record and are available to departmental staff, but
such elements are purposefully omitted from the FOLIO displays.
Indexing records, which constitute the bulk of the records in
LITMSS, describe folder- or item-level holdings of manuscripts of
specific individuals. The purpose of this indexing is to make
known the whereabouts (i.e., non-obvious locations) of
manuscripts of "significant" [2] individuals and to provide the
Department an additional measure of security over its holdings.
A JOHN DOE collection of manuscripts would be described in LITMSS
in a collection record. Manuscripts of others in the
collection--his correspondents, for example--would be described
in indexing records. (Note: nothing of John Doe would be indexed
for him in his own collection.)
To date, the manuscripts of approximately 15,500 individuals,
representing many academic disciplines and vocations, have been
indexed.
+ Page 64 +
Each indexing record contains the following elements: (1) main
entry; (2) collection name; (3) series name (if appropriate); (4)
box; (5) folder; and (6) a manuscripts "structure" (a SPIRES name
for a group of related elements that always occur together) that
describes the number of manuscripts, the type of manuscript(s),
the inclusive date(s) of the manuscript(s), and the manuscripts
themselves. Depending on the specific location
(collection/series/box/folder), an indexing record may describe a
single item or many.
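The six elements listed above can be pictured as a small data
structure. The following is only a sketch in Python, not the
SPIRES file definition: the class and field names are assumptions
made for illustration, and the sample values are taken from the
Hemingway record shown in Figures 5 and 6.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManuscriptStructure:
    """Elements that always occur together (the SPIRES "structure")."""
    count: int          # number of manuscripts
    ms_type: str        # type of manuscript, e.g. "letter" or "document"
    dates: str          # inclusive date(s), e.g. "1923"
    description: str    # description of the manuscripts themselves


@dataclass
class IndexingRecord:
    """One folder- or item-level record (field names are hypothetical)."""
    main_entry: str
    collection_name: str
    series_name: Optional[str]   # present only if appropriate
    box: str
    folder: str
    manuscripts: ManuscriptStructure


def brief_display(rec: IndexingRecord) -> str:
    """Join main entry, collection, count, type, and dates with slashes."""
    m = rec.manuscripts
    return (f"{rec.main_entry} / {rec.collection_name} / "
            f"{m.count} {m.ms_type}(s), {m.dates}")


# Sample values copied from the Hemingway record in Figures 5 and 6.
example = IndexingRecord(
    main_entry="Hemingway, Ernest, 1899-1961",
    collection_name="[Papers *], Sylvia Beach Papers",
    series_name=None,
    box="171",
    folder="Corres. re Illustrations",
    manuscripts=ManuscriptStructure(
        count=1,
        ms_type="document",
        dates="1923",
        description="photograph of Hemingway in Sylvia Beach's bookshop",
    ),
)

print(brief_display(example))
```

Keeping the count, type, dates, and description in one nested
object mirrors the idea that a record describes a location
(collection/series/box/folder) and may hold one item or many.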
To date, only about 350 of the Manuscripts Division's collections
have been indexed. In addition, each of the units'
"miscellaneous" manuscripts files, into which single accessions
are placed (for example, one George Washington letter donated by
an alumnus), has been indexed, as well as the modern manuscript
holdings of the Department's Robert H. Taylor Library of English
and American literature. Among the many authors amply
represented in the library are Richard Brinsley Sheridan, Max
Beerbohm, members of the Trollope family, Bernard Shaw, Virginia
Woolf, Henry James, the Bronte sisters, and Thomas Hardy.
5.0 Searching LITMSS
LITMSS contains two sets of indexes for retrieving records, FIND
(keyword) and BROWSE (phrase) indexes.
5.1 FIND Searches
The FIND indexes are word indexes that take the user's search
terms and respond with records whose specified elements contain
those words. LITMSS makes eight FIND indexes available through
FOLIO.
+ Page 65 +
-----------------------------------------------------------------
Table 1. FIND Indexes
-----------------------------------------------------------------
Index    Description                        Sample Search
Name                                        Term(s)
AUTHOR   Creator of a manuscript            Ernest Hemingway
NAT      Nationality of the author          French, Italian,
                                            Swiss [3]
ID       Brief identity of the author       Journalist, poet,
                                            senator
DISC     Author's discipline / field        Biology, history,
                                            government
COLL     Name of a ms. collection           Allen Tate Papers
YEAR     Year date of a manuscript          1955, 1778, 1812,
                                            1920s
MS       General type of manuscript         Letter, document,
                                            volume
STF      Collection subjects/titles/forms   Civil War, bills
                                            of lading
-----------------------------------------------------------------
Boolean operators (AND, OR, NOT) are permitted with all FIND
indexes. As a result of this flexibility, rather sophisticated
searches are possible. For example, it is possible to do a
search for letters written by Italian poets during the 1920s.
The basic format of a search command using a FIND index is
find [index name] [search term]
Here are some examples:
find nat japanese
find id historian and year 1920
fin aut mark twain
FIND and index names can be abbreviated.
+ Page 66 +
Truncation (with the # sign) searching is also allowed on all of
these indexes:
fin year 192#
fin stf indian#
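How a word index with truncation and Boolean operators behaves
can be sketched in a few lines of Python. This is an
illustration only, not the SPIRES implementation: the sample
records, the function names, and the rule that several words in
one term must all be present are assumptions.

```python
import re
from collections import defaultdict

# Hypothetical sample data: record number -> text of one indexed element.
RECORDS = {
    1: "Indian affairs",
    2: "Indians of North America",
    3: "Civil War",
    4: "bills of lading",
}


def build_word_index(records):
    """Map each lowercased word to the set of records that contain it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(rec_id)
    return index


def find(index, term):
    """Return records containing every word in term; a trailing # truncates."""
    result = None
    for word in term.lower().split():
        if word.endswith("#"):
            stem = word[:-1]
            ids = set()
            for key, recs in index.items():
                if key.startswith(stem):
                    ids |= recs
        else:
            ids = set(index.get(word, set()))
        result = ids if result is None else result & ids
    return result or set()


STF_INDEX = build_word_index(RECORDS)

# Boolean operators map onto set operations.
both = find(STF_INDEX, "indian#") & find(STF_INDEX, "affairs")         # AND
either = find(STF_INDEX, "indian#") | find(STF_INDEX, "civil war")     # OR
without = find(STF_INDEX, "indian#") - find(STF_INDEX, "affairs")      # NOT
print(sorted(both), sorted(either), sorted(without))
```

Treating each search result as a set is what makes combined
searches, such as letters by Italian poets during the 1920s,
straightforward to express.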
Except for the YEAR index, which can only retrieve indexing
records, and the STF index, which only retrieves collection
records, the FIND indexes make no distinction between the two
types of records in LITMSS: both may be displayed in a search
result, depending on the extent of Princeton's holdings. The
Department may have a JOHN DOE collection, several JOHN DOE
letters distributed among a few collections, or both. A search
for JOHN DOE material would find all of these records.
For example, a search for Aaron Burr material would produce the
following screen on the user's terminal or PC.
-----------------------------------------------------------------
Figure 1. Search for Aaron Burr
-----------------------------------------------------------------
Manuscripts / Search: Find AUTHOR AARON BURR
Result: 42 records
1) Burr, Aaron, 1716-1757 / [Collection *], Aaron Burr (1716-
1757) Collection / Consists of Burr manuscripts,
correspondence, and documents dating from the period
(1748-1757) he was president of the College of New Jersey,
now Princeton. Included are original manuscripts of sermons,
a Latin oration, and letters and documents, as.../ Date(s):
1750-1761 / Size: 1 box
2) Burr, Aaron, 1716-1757 / [Collection *], General Manuscripts
[Bound] / 1 volume(s), 1753-1758
3) Burr, Aaron, 1716-1757 / [Collection *], General Manuscripts
[Misc.] / 1 document(s), 1755
...
-----------------------------------------------------------------
By default, FOLIO displays retrieved records in a brief display
and numbers them on the left side for reference (see Section 6.0
for information about the display format and what it reveals).
In the above example, the first record is a collection record
(i.e., an Aaron Burr collection) and the other two are indexing
records.
+ Page 67 +
If one scanned the rest of the 42 records retrieved in this
search, one would see that Aaron Burr's son, Aaron Burr (1756-
1836), the famous duelist with Alexander Hamilton, is also
represented because both share the same name. To find only the
father's records, one would have to add a date in the search
phrase: "find author aaron burr 1716."
In the AUTHOR index, real names and pseudonyms are indexed
together so that a search under one name will retrieve the same
records as a search under the other. For example, searching
under "Mark Twain" will find the same records as searching under
"Samuel Langhorne Clemens." (How this works is described in
Section 7.0.)
5.2 BROWSE Searches
The BROWSE indexes are phrase indexes that attempt to match the
user's whole search phrase with headings in the database's
records. The system responds with an alphabetical listing of
headings that match the phrase or, failing that, of the headings
that alphabetically precede and follow the user's phrase. In
this way, the user can browse through
headings as if he/she were using the library's card catalog.
There are two BROWSE indexes available in FOLIO for LITMSS: name
and subject.
-----------------------------------------------------------------
Table 2. BROWSE Indexes
-----------------------------------------------------------------
Index     Description                        Sample Search
Name                                         Phrase
NAME      Phrase of author's inverted name   Hemingway, Ernest,
                                             1899-1961
SUBJECT   Added entry for a collection       United States--
                                             Civil War...
-----------------------------------------------------------------
+ Page 68 +
The basic format of a search command using a BROWSE index is
browse [index name] [phrase]
Here are some examples:
browse name twain, mark
bro sub tammany hall
BROWSE and index names can be abbreviated.
Truncation (without using the # sign) is automatic:
bro name james, h
bro sub united states--history--civil
The BROWSE feature of FOLIO is an especially useful one because
the user, as he/she browses, also sees the number of records
associated with each heading.
For example, browsing in the name index for "burr, aaron" would
retrieve the following result.
-----------------------------------------------------------------
Figure 2. Example BROWSE Search
-----------------------------------------------------------------
Manuscripts / Search: Browse NAME BURR, AARON
Result filed under the following headings:
-3) Name: Burnshaw, Stanley, 1906- (5 records)
-2) Name: Burnside, Ambrose E. (Ambrose Everett), 1824-1881
(3 records)
-1) Name: Burpee, Lawrence J. (Lawrence Johnston), 1873-1946
(1 record)
0) Name: Burr, Aaron, 1716-1757 (14 records)
1) Name: Burr, Aaron, 1756-1836 (28 records)
2) Name: Burr, Amelia Josephine, 1878- (1 record)
3) Name: Burr, Anna Robeson, 1873-1941 (4 records)
-----------------------------------------------------------------
+ Page 69 +
For reference, FOLIO numbers the headings on the left, forward
and backward from 0, which identifies the first heading that best
matches the search phrase. One can see from this result that the
two groups of Aaron Burr records equal the number of records
retrieved in the FIND search described above (14 + 28 = 42). In
other words, there are two ways to get the same author
information.
Similarly, there are two ways to retrieve subject information
about Princeton's manuscript collections: the STF FIND index and
the SUBJECT BROWSE index. [4] Browsing subjects, however, is
more successful if one is familiar with Library of Congress
Subject Headings since they are used in the collection records.
FOLIO recognizes the dash ("--") in search phrases, and thus its
presence or absence can make a difference in the results one
obtains. A search for Civil War collections could be phrased
"fin stf civil war" for the STF index, but in the SUBJECT index
one would have to know that the appropriate subject heading for
the Civil War is "United States--History--Civil War, 1861-1865."
Omitting the first dash in the latter phrase would produce very
different results. (In the BROWSE indexes the system always
attempts to find a match character by character, starting from
left to right.)
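The BROWSE behavior described in this section (left-to-right
matching against an alphabetical list, with headings numbered
forward and backward from 0) can be sketched with a binary
search over sorted headings. This is a Python illustration under
stated assumptions, not FOLIO's own code: the function name and
the window of three headings on either side are assumptions; the
headings and record counts are those shown in Figure 2.

```python
from bisect import bisect_left

# Headings and record counts as shown in Figure 2, in alphabetical order.
HEADINGS = [
    ("Burnshaw, Stanley, 1906-", 5),
    ("Burnside, Ambrose E. (Ambrose Everett), 1824-1881", 3),
    ("Burpee, Lawrence J. (Lawrence Johnston), 1873-1946", 1),
    ("Burr, Aaron, 1716-1757", 14),
    ("Burr, Aaron, 1756-1836", 28),
    ("Burr, Amelia Josephine, 1878-", 1),
    ("Burr, Anna Robeson, 1873-1941", 4),
]


def browse(headings, phrase, before=3, after=3):
    """Return (offset, heading, count) rows centered on the best match.

    Offset 0 is the first heading that sorts at or after the phrase,
    compared character by character from the left.
    """
    if not headings:
        return []
    keys = [heading.lower() for heading, _ in headings]  # assumed sorted
    pos = min(bisect_left(keys, phrase.lower()), len(keys) - 1)
    first = max(0, pos - before)
    last = min(len(keys), pos + after + 1)
    return [(i - pos, headings[i][0], headings[i][1])
            for i in range(first, last)]


for offset, heading, count in browse(HEADINGS, "burr, aaron"):
    print(f"{offset:>2}) Name: {heading} ({count} records)")
```

Because the comparison is character by character, a phrase that
omits a dash sorts to a different place in the list, which is why
the presence or absence of "--" changes the result.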
6.0 Displaying Records
Users can see LITMSS records in either brief or full displays.
6.1 Collection Records
For collection records, the brief display consists of the name of
the main entry (if the collection has one), the name of the
collection, the first 250 characters of the record's scope note,
the inclusive dates of the collection, and its size (number of
boxes, containers).
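The brief display just described can be sketched as a simple
formatting step. The Python below is an illustration only; the
function and parameter names are assumptions, and the "..."
marker after a shortened scope note follows what appears in
Figure 3.

```python
def brief_collection_display(main_entry, collection_name, scope_note,
                             dates, size, limit=250):
    """Build a one-line brief display for a collection record.

    The scope note is cut to its first `limit` characters, and the
    main entry is left out when the collection has none.
    """
    if len(scope_note) > limit:
        note = scope_note[:limit] + "..."
    else:
        note = scope_note
    parts = [collection_name, note, f"Date(s): {dates}", f"Size: {size}"]
    if main_entry:
        parts.insert(0, main_entry)
    return " / ".join(parts)


# The scope note text here is a shortened, hypothetical stand-in.
print(brief_collection_display(
    "Burr, Aaron, 1716-1757",
    "[Collection *], Aaron Burr (1716-1757) Collection",
    "Consists of Burr manuscripts, correspondence, and documents.",
    "1750-1761",
    "1 box",
))
```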
+ Page 70 +
The full display for collection records has three parts--Name,
Location, and Description--each of which can be displayed
independently if desired. The Name section provides the main
entry's full name (AACR2 form), a brief biographical phrase about
him/her/it, and any "disciplines," or occupational fields, for
which the main entry is known. The Location section identifies
the administrative unit of the Department that houses the
collection, providing the collection's name, dates, and physical
characteristics. In the Description section, the display
provides the record's complete scope note, arrangement (if the
collection is greater than one box in size), and list of related
subject, title, and form headings.
A brief display of a collection record is shown below.
-----------------------------------------------------------------
Figure 3. Brief Display of a Collection Record
-----------------------------------------------------------------
Burr, Aaron, 1716-1757 / [Collection *], Aaron Burr (1716-1757)
Collection / Consists of Burr manuscripts, correspondence,
and documents dating from the period (1748-1757) he was
president of the College of New Jersey, now Princeton.
Included are original manuscripts of sermons, a Latin
oration, and letters and documents, as... / Date(s):
1750-1761 / Size: 1 box
-----------------------------------------------------------------
+ Page 71 +
A full display of a collection record is shown below.
-----------------------------------------------------------------
Figure 4. Full Display of a Collection Record
-----------------------------------------------------------------
Name         Burr, Aaron, 1716-1757
             American Presbyterian clergyman, president of
             the College of New Jersey (Princeton)
             Discipline(s): religion, education
Location     Manuscripts Division
             [Collection *], Aaron Burr (1716-1757)
             Collection
             Date(s): 1750-1761
             Size (in cubic feet): 0.25
             Container count: 1 box
Description
     Consists of Burr manuscripts, correspondence,
     and documents from the period (1748-1757) he was
     president of the College of New Jersey, now
     Princeton. Included are original manuscripts of
     sermons, a Latin oration, and letters and documents,
     as well as photostats and copies of additional
     material. There are also a contemporary silhouette
     of Burr and a letter, dated 1761, presenting a bill
     to his estate.
     Subjects/Titles/Forms of the Manuscripts:
        American orations--Colonial period, ca. 1600-1775
        Burr, Aaron, 1716-1757--Silhouettes
        Clergy--United States--18th century--Letters
        College presidents--New Jersey--Princeton--
           18th century--Letters
        Presbyterian Church in the U.S.A.--Clergy--
           18th century--Letters
        Princeton University--History--Colonial period,
           ca. 1600-1775--Sources
        Sermons, American--18th century
        Silhouettes--United States--18th century
-----------------------------------------------------------------
An asterisk in the collection name signifies that the collection
has been processed and indexed.
+ Page 72 +
6.2 Indexing Records
For indexing records, the brief display consists of the main
entry, the name of the collection, the number of manuscripts
described in the record, the type of manuscript(s) described, and
inclusive dates.
The full display contains the same three parts (Name, Location,
Description) offered for a collection record, but the Location
and Description elements are different.
In an indexing record, the Location section identifies the
specific address of the manuscript(s) being described:
administrative unit, collection name, series name, box number,
and folder title or number. The Description section expands the
information in the brief display by adding a full description
element.
A brief indexing record is shown below.
-----------------------------------------------------------------
Figure 5. Brief Display of an Indexing Record
-----------------------------------------------------------------
Hemingway, Ernest, 1899-1961 / [Papers *], Sylvia Beach Papers /
1 document(s), 1923
-----------------------------------------------------------------
A full indexing record is shown below.
-----------------------------------------------------------------
Figure 6. Full Display of an Indexing Record
-----------------------------------------------------------------
Name         Hemingway, Ernest, 1899-1961
             American novelist, journalist, storywriter
             Discipline(s): literature
Location     Manuscripts Division
             [Papers *], Sylvia Beach Papers
             Box: 171
             Folder: Corres. re Illustrations
Description
             Number of original manuscripts: 1
             Manuscript type: document
             Date(s): 1923
             Description: photograph of Hemingway in Sylvia
                Beach's bookshop (Paris), SHAKESPEARE
                AND COMPANY
-----------------------------------------------------------------
+ Page 73 +
6.3 Other LITMSS Output Features
To display LITMSS records in FOLIO, one uses the reference
numbers on the left side of the search display to specify which
records are wanted.
For example, in the Aaron Burr search described previously that
resulted in 42 records found, one could issue the command
"display full 35" (abbreviated "df 35") to see a full display of
the 35th record, or one could ask to see a range of records
("display full 20-24"). With a large search result, one can use
the SCAN command to move back and forth through the records; for
example, typing the command "scan 30" would cause the system to
start its display over beginning at the 30th record.
FOLIO also permits the user to print search results on a system
printer or any other named printer, to save results in computer
files, and to "mail" results over electronic networks to other
accounts; the records can be in either brief or full form.
7.0 LITMSS Subfiles
Besides the main MANUSCRIPTS subfile [5] in which collection and
indexing records reside, LITMSS consists of several other linked
subfiles, including an AUTHORS subfile and a COLLECTIONS subfile.
While they are invisible to the user of LITMSS in FOLIO, they
contain indexes that are indirectly used in some of the FOLIO
searches. The linkages are provided by code numbers: an author
code number and a collection code number. The use of these
numbers in collection and indexing records makes inputting and
updating of records easy and efficient.
A typical author record in the AUTHORS subfile looks like this.
-----------------------------------------------------------------
Figure 7. Example Author Record
-----------------------------------------------------------------
AUTHOR.CODE    00797
AUTHOR         Twain, Mark, 1835-1910
ALIAS          Clemens, Samuel Langhorne
NATIONALITY    American
IDENTITY       novelist, humorist, storywriter
DISCIPLINE     literature
REFERENCES     OxAm
               AmA&B
               DcLEnL
-----------------------------------------------------------------
+ Page 74 +
The main entry of all collection and indexing records contains
the particular author's five-digit code number, not his/her name
or pseudonym.
For example, when inputting Mark Twain records, the cataloger
only has to specify "00797" in the author element. If, at a
later date, new information becomes available, such as a death
date or the addition of a middle name, only the AUTHORS record
has to be modified--all of the associated collection and indexing
records can remain untouched because they are still linked by the
author code number, which never changes.
When author information is actually provided in FOLIO, the author
record from the AUTHORS subfile is called up by the full display
format. Searches that use the AUTHOR, NAT, ID, and DISC indexes
are actually using AUTHORS subfile indexes to retrieve the
appropriate author codes, which are then searched in the
MANUSCRIPTS subfile to find the associated collection and
indexing records.
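The two-step search described above (author index to author
code, then author code to manuscript records) can be sketched as
follows. This is a Python illustration, not SPIRES code: the
dictionary layout, the function names, and the three manuscript
records are assumptions; the author values come from Figure 7.

```python
import re

# AUTHORS subfile: author code -> author record (values from Figure 7).
AUTHORS = {
    "00797": {
        "author": "Twain, Mark, 1835-1910",
        "alias": "Clemens, Samuel Langhorne",
        "nationality": "American",
    },
}

# MANUSCRIPTS subfile: records carry only the author code, never the name.
# These three records are hypothetical.
MANUSCRIPTS = [
    {"id": 1, "author_code": "00797", "ms": "letter"},
    {"id": 2, "author_code": "00797", "ms": "document"},
    {"id": 3, "author_code": "99999", "ms": "letter"},  # some other author
]


def _words(text):
    """Split a heading into lowercased words for keyword matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_author(term):
    """Look up matching author codes, then fetch records by code."""
    wanted = _words(term)
    codes = set()
    for code, rec in AUTHORS.items():
        # Real name and pseudonym are indexed together.
        if wanted <= _words(rec["author"]) or wanted <= _words(rec.get("alias", "")):
            codes.add(code)
    return [m for m in MANUSCRIPTS if m["author_code"] in codes]


print([m["id"] for m in find_author("mark twain")])
print([m["id"] for m in find_author("samuel langhorne clemens")])
```

Because the manuscript records store only the code, a change to
the author heading touches one AUTHORS record and leaves every
linked record as it was.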
In the same way, collection code numbers used in the COLLECTIONS
subfile simplify cataloging and updating for the Department's
processing staff. And, in users' searches, the subfile becomes a
"lookup table."
For example, the record below is for the Aaron Burr Collection in
the COLLECTIONS subfile.
-----------------------------------------------------------------
Figure 8. Example Collection Record
-----------------------------------------------------------------
COLLECTION.CODE    C0090
COLLECTION.NAME    [Collection *], Aaron Burr (1716-1757)
                   Collection
-----------------------------------------------------------------
A search for all of its records, both the collection record and
associated indexing records, "looks up" the collection name
("Aaron Burr Collection") in the COLLECTIONS subfile to find its
specific collection code (C0090) and then uses that number in the
MANUSCRIPTS subfile.
+ Page 75 +
8.0 Conclusion
LITMSS continues to grow. In the course of a year, approximately
3,000 to 5,000 indexing records are added to the database,
representing an additional 500-600 "authors" that have not been
established in the AUTHORS subfile before.
Ideally, the Department would like to have all of its manuscript
collections indexed and described in LITMSS and to be able to
stay current with new acquisitions. On the collection level,
this last goal has been achieved, for a temporary collection
record is created at the time each new manuscript collection is
acquired. The record is updated after processing, which may or
may not include indexing depending on departmental priorities and
staffing. LITMSS collection records are also input into the AMC
(Archives and Manuscripts Control) file of RLIN, the online
bibliographic database of the Research Libraries Group. Given
the backlog of unprocessed collections, however, which have been
described in the 1989 Guide, the Department will probably always
have to approach the first goal like an asymptote.
While work continues at the campus computer center to ease access
to all of the Princeton FOLIO databases, the Department is trying
to arrange a more equal distribution of responsibility for
inputting and updating LITMSS records--an arrangement whereby
each administrative unit would manage its own records, the
collection and indexing records that describe its manuscripts
holdings. At the moment, all of that responsibility resides in
the Manuscripts Division.
Notes
[1] Some of the collections in the Theatre Collection and Western
Americana units of the Department are not "manuscript"
collections. They are really "special collections" of non-
manuscript material--photographs, posters, and playbills--that
are unified by subject and place. The archival sense of the word
collection, however, pertains to all of the units represented
here: (1) an artificial accumulation of materials devoted to a
specific subject, person, place, event, or type of material; or
(2) a body of materials having a common source, created by a
person or corporate body as a natural function of the activities
he, she, or it pursues.
[2] Generally, only a few series of a manuscripts collection are
targeted for indexing, usually the correspondence series or
author series likely to contain the manuscripts of "others"
(i.e., other than the main entry). Even then, only those people
are indexed for whom there are good, authoritative biographical
reference sources. Given the time-intensive nature of authority
work, this indexing remains selective, not exhaustive.
+ Page 76 +
[3] There are so many English and American authors indexed in
LITMSS that searches using these nationalities without Boolean
qualification are not fruitful.
[4] Both indexes only retrieve collection records. The subjects
of manuscripts described in indexing records are not analyzed
because of the obvious amount of work that would be required of
processing staff. In effect, every indexed letter would have to
be read and interpreted according to LCSH subject headings.
[5] "Subfile" is a SPIRES term for a set of goal records, the
indexes to those goal records, and the access and update
restrictions that apply to the data elements of those records.
In essence, a subfile is a database.
About the Author
John Delaney
Leader, Rare Books and Manuscripts Cataloging Team
Department of Rare Books and Special Collections
Princeton University
One Washington Road
Princeton, NJ 08544
BITNET: Q3784@PUCC
-----------------------------------------------------------------
The Public-Access Computer Systems Review is an electronic
journal. It is sent free of charge to participants of the
Public-Access Computer Systems Forum (PACS-L), a computer
conference on BITNET. To join PACS-L, send an electronic mail
message to LISTSERV@UHUPVM1 that says: SUBSCRIBE PACS-L First
Name Last Name.
This article is Copyright (C) 1990 by John Delaney. All
Rights Reserved.
The Public-Access Computer Systems Review is Copyright (C) 1990
by the University Libraries, University of Houston. All Rights
Reserved.
Copying is permitted for noncommercial use by computer
conferences, individual scholars, and libraries. Libraries are
authorized to add the journal to their collection, in electronic
or printed form, at no charge. This message must appear on all
copied material. All commercial use requires permission.
-----------------------------------------------------------------
+ Page 100 +
-----------------------------------------------------------------
The Public-Access Computer Systems Review 1, No. 3 (1990):
100-108.
-----------------------------------------------------------------
-----------------------------------------------------------------
Recursive Reviews
-----------------------------------------------------------------
Hypermedia, Interactive Multimedia, and Virtual Realities
by Martin Halbert
When I fly to a conference I always find the moment of takeoff
exciting. After the tedium of airport lines and the slow process
of boarding, the engines rev up and you are suddenly thrust back
into your seat as the whole aircraft seems to strain trying to
vault into the sky. Then, in another moment, the perspective
dramatically changes as the ground is left below and the vast tangle
of roads and locations becomes abruptly apparent, as if one were
looking down at a map. You know in your bones then that you are
going somewhere, not just wasting time in Kafkaesque delays.
Today, technologies like hypermedia and interactive multimedia
are like a plane ready to take off, gathering momentum for a jump
that promises to take us to a new information environment.
Reading about these new computer tools one feels that we are
heading to an exciting but unknown destination.
No one knows what the landscape of information technology in the
21st century will look like, but there are many sources that will
sketch the most prominent features. This column will direct the
reader to the best "guidebooks" to new interactive computer
technologies like hypermedia and virtual reality simulations. In
the spirit of Recursive Reviews, I won't try to limit the
discussion artificially to "just" hypermedia, or "just"
interactive multimedia. Instead, the aim will be to point out:
(1) practical sources that orient the reader to the newest
computer media technologies, and (2) new journals that discuss
the possibilities of the media.
+ Page 101 +
It may be objected that buzzwords like "hypermedia" and
"interactive multimedia" are not much more than hype currently.
The terms have been bandied about so much in the last few years
that it would be easy to conclude that they are nothing but empty
phrases that the industry has been using for impressive ad
campaigns. I don't agree with this. I think these concepts
represent a host of human-computer interaction ideas that the
most innovative thinkers have been developing for years, and
which are only now beginning to enter the mainstream. These
concepts are being embodied in the best new computer
applications, which will have dramatic impact on the work of all
information professionals in the 1990s.
Now is the time to become familiar with the issues surrounding
these technologies. But aside from the practical impacts on our
jobs, following the development of new computer technologies is
refreshing, and re-inspires us in our work. Innovations in
computer media are exciting news for libraries, which have only
dealt with one information medium for millennia.
As We May Think
There have been many seminal works that touched on the idea of
automated handling of large bodies of different kinds of media.
No discussion of the area would be complete without at least
mentioning Vannevar Bush's article "As We May Think" (Atlantic
Monthly, July 1945) which discussed a device called the MEMEX
that could retrieve and manipulate large quantities of microfilm,
audio recordings, and other media that would be of use to a
researcher. Bush had all the right ideas (e.g., multimedia and
automated links between pieces of information), but his article
is outdated because of the obsolete technological framework that
he uses to discuss his ideas.
+ Page 102 +
Computer Lib/Dream Machines
The first fully developed exposition of the idea of computer
manipulated media was the seminal book that introduced the term
"hypermedia," Computer Lib/Dream Machines by Theodore Nelson
(first published privately in 1974, later reprinted with
extensive updates in 1987 by Microsoft Press). If you read only
one new book this year, read Computer Lib. It is the most
insightful (and inciting!) book on computers that I know of.
Computer Lib was one of the most influential early works that
promoted the idea of personal computers. It had several themes:
(1) everybody should understand computers; (2) computer systems
are difficult to use only because they are designed poorly; and
(3) computers can be wonderfully empowering and enjoyable tools
when designed well.
The book is written in an engagingly chatty tone (the book was
consciously modeled after Stewart Brand's Whole Earth Catalog and
resembles it in many ways), and is full of tongue-in-cheek
pronouncements like "Computers are just as oppressive [in the
1980s] as before, but smaller and cheaper and more widespread.
Now you can be oppressed by computers in your living room."
Despite (or perhaps because of) all the humor in it, Computer Lib
is an illuminating survey of the major issues of making computers
usable. The flip side of the book (literally flip side, the book
is printed back to back with its sister title), Dream Machines,
canvasses the most important ongoing developments in graphical
computer systems. If you want an entertaining, opinionated,
informative book on the fundamental issues of user interfaces,
read Nelson's book.
Hypertext Hands-On!
For a more sedate and neutral treatment of hypertext issues, turn
to Ben Shneiderman. Shneiderman is currently the most prominent
researcher in the field of human/computer interaction. His book
Hypertext Hands-On! is an excellent introduction to the topic
that lives up to its title by including a hypertext version of
the text on floppy disks.
+ Page 103 +
The book and hypertext are written in a very clear and concise
style. Hyperlinks in both the electronic and print versions are
easy to follow and logically arranged (unlike many hypertexts
I've run across which are tangled and confusing). The Hyperties
software runs fine on any PC-compatible, but, if you have a
Hercules monochrome monitor, it's difficult to spot most of the
text-embedded hyperlinks. Because of this drawback, I preferred
using the print version of the work (sigh).
Shneiderman covers both theory and implementations of hypertext
systems. In his chapter on "Systems" he gives neutral
descriptions of all major hypermedia products that are currently
on the market. Also included in the work are examples of
possible hypertext applications and a review of major
personalities in the history of hypertext. Hypertext Hands-On!
could easily be used as a textbook introducing the subject of
hypermedia, and it is worth reading by anyone interested in the
field.
BYTE
For those interested in the nitty-gritty of current computer
systems and what they can offer, there is no better source than
the many trade journals and tabloids of the computer industry. I
offer up BYTE as a good one-stop source for following personal
computer technologies. It is not particularly biased toward one
brand of computer, and is a monthly, so you will not be deluged
by the amount of reading entailed in following weekly tabloids.
The February 1990 issue had a particularly good in-depth section
that analyzed what interactive multimedia means to different
computer firms, what the pros and cons were of each company's
system, and what new technical issues were raised by interactive
media. My favorite article in the issue was "The Birth of the
BLOB" by Tim Shetler, which discussed data storage implications
of BLOBs (Binary Large OBjects, the nodes of multimedia
databases). If you want to know why DVI is important to IBM, or
why the Agnus blitter makes the Amiga display so good, read this
issue of BYTE, and future ones too.
+ Page 104 +
CD-ROM Professional
A magazine that falls somewhere between the trade magazine and
the academic journal is CD-ROM Professional. Subtitled "The
Magazine for CD-ROM Publishers and Users," it is aimed at
information professionals like librarians who want practical
advice articles. It has many product reviews, how-to columns,
and technology feature articles in each issue. Oriented
specifically to optical storage topics, it is one of the best
sources for following interactive multimedia products, since most
of these products currently come out on CD-ROM.
The September 1990 issue is a good example of this journal. It
had an interview with Sony's chief multimedia spokesman, Takashi
Sugiyama, about where Sony is headed with the technology. The
same issue had articles on problems encountered in CD-ROM
technical support and how to back up CD-ROM workstations.
ACM Journals
The Association for Computing Machinery generates a plethora of
journals on all aspects of computer technology. Three ACM
journals that are worth following regularly are the
Communications of the ACM, Computer Graphics, and SIGIR Forum.
Communications of the ACM features a special issue on interactive
technologies like multimedia and hypertext roughly once a year.
The July 1989 issue was devoted to interactive technologies and
had several good articles on digital video.
Computer Graphics (put out by ACM SIGGRAPH) is traditionally the
place where the hottest, glitziest new research projects in
computer graphics technology appear in living color. The March
1990 issue constituted the proceedings of the 1990 Symposium on
Interactive 3D Graphics, and showed amazing new levels of
sophistication. The issue is packed with project reports of the
newest technological buzzword, "virtual realities." Also called
microworlds, these are computer simulated environments. They may
be close simulations of physical reality (useful for simulating
physical systems), or they may be dazzlingly abstract
environments like the higher-dimensional "hyperworlds" viewable
with Columbia University's n-Vision system.
+ Page 105 +
The SIGIR Forum, a publication of ACM's SIG on Information
Retrieval, is an excellent journal for the information scientist
in all of us. The Fall 87/Winter 88 issue had an article by
Robin Hanson called "Toward Hypertext Publishing: Issues and
Choices in Database Design" that is the best piece on the
theoretical and practical concepts of hypertext systems that I
have seen yet. The best feature of Hanson's article is the
concise discussion of the various ways that one might run the fee
structure on a commercial hypertext network.
There are many other ACM publications that could be mentioned,
but these three are particularly valuable sources.
New interactive computer technologies are often dramatically
different from the standard office software that we are
accustomed to. I find it useful to follow journals that analyze
the possible uses of new computer media. Two new journals,
Hypermedia and Multimedia Review, feature scholarly discussions
of next generation information technology.
Hypermedia
Hypermedia regularly reviews an eclectic variety of conferences
and books related to hypermedia topics. Interestingly enough,
its first issue had a review of William Gibson's seminal science
fiction book Neuromancer in addition to more standard fare. In
my opinion, this was entirely appropriate, considering the fact
that many of Gibson's colorful SF concepts have been embraced
wholeheartedly by software designers.
My favorite Hypermedia article appeared in the Volume 1, Number 3
issue. It was a piece entitled "A Similarity-Based Hypertext
Browser for Reading the Unix Network News," by Michael H.
Andersen, Jakob Nielsen, and Henrik Rasmussen. The article
described a prototype user interface called HyperNews that
organizes incoming network news postings with hyperlinks
following discussion streams and an automatic
similarity/relevance rating feature (somewhat like fuzzy logic
information retrieval systems). Although the system described
was a prototype created solely for concept study, the need for
systems like this to follow the colossal amount of electronic
mail and forum postings is obvious (I often wish I had a working
system like the HyperNews prototype to handle all the PACS-L
messages I get every day).
+ Page 106 +
Multimedia Review
Multimedia Review is a fascinating journal that pledges "to
acquire the kind of articles that give inspiration for reflection
--for metacognitive understanding." Don't let the fancy language
scare you off; this is a great journal for promoting deeper
understanding of the possibilities of multimedia. The articles
often have catchy titles (my favorite title in the Summer 1990
issue was "Elements of a Cyberspace Playhouse" by Randal Walser),
and are written by industry and academic experts in the field of
multimedia systems.
If the decade of the nineteen eighties was the era when the
"personal computer" revolution came about, then the nineties may
be the decade of the "personal simulator" revolution, and
Multimedia Review may be its harbinger. Articles like Scott S.
Fisher's "Virtual Environments: Personal Simulations &
Telepresence" (also in the Summer 1990 issue) discuss current
state-of-the-art systems in the historical context of what the
designers are aiming for in the long run. As fact follows fancy,
we may all one day find ourselves working in virtual workspaces
like those that William Gibson imagined in fiction and that
Autodesk has now implemented in actuality.
Bringing It All Home
A final anecdote may bring multimedia closer to home for you, as
it did for me. As I was preparing to leave work today (eager to
get home and finally finish this overdue column!) I took a break
to try out a new computer that had appeared in the evaluation
center of our campus computing center.
It was a Silicon Graphics workstation, and as I logged on to the
machine and explored some of its demo software packages I was
staggered by the real-time animation capabilities of the machine.
In twenty minutes, I had run through a fractal display system, an
amazingly realistic flight simulator (it makes the latest version
of Microsoft's Flight Simulator look sick), a hilariously real-
looking interactive simulation of a Jello icosahedron bouncing
around a room, a design tool for studying wave oscillation
phenomena in surfaces, and a dazzling graphical visualization of
a mechanical insect that obediently crawled after my cursor
wherever I led it.
+ Page 107 +
The image animation windows in all these applications were razor
sharp, the kind of crispness that one sees in computer generated
movie sequences like The Last Starfighter and Tron. The insect
automaton moved realistically and cast a shadow. The illusion of
depth and reality was dramatic.
My point is that within this decade simulation technology like
this will be on all our desktops! Interactive multimedia and
hypermedia are technologies of the near future, and we librarians
had better become accustomed to them and think about them before
we are caught off guard. Besides, they are fun. I know I want
another crack at that F-15 flight simulator. Perhaps next time
I'll remember to bring up my landing gear so it doesn't get torn
off at Mach 2.
Books Reviewed:
Nelson, Theodor H. Computer Lib/Dream Machines (Rev. Ed.).
Redmond, Washington: Microsoft Press, 1987.
(ISBN 0-914845-49-7)
Shneiderman, Ben, and Greg Kearsley. Hypertext Hands-On!: An
Introduction to a New Way of Organizing and Accessing
Information. New York: Addison-Wesley, 1989.
(ISBN 0-201-15171-5)
Journals Reviewed:
BYTE 15, No. 2 (February 1990).
(ISSN 0360-5280)
CD-ROM Professional 3, No. 5 (September 1990).
(ISSN 1049-0833)
Communications of the ACM 32, No. 7 (July 1989).
(ISSN 0001-0782)
Computer Graphics 24, No. 2 (March 1990).
(ISSN 0097-8930)
Hypermedia 1, No. 3 (1989).
(ISSN 0955-8543)
Multimedia Review 1, No. 2 (Summer 1990).
(ISSN 1046-3550)
SIGIR Forum 22, No. 1-2 (Fall 1987/Winter 1988).
(ISSN 0163-5840)
+ Page 108 +
About the Author
Martin Halbert
Automation and Reference Librarian
Fondren Library
Rice University
Houston, TX 77251-1892
(713) 527-8181, ext. 2577
BITNET: HALBERT@RICEVM1.RICE.EDU
-----------------------------------------------------------------
The Public-Access Computer Systems Review is an electronic
journal. It is sent free of charge to participants of the
Public-Access Computer Systems Forum (PACS-L), a computer
conference on BITNET. To join PACS-L, send an electronic mail
message to LISTSERV@UHUPVM1 that says: SUBSCRIBE PACS-L First
Name Last Name.
This article is Copyright (C) 1990 by Martin Halbert. All
Rights Reserved.
The Public-Access Computer Systems Review is Copyright (C) 1990
by the University Libraries, University of Houston. All Rights
Reserved.
Copying is permitted for noncommercial use by computer
conferences, individual scholars, and libraries. Libraries are
authorized to add the journal to their collection, in electronic
or printed form, at no charge. This message must appear on all
copied material. All commercial use requires permission.
----------------------------------------------------------------
+ Page 30 +
-----------------------------------------------------------------
Loney, George. "The University of Guelph Library's SearchMe
Public-Access Catalogue." The Public-Access Computer Systems
Review 1, No. 3 (1990): 30-43.
----------------------------------------------------------------
1.0 Introduction
The University of Guelph is a medium-sized university located in
southwestern Ontario about 100 kilometers from Toronto. The
library has been automating its various systems since the mid-
1960s, starting with electronic data collection devices for a
batch-oriented circulation system. The systems that followed
included a batch cataloguing system called Scope, the CODOC
system, and the Geac online circulation system (co-developed with
Geac). The Geac circulation system was expanded to include
online public access, acquisitions, and cataloguing, all running
on the Geac mini-computers.
In 1987, the University of Guelph Library began a pilot project
to determine the viability of individual CD-ROM workstations as a
replacement for its centralized online catalogue. This storage
medium for the nearly 900,000 record bibliographic database was
chosen because it offered an extremely cost-effective method of
distributing the 500-megabyte database to what is projected to be
a network of over 100 workstations.
The original version of the search software and database was the
product of a commercial vendor. The pilot project determined
that while CD-ROM was an acceptable medium for storing and
retrieving the data, the software used during the pilot project
was not desirable for the long term, and the inability to change
the database would require frequent and costly remasterings.
As a result, a database design was developed and tested that
would allow the library to write its own search software, prepare
its own database, deal directly with the CD-ROM manufacturers at
a greatly reduced cost, and add changes to the CD-ROM data. This
software project was started in May 1988, and the new system was
installed in October 1988 on 25 workstations throughout the
library. Since then, the system has completely replaced the old,
centralized online public access system and is running on 85
workstations in the two library branches and on a few additional
workstations in some academic departments.
+ Page 31 +
This article will examine some of the issues surrounding the
development of the SearchMe software relating to the user
interface and implications of the use of the CD-ROM as the major
storage medium.
2.0 User Survey
Prior to the development of SearchMe, a survey of library users
was conducted by the systems staff with the help of the reader
service staff. Patrons were approached while they were using one
of the publicly available search tools: the card catalogue (we
still had one at the time), the online public access system, or
the circulation inquiry system. Questions were asked to
determine what information patrons had when they started a
search, what it was they were looking for, and how well or poorly
the current search tools satisfied them.
A number of conclusions were inescapable:
1. Patrons learn how to use the library systems through different
means, but self-teaching is the most usual method.
2. While patrons are concerned if system response time is slow,
they become very frustrated when response time is inconsistent,
e.g., when use is heavy.
3. Patrons migrate very easily from the card catalogue to
computer-supported search tools. The only difficulty with the
search tools is that many terminals or workstations are needed to
prevent line-ups.
4. Patrons using the automated search tools perceived that they
had found most or all the available information. We were never
able to establish how they knew that they had found "all" the
information, but it was indicative of their perception that they
were being adequately helped.
Drawing on these findings and other sources, we set design goals
for the new system. It would attempt to:
1. Provide highly consistent response times no matter how high
the user load or how many terminals were in the overall system.
2. Provide high functionality first and high speed second.
3. Be very consistent in its user interface and as intuitive as
possible in its control functions.
+ Page 32 +
4. Provide context-sensitive help at all stages of system use.
5. Allow the novice user to become familiar with the search
system with minimum formal instruction and permit the more
experienced user to perform more complex searches.
6. Be very accurate in its information delivery and highly
tolerant of user input error.
We believe that SearchMe is very successful at meeting these
goals.
3.0 Consistent Response Time
SearchMe operates in a functionally distributed environment.
Each workstation at the University of Guelph Library consists of
a PC/XT clone with a 10 or 12.5 MHz 8088 chip, 640 KB of memory,
an Ethernet card, one floppy drive, a 40 MB hard drive, an
internal CD-ROM player, a monochrome monitor, and a rugged
keyboard. There is a custom, lockable, front panel that covers
the hard disk and CD-ROM player openings as well as blocks off
the reset and turbo switches.
The minimum hardware requirement for SearchMe is an XT with 640
KB of memory and a single floppy drive. The software will take
advantage of colour monitors if they are present, and it will
alter certain display characteristics for colour monitors.
To reduce dependence on server or other response-time bottlenecks
in the LAN, we make little use of the local area network.
Changes to the catalogue database are transported automatically
to the workstations during the night via the LAN. The
workstation detects and transports software changes on start up,
and requests for circulation information about patron or
bibliographic records are handled by the LAN. If the LAN or the
server is inoperative, the software recognizes this condition,
and the affected functions are simply declared unavailable.
+ Page 33 +
The key to consistent response times is the fact that each
workstation contains the entire library catalogue database and
its indexes on one resident CD-ROM. There is a limit of about
600-650 MB of data that can be put on a CD-ROM. We have our
entire collection of about 900,000 bibliographic records on one
CD-ROM disc, and we believe we can expand our database to about
1.2 million records without adding a second CD-ROM. If we ever
outgrew a single disc (a rather remote possibility given current
acquisitions budgets), we would have several options: the text
data could be compressed to reduce the amount of space required,
machines could be twinned to share CD-ROM players, or machines
could be clustered around a data server.
Another advantage of a self-contained system is that functions
that previously required large (and expensive) centralized
processors are now possible with a microcomputer-based CD-ROM
system, since the computing resource is not shared with anyone
else. Boolean searches on large collections
of data can be provided with no penalty to the rest of the
system.
4.0 Functionality
As many functions as possible were considered in the design of
SearchMe. Functions were rejected only if they were too
complicated or were useful to only a very small group of users.
As a result, the types of searches available on the system are:
(1) full title search, (2) full author search, (3) full call
number search, and (4) subject search. Subject search allows
patrons to access data using: (1) titles; (2) corporate and
personal authors; (3) call numbers; (4) Library of Congress
Subject Headings; (5) material type names in the detailed
holdings statements; (6) location names from the detailed
holdings statements; (7) collection names; (8) any word from
either the title, author, or subject heading fields; and (9) any
word from most places in the record. These access points can be
combined using the Boolean operators "AND," "OR," or "NOT."
The full title, author, and call number searches allow a simple,
single phrase search that our survey showed most people use to
find much of the material they want.
A further feature allows users to shelf browse forward and
backward from any record they find. This capability closely
corresponds to browsing the actual shelf because the database is
organized in shelf sequence.
+ Page 34 +
Users may also display search results on the screen, request a
printout of the results, or save them as an ASCII file on a
floppy diskette. Users may customize the output as they wish,
and they may print, display, or save any result record.
In addition, the system can link directly to current circulation
status information so that users may request display of their own
current biographical information, including items on loan,
overdue fines, outstanding holds, and available holds. The
system allows patrons to place holds on items and will
automatically transfer them to the Circulation System.
5.0 Consistent User Interface
In keeping with current user interface practice, a highly
consistent interface has been implemented.
The top of the screen is used to display messages about the
current status of the search in progress; the middle of the
screen is used to display index lists, search strategies, search
results, and detailed help; and the bottom of the screen contains
short directions to the user and error messages.
No key is used for two different kinds of function command, and a
special set of coloured key caps has been installed with
customized legends (e.g., find by title, find by author, help,
next record, and previous record). Assigning custom key caps has
freed the screen for anecdotal directions (e.g., "Press one of
the blue keys to start your search") instead of messages that are
concerned solely with keyboard use. The largest key on the
keyboard, coloured bright red, is the help key. When a user
presses this key, a window pops up containing a description of
the current screen. The amount of text that can be displayed in
this window is unrestricted.
+ Page 35 +
6.0 Learning the System
As our survey found, there are many ways that people learn
to use the system. At the beginning of each semester, the
library provides orientation classes that cover all the
facilities available to our patrons. However, many users simply
sit down at a terminal and start to use the system. As a result,
we have made specific provisions for this type of approach. Use
of the system itself is largely intuitive; the commands are
printed right on the key caps. Using these and the screen
prompts, many patrons can start doing simple searches without any
previous instruction. Located at the various workstations are
one-page instruction sheets that explain the purpose of the
function keys and the contents of the index access points. Also
available are scripts that lead the user through a sample search.
7.0 Searching
To perform a simple search, the user presses the Find By Title
key, the Find By Author key, or the Find By Call Number key (see
Figure 1).
-----------------------------------------------------------------
Figure 1. Main Screen
-----------------------------------------------------------------
The University of Guelph Library
Catalogue Access System
Press one of the blue keys to do a simple search. Press the
subject search key to perform combined term searches or to
access indexes other than the title, author, or call number.
-----------------------------------------------------------------
+ Page 36 +
The system then prompts the user to enter the title (see Figure
2), author, or call number and press Enter.
-----------------------------------------------------------------
Figure 2. User Is Prompted to Enter a Title
-----------------------------------------------------------------
The University of Guelph Library
Catalogue Access System
Enter Title:
Type in the text that you wish to find and press the enter key.
The system will search for the closest match to the text that
you have entered.
-----------------------------------------------------------------
When the Enter key is pressed, the system uses the search term
entered to place the user as close as possible to the desired
index entry (see Figure 3). Users can press the cursor control
keys to move around the list of index entries until they have
located the correct title, author, or call number. Then they can
press the
Display Result, Save Result, or Print Result keys to view, dump
to diskette, or print the records.
-----------------------------------------------------------------
Figure 3. List of Titles; Second Line Highlighted
-----------------------------------------------------------------
The University of Guelph Library
Catalogue Access System
Enter Title: the large scale structure of space
+-Index List-------------------------------------------------+
Large-Scale Sharing of Computer Resources 1
Large-Scale Structure of Space-Time 1
Large-Scale Structure of the Universe 3
Large-Scale Structures in the Universe 1
Large-Scale Superimposed Folds in Precambrian Rocks of... 1
Large-Scale Systems Modelling
+------------------------------------------------------------+
Press the up, down, PgUp, PgDn keys to manipulate the index
list. Display a highlighted record with the display result
key. Press help for more information.
-----------------------------------------------------------------
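The positioning step described above can be sketched as a binary
search over a sorted index. This is only an illustration, not the
SearchMe implementation: the function name closest_entry is
hypothetical, the index is a small in-memory list built from the
titles in Figure 3, and both the keys and the search term are
assumed to be already folded to lower case with punctuation
replaced by spaces.

```python
from bisect import bisect_left

# Hypothetical sorted index keys, taken from the titles in Figure 3 and
# assumed to be already lower-cased with punctuation turned into spaces.
INDEX_KEYS = [
    "large scale sharing of computer resources",
    "large scale structure of space time",
    "large scale structure of the universe",
    "large scale structures in the universe",
    "large scale systems modelling",
]


def closest_entry(index_keys, term):
    """Return the position of the first key that sorts at or after term.

    If the term sorts past the end of the index, the last entry is used,
    so the user is always placed on some entry.
    """
    position = bisect_left(index_keys, term)
    return min(position, len(index_keys) - 1)


# An incomplete title still lands on the nearest entry in the list.
highlighted = closest_entry(INDEX_KEYS, "large scale structure of space")
print(INDEX_KEYS[highlighted])
```

With the incomplete title from Figure 3 as input, the sketch lands
on the second entry in the list, which is the line the figure shows
highlighted.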
+ Page 37 +
While displaying records (see Figure 4), users can press the Page
Up and Page Down keys to view multi-screen records. The Next
Record and Previous Record keys are used to display other records
in the result, and the Browse Forward and Browse Backward keys
allow users to shelf browse around a specific result record. The
red Undo/Esc key moves the user back one step at a time, and the
Start Again key cancels everything and returns the screen to the
beginning (see Figure 1). Pressing any of the Find By keys also
stops everything and starts a new search.
-----------------------------------------------------------------
Figure 4. Selected Record
-----------------------------------------------------------------
The University of Guelph Library
Catalogue Access System record 1 of 1
+-Bibliographic Window----------------------------------------+
Call Number QC 173.59.S65 H38
Title The Large Scale Structure of Space-Time
Author Hawking, S. W.
Edition Cambridge (Eng.) University Press, 1973
Contents Bibliography: p.373-380
Series Title Cambridge Monographs on Mathematical Physics
Detailed Holdings:
Cpy Location Mat'l Type Call Number
1 Science Book QC 173.59.S65 H38
+-------------------------------------------------------------+
Press cursor key to see more of the record. Press next or
previous record to look at other records in the set. Press a
browse key to browse forward or backward from this record.
-----------------------------------------------------------------
The Subject Search key initiates complex searches (see Figure 5).
The user is asked to choose an index from a list of All Keywords,
Title Keywords, Author Keywords, Subject Heading Keywords,
Titles, Authors, Subject Headings, Call Numbers, Material Types,
Locations, and Collection Names. The system then prompts the
user to enter the appropriate text and press the Enter key.
After the search is conducted, the user is shown a list of index
entries with the closest entry highlighted. The user selects a
specific entry by moving the highlight around with the Up, Down,
Page Up, and Page Down keys and then presses the Enter key.
+ Page 38 +
-----------------------------------------------------------------
Figure 5. Select Initial or New Access Point
-----------------------------------------------------------------
The University of Guelph Library
Catalogue Access System
+-Select Access Point-----------------------+
All Keywords
Title Keywords
Author Keywords
LCSH Keywords
Material Type
Location
Collection Name
Title
Author
Library of Congress Subject Heading
+------------------------------------------+
First, select an access point by pressing a cursor key to move
through the list and pressing the enter key.
-----------------------------------------------------------------
At this point, the procedure differs from the simple searches. A
"Search Criteria" window opens, and the selected index entry
moves into it. The system also builds a list of current result
records and shows the user how many records are in it. Users can
view, print, or save the results at any time, or they can
continue to refine their results.
To refine their results, patrons can enter another term and
combine it with the previous terms by pressing the AND, OR, or
NOT keys. Or, they can press the Change Index key to select any
access point, enter another search term, and combine it with the
previous results. The system maintains a continuous display of
the search strategy and the result count (see Figure 6). Users
can remove terms from the search by pressing the Undo key or
delete the search by pressing the Start Again key.
+ Page 39 +
-----------------------------------------------------------------
Figure 6. Multiple Access Point Combined Term Search
-----------------------------------------------------------------
The University of Guelph Library current result: 1
Catalogue Access System
Enter Keyword from Author: Hawking
+-Index List-------------------+ +-Search Window-------------+
Hawkin 1 (keyword: space) AND
Hawking 6 (keyword from author:
Hawkings 352 hawking
Hawkins 1
Hawkins-Whitehead 1
Hawkinson 3
Hawkridge 2
Hawks 37
Hawksley 3
Hawksworth 35
+------------------------------+ +---------------------------+
ENTER to add the highlighted entry to your search. Press OR,
AND, or NOT to combine terms. DISPLAY RESULT to see current
result of search. CHANGE INDEX to switch index.
-----------------------------------------------------------------
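One way to picture the combined term search is as a list of steps,
each pairing a Boolean operator with a set of matching record
numbers. The sketch below is an assumption-laden illustration, not
the SearchMe code: the class name SearchStrategy, the record
numbers, and the POSTINGS table are all hypothetical, chosen so
that the search in Figure 6 yields one record.

```python
# Hypothetical postings: each (access point, term) pair maps to the set
# of record numbers that contain it. The numbers are invented.
POSTINGS = {
    ("keyword", "space"): {101, 102, 103, 250},
    ("keyword from author", "hawking"): {103, 400},
}


class SearchStrategy:
    """Keeps the running list of search steps and computes the result."""

    def __init__(self, postings):
        self.postings = postings
        self.steps = []  # each step is (operator, access_point, term)

    def add(self, operator, access_point, term):
        """Add a term; the operator of the first step is ignored."""
        self.steps.append((operator, access_point, term))

    def undo(self):
        """Remove the most recent term, as the Undo key does."""
        if self.steps:
            self.steps.pop()

    def start_again(self):
        """Discard the whole search, as the Start Again key does."""
        self.steps.clear()

    def result(self):
        """Apply the steps left to right using set operations."""
        records = set()
        for number, (operator, access_point, term) in enumerate(self.steps):
            hits = self.postings.get((access_point, term), set())
            if number == 0:
                records = set(hits)
            elif operator == "AND":
                records &= hits
            elif operator == "OR":
                records |= hits
            elif operator == "NOT":
                records -= hits
        return records


search = SearchStrategy(POSTINGS)
search.add(None, "keyword", "space")
search.add("AND", "keyword from author", "hawking")
print(len(search.result()))
```

Recomputing the result from the list of steps keeps Undo simple:
removing the last step and recomputing restores the previous
result count without storing intermediate sets.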
8.0 Current Information
The major problem with systems that use CD-ROM as their data
storage medium has been the inability to update the databases.
As a result, CD-ROMs have tended to be used only in widely
distributed, static database applications. At first glance, it
would seem that a library catalogue is relatively static; after
all, only about five per cent of the database changes in one year.
However, that five per cent represents over 40,000 records for a
medium-sized university collection--a large number of changes by
any measure.
In our case, SearchMe was replacing a true online system in which
changes to the database were immediately available to library
patrons at the public terminals. The new system therefore had to
support regular, timely updates. SearchMe meets this objective
through the design of its index system and its hardware
configuration.
+ Page 40 +
Each workstation is connected to an Ethernet LAN. Periodically,
when the workstation is otherwise inactive, it checks the central
data server to see if there are any changes to the database. If
so, the changes are copied into the workstation's hard disk.
These changes are logically merged into the original CD-ROM
resident database so that the library patron never actually knows
whether the information is being delivered from the database
changes or the original database.
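The logical merge can be pictured as an overlay: a lookup consults
the changes on the hard disk first and falls back to the CD-ROM
only when no change exists. The sketch below is a minimal
illustration under stated assumptions. The function fetch_record,
the dictionary layout, the record numbers, and the DELETED marker
for withdrawn records are all hypothetical; the article does not
describe how SearchMe stores its changes.

```python
# Hypothetical marker for a record withdrawn since the disc was pressed.
DELETED = object()


def fetch_record(record_id, hard_disk_changes, cdrom_records):
    """Return the current version of a record, or None if there is none.

    Changes held on the hard disk take priority over the read-only
    CD-ROM copy, so the caller never needs to know which one answered.
    """
    if record_id in hard_disk_changes:
        change = hard_disk_changes[record_id]
        return None if change is DELETED else change
    return cdrom_records.get(record_id)


# Invented sample data: record 2 was corrected, record 3 was withdrawn,
# and record 4 was added after the CD-ROM was made.
cdrom = {1: "Title A", 2: "Title B (old)", 3: "Title C"}
changes = {2: "Title B (corrected)", 3: DELETED, 4: "Title D"}

for record_id in (1, 2, 3, 4):
    print(record_id, fetch_record(record_id, changes, cdrom))
```

Because the CD-ROM is never modified, pressing a new disc amounts
to folding the accumulated changes into the master database and
starting again with an empty change set.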
9.0 Error Tolerance
There are two aspects to the system's ability to be tolerant of
user errors: (1) how does it deal with incorrect control function
commands, and (2) how does it react when search text is
misspelled?
In the first case, the system generates error messages that
attempt to inform users that they have made an error and why.
In the second case, the data retrieval software converts upper-
case characters to lower case in both the entered text and the
indexed text. Any punctuation (except for the call number) is
changed to a space, and multiple occurrences of spaces are
compressed to one space. In the title index, certain leading
words (i.e., "the," "le," "la," and "les") are dropped unless
they are the only word that was entered. Quite often, misspelled
words will still result in the correct index entry display since
the index mechanism attempts to find the entry that is "close to"
the search term.
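The text folding rules above can be sketched in a few lines. This
is an illustration of the rules as stated, not the SearchMe code
(which is written in C): the function name normalize and its index
argument are hypothetical, and "punctuation" is taken to mean the
ASCII punctuation characters.

```python
import string

# Leading words dropped from title searches, as listed in the text.
LEADING_WORDS = {"the", "le", "la", "les"}

# Translation table that turns every ASCII punctuation mark into a space.
PUNCTUATION_TO_SPACE = str.maketrans({mark: " " for mark in string.punctuation})


def normalize(text, index="title"):
    """Fold search text or index text into a comparable form.

    index is a hypothetical label naming the index being searched.
    """
    text = text.lower()
    if index != "call number":
        # Call numbers keep their punctuation; everything else loses it.
        text = text.translate(PUNCTUATION_TO_SPACE)
    # split() with no argument also compresses runs of spaces.
    words = text.split()
    if index == "title" and len(words) > 1 and words[0] in LEADING_WORDS:
        # Drop a leading article unless it is the only word entered.
        words = words[1:]
    return " ".join(words)


print(normalize("The Large-Scale  Structure of Space-Time"))
print(normalize("QC 173.59.S65 H38", index="call number"))
```

Applying the same function to both the entered text and the indexed
text is what lets differences in case, punctuation, and spacing
drop out of the comparison.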
10.0 Technical Details
Workstation software is written in the C language. We currently
use Borland's Turbo C version 2.0. The screen management, text
manipulation, and indexes were all written by our staff. The
database generation software is also written in C and runs in the
Unix environment. It, too, was written entirely by our staff.
We currently send the prepared data to Discovery Systems in
Columbus, Ohio, to have the CD-ROM discs made.
+ Page 41 +
The indexing scheme, also designed by our staff, is very
efficient in its use of space and provides excellent response
times. Our current bibliographic database uses 331 MB of space.
The total space used by all the indexes is 211 MB. Data indexed
includes (1) 1,424,000 titles; (2) 602,500 authors; (3) 286,000
subject headings; (4) 829,000 call numbers; (5) 651,000 keywords;
(6) 71,000 ISBNs and ISSNs; (7) 206,000 L.C. Card Numbers; (8)
location names; (9) material types; and (10) collection names.
We do not use character compression, although if space became a
problem, we could.
The CD-ROM disc is a very good device for serial access. It
transfers data at much the same rate as a good hard disk;
however, it is a slow random access device. For instance, the
average seek time of a hard disk is about 30 ms whereas the CD-
ROM needs between 270 and 340 ms. For this reason, the indexing
scheme is optimized for the peculiarities of the CD-ROM medium
(fewer than two disc seeks are required to go from search term
entry to the closest occurrence of the term). Another two seeks
are required to access the complete bibliographic record and
display it on the screen. The system also attempts to predict
user behaviour and pre-read data in order to speed the process
even more. When run with a hard disk for storage, the software
works very well and has extremely good response time.
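A rough calculation shows why the number of seeks matters. The
sketch below uses only the figures quoted above, treats "fewer than
two" seeks as an upper bound of two, and counts seek time alone
(no data transfer, no benefit from pre-reading), so it is an
estimate rather than a measurement.

```python
CDROM_SEEK_MS = (270, 340)   # range quoted in the article
HARD_DISK_SEEK_MS = 30       # figure quoted in the article

SEEKS_TO_INDEX_ENTRY = 2     # upper bound; article says "fewer than two"
SEEKS_TO_FULL_RECORD = 2

total_seeks = SEEKS_TO_INDEX_ENTRY + SEEKS_TO_FULL_RECORD

cdrom_best_ms = total_seeks * CDROM_SEEK_MS[0]
cdrom_worst_ms = total_seeks * CDROM_SEEK_MS[1]
hard_disk_ms = total_seeks * HARD_DISK_SEEK_MS

print(f"CD-ROM:    {cdrom_best_ms}-{cdrom_worst_ms} ms of seeking")
print(f"Hard disk: {hard_disk_ms} ms of seeking")
```

On these assumptions a complete search and display costs at most
about 1.1 to 1.4 seconds of seeking on CD-ROM, against roughly
0.12 seconds on a hard disk.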
Because changes can be merged with the CD-ROM data on each
workstation, we normally create a new CD-ROM version only every
eight months or so. This
process costs us about $2,000 US for 300 copies of the disc.
Every possible part of the SearchMe software was put into
parameters. The parameters are pre-loaded and optimized so that
SearchMe does not have to interpret the data. The parameters are
loaded by a programme that runs under MS-DOS and checks that they
are accurate and viable. Some of the features that are
controlled by parameter are:
o Size and location of the data display windows, plus the kind
of outline and title of the window (if any).
o Colour of the window outline, background, and text, and the
colour and other attributes (e.g., flash and reverse video) of
highlighted text. The programme alters these values if the
system is using a monochrome monitor.
o Prompts, error messages, field names, and help messages.
o Format of the bibliographic record display.
+ Page 42 +
o Whether or not commands will be entered using the special
keyboard or pull-down menus.
o The content of the pull-down menus.
SearchMe supports multiple databases. The databases can be
stored on CD-ROM, internal hard disk, or centralized (or
distributed) data server. If there are multiple databases, users
are given a menu of available databases and asked which one they
wish to access. Each database uses its own parameter file so it
is possible to configure each one quite differently from the
others. Using multiple parameter files (which contain the
prompts and other instructional text), it is possible to support
multilingual applications by creating a parameter file for each
language, where they all reference the same bibliographic data.
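The multilingual arrangement can be illustrated with a small
sketch. The parameter names, prompt texts, and the open_database
function are hypothetical; the article does not describe the
actual layout of a SearchMe parameter file.

```python
# One shared set of bibliographic data (illustrative content only).
BIBLIOGRAPHIC_DATA = {1: "Gone with the Wind"}

# One parameter file per language; each holds prompts and messages.
PARAMETER_FILES = {
    "en": {"prompt_search": "Enter search term:",
           "error_no_match": "No entries found."},
    "fr": {"prompt_search": "Entrez le terme de recherche :",
           "error_no_match": "Aucune entrée trouvée."},
}


def open_database(language):
    """Pair a language-specific parameter file with the shared data."""
    return {"params": PARAMETER_FILES[language],
            "data": BIBLIOGRAPHIC_DATA}


english = open_database("en")
french = open_database("fr")
```

Both configurations point at the same data object, so adding a
language means adding a parameter file, not another copy of the
catalogue.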
11.0 The Future
SearchMe is only the first phase of the complete rewrite of our
online library system.
In December 1989, the cataloguing system was installed using the
same type of architecture--distributed microcomputers accessing
the main catalogue from a centralized server. With this
approach, a highly sophisticated set of tools is available to the
cataloguer, such as full-screen editing, interactive error
detection, online coding manual, and online syntax checking. As
with the SearchMe catalogue access, the system is highly reliable
because it is not necessary for the central server to be
available for work to continue.
We are about to add a binding module to the system. The basis of
our authority control system is already included in the system,
and this will be fully implemented later this year. Work is just
starting on the development of our new circulation system after
which we will add acquisitions and serials control.
We have also experimented with a low-cost optical scanner that we
will use to scan and translate contents pages of incoming
journals. From this, a SearchMe database of our journals,
indexed by title, author, and keyword, will be maintained.
+ Page 43 +
12.0 Summary
The advent of high-capacity, inexpensive, personal storage
devices such as CD-ROM has made the development of practical,
large database workstations possible. The movement away from a
centralized super-mini or mainframe computer to functionally
distributed microprocessor workstations has allowed the
University of Guelph Library to provide a highly functional,
cost-effective, flexible catalogue access system. Ultimately, it
will offer us the ability to move much more quickly to take
advantage of technological changes that benefit our user
community.
About the Author
George Loney
Staff Analyst
University of Guelph Library
Guelph, Ontario N1G 2W1
Canada
BITNET: GLONWY@COSY.UOGUELPH.CA
-----------------------------------------------------------------
The Public-Access Computer Systems Review is an electronic
journal. It is sent free of charge to participants of the
Public-Access Computer Systems Forum (PACS-L), a computer
conference on BITNET. To join PACS-L, send an electronic mail
message to LISTSERV@UHUPVM1 that says: SUBSCRIBE PACS-L First
Name Last Name.
This article is Copyright (C) 1990 by George Loney. All
Rights Reserved.
The Public-Access Computer Systems Review is Copyright (C) 1990
by the University Libraries, University of Houston. All Rights
Reserved.
Copying is permitted for noncommercial use by computer
conferences, individual scholars, and libraries. Libraries are
authorized to add the journal to their collection, in electronic
or printed form, at no charge. This message must appear on all
copied material. All commercial use requires permission.
----------------------------------------------------------------
+ Page 51 +
-----------------------------------------------------------------
Manojlovich, Slavko. "Mounting Commercial Databases Using the
SPIRES DBMS." The Public-Access Computer Systems Review 1, No. 3
(1990): 51-57.
-----------------------------------------------------------------
1.0 Introduction
Commercial databases like ERIC, DISSERTATION ABSTRACTS, and
INSPEC have been publicly accessible through the various online
search services for over 20 years. A relatively small number of
universities and other institutions have acquired and mounted
some of these databases on their local database management system
(DBMS) for at least as long a period of time. A fairly recent
phenomenon is the general belief and/or demand that universities
should be locally mounting a variety of commercial databases.
For those institutions with integrated library systems, the
demand for locally accessible commercial databases is going one
step further with the demand that access to these databases
somehow be integrated with access to the library's catalogue.
Integration can mean either the use of a common interface for
searching both the catalogue and other databases or the creation
of a link between the commercial databases and the library's
serial holdings as reflected in the catalogue. The vendors of
integrated library systems are beginning to respond to this new
demand by offering their customers pre-loaded commercial
databases which can reside along with the library's catalogue and
be accessed using a common interface. Pre-loaded databases are
similar to CD-ROM databases in that the data have been
prepackaged by the vendor for consumer use. Issues surrounding
the packaging of the data, such as the number and type of access
points (i.e., indexing) and the data output formats, are
important only when comparing databases from different vendors.
The customer typically has no control over the manner in which a
commercial database is accessible through a vendor's integrated
system.
Commercial databases on CD-ROM or pre-loaded by a vendor may not
be suitable for many institutions because of expensive licensing
fees, limited access, or just poor packaging of the data.
Another alternative to acquiring commercial databases on CD-ROM
or from an integrated library system vendor is to purchase the
databases on magnetic tape and mount them using a DBMS such as
SPIRES, BRS, or BASIS.
+ Page 52 +
Stanford University, Rensselaer Polytechnic Institute, and
Memorial University of Newfoundland use the SPIRES DBMS
(developed by Stanford University) to provide access to both the
library catalogue and to commercial databases. Princeton
University, Syracuse University, University of British Columbia,
Simon Fraser University, and other institutions use SPIRES to
provide access to GPO, ERIC, COMPUSTAT, PSYCINFO, GROLIER
ACADEMIC AMERICAN ENCYCLOPEDIA, and other commercial databases.
The remainder of the article will describe various issues
associated with the local mounting of commercial databases and
how SPIRES addresses and accommodates these issues.
2.0 Analyzing and Loading a Commercial Database
Except for the U.S. MARC Communications Format, there are no
existing standards for the dissemination of commercial databases.
A survey of a small number of commercial databases reveals that
databases distributed on magnetic tape are written using either
the ASCII or EBCDIC character set. They may consist of
fixed- or variable-length records, and they may or may not
represent diacritics following the American Library Association's
standard. Given that these databases can be characterized as
containing full-text, numeric, bibliographic, or other types of
data, even the identification of a "record" or a "field" is not
that straightforward. For example, what constitutes a record in
ISI's CURRENT CONTENTS database? Is it the journal issue or the
article within the issue? In the GROLIER ACADEMIC AMERICAN
ENCYCLOPEDIA database a paragraph of an article and not the
article constitutes a record.
The loading of the database is the transformation of the original
data into a format required by the DBMS. During the initial
examination of the data the analyst is formulating a model of how
the data will be represented in the DBMS. The primary factor
determining how the data are stored is the DBMS's ability to
accommodate the data. For example, MARC records contain the
hexadecimal character code '1F' to indicate the start of a
subfield or may contain hexadecimal characters representing
diacritics. If the DBMS cannot store these characters, some form
of data transformation must take place. The same is true of
graphic images.
+ Page 53 +
Ideally, the DBMS should preserve the original content of the
data as supplied by the database vendor. The SPIRES load
procedure is designed to accommodate the broad spectrum of data
types supplied by commercial database vendors. Following the
creation of a description of the database for SPIRES (i.e., the
"file definition") there are two ways to "batch" load a database
into SPIRES: writing a computer program to convert the data to
the SPIRES input format or writing an input load procedure using
the SPIRES formats language.
2.1 Writing a Computer Program to Convert the Data
The first method of loading data is to write a computer program
that will convert the original data into SPIRES "input format."
SPIRES input format identifies the start and end of a record,
field, subfield, etc. A sample entry for the 245 MARC tag would
be as follows:
245 = (10 aGone with the Wind.);
In this example, "245" is the field name, the parentheses
surround the value of the field, and the semi-colon is the end-
of-field terminator.
SPIRES will load anything found within the parentheses including
the hexadecimal code "1F," which is stored after the "0" in the
above example.
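A conversion program of this kind might contain a routine like the
one sketched below. It is written in Python for illustration and
reproduces only the single-field syntax shown above; the function
name is hypothetical, and the article does not show how record
boundaries are marked or how a value containing a parenthesis or
semi-colon must be written, so neither is attempted here.

```python
SUBFIELD_DELIMITER = "\x1f"  # hexadecimal 1F, as described above


def marc_field_to_spires(tag, indicators, subfields):
    """Build one 'tag = (value);' entry from a parsed MARC field.

    subfields is a list of (code, text) pairs, e.g. [("a", "Title.")].
    """
    value = indicators + "".join(
        SUBFIELD_DELIMITER + code + text for code, text in subfields
    )
    return f"{tag} = ({value});"


entry = marc_field_to_spires("245", "10", [("a", "Gone with the Wind.")])
```

The resulting entry matches the sample above, with the 1F
character sitting between the indicators "10" and the subfield
code "a".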
2.2 Writing a Load Procedure Using the SPIRES Formats Language
The second method of loading data is to write an input load
procedure using the SPIRES formats language. This load procedure
will read in data from an external file and parse it into
records, fields, subfields, etc. For an application that
requires a lot of coding or parsing (e.g., a MARC record), it is
probably easier to write a computer program in PL/I than to do
the equivalent using the SPIRES formats language.
+ Page 54 +
3.0 Indexing
SPIRES provides the entire range of indexing options available in
most DBMSs, including keyword, phrase, date, and coded indexes.
SPIRES also provides a "personal name index" which is designed to
accommodate simultaneously both a "first name surname" and
"surname, first name" name search. Searches for "John Smith" and
"Smith, John" will both retrieve the same records in a personal
name index search. Index names can have aliases associated with
them. For example, someone accustomed to always using "FIND
NAME" to search for individuals in every database can have "NAME"
added as an alias for a "FIND ARTIST" search in a fine arts
slides database or as an alias for a "FIND FONDS" search in an
archival and manuscripts database. ("FONDS" is the equivalent of
"MAIN ENTRY" for archivists.)
In the creation of an index, you specify to SPIRES the fields
which will be included in the index. You also specify through
actions called "PASSPROCS" how the index term will be created
from the input data. For example, you can specify a list of stop
words (terms which will not be indexed), or indicate that you
don't want to include punctuation in the index term.
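The personal name index and the index aliases described in this
section can be pictured with the following sketch. It is a Python
illustration of the idea, not SPIRES syntax: the function names,
the database labels, and the rule that the last word of an
uninverted name is the surname are all assumptions.

```python
def name_key(name):
    """Reduce either name order to one (surname, forenames) key."""
    name = " ".join(name.lower().split())
    if "," in name:
        surname, _, forenames = name.partition(",")
    else:
        # Assumption: the last word of an uninverted name is the surname.
        forenames, _, surname = name.rpartition(" ")
    return (surname.strip(), forenames.strip())


# Hypothetical alias tables, one per database.
INDEX_ALIASES = {
    "slides": {"NAME": "ARTIST"},
    "archives": {"NAME": "FONDS"},
}


def resolve_index(database, index_name):
    """Map an alias to the real index name; other names pass through."""
    return INDEX_ALIASES.get(database, {}).get(index_name, index_name)
```

Because both forms of a name reduce to the same key, a single
index entry serves either style of search, and the alias table
lets "FIND NAME" work unchanged across databases.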
Another important feature of SPIRES involves the ability to
transform an index file into a separate database and associate
additional information with each index record entry.
In addition, SPIRES uses action statements called SEARCHPROCS
that allow you to take a search term and process it through, for
example, a thesaurus file, to determine the proper form of the
search term. The SPIRES $REPARSE SEARCHPROC will then take this
converted search expression and execute it. The use of
SEARCHPROCS and $REPARSE to process and transform search
statements is one of the methods of creating database linkages in
SPIRES. Database linkages result in the delivery of value-added
packaging of information.
+ Page 55 +
Consider the following example of the implementation of the
EXPLODE command on a sample MEDLINE file at Memorial University
of Newfoundland. The EXPLODE command enables you to retrieve all
the subordinate subject entries associated with a Medical Subject
Heading (MeSH) term. MeSH terms are part of a hierarchical
subject classification. An index is created from the MeSH
database with the heading being the key of the record. Each
index record also contains a concatenated list of MeSH tree
numbers associated with the heading. When a patron performs an
EXPLODE search (e.g., "FIND EXPLODE ABO FACTOR") on the MEDLINE
bibliographic database, SPIRES first looks up the heading in the
MeSH heading index, retrieves a list of MeSH tree numbers, and
appends a truncation character to each tree number. This
OR'd list of tree numbers is passed back to SPIRES, which then
executes a new search on the tree number index, which is built
from the MEDLINE database.
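The EXPLODE sequence can be sketched in a few lines. This is a
Python illustration, not SPIRES code: the tree numbers are
placeholders rather than real MeSH values, the "#" truncation
character is an assumption, and the function names are
hypothetical.

```python
TRUNCATION = "#"  # assumed truncation character

# Heading index: heading -> tree numbers (placeholder values).
MESH_HEADING_INDEX = {"ABO FACTOR": ["X01.100", "X02.300"]}


def explode(heading):
    """Turn a heading into an OR'd list of truncated tree numbers."""
    tree_numbers = MESH_HEADING_INDEX[heading.upper()]
    return " OR ".join(number + TRUNCATION for number in tree_numbers)


def search_tree_index(expression, tree_index):
    """Run the rewritten search against the tree number index."""
    hits = set()
    for term in expression.split(" OR "):
        prefix = term[:-len(TRUNCATION)]
        for tree_number, record_ids in tree_index.items():
            if tree_number.startswith(prefix):
                hits.update(record_ids)
    return sorted(hits)


# Tree number index built from the bibliographic file (placeholders).
tree_index = {"X01.100": [1], "X01.100.200": [2],
              "X02.300.10": [3], "X09": [4]}
```

Truncation is what pulls in the subordinate entries: record 2
is indexed under a narrower tree number, yet it is retrieved
because that number begins with the heading's own tree number.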
The above model of database linkages can be applied to any
commercial database which has an associated machine-readable
thesaurus or classification system (e.g., ERIC and PSYCINFO). It
is also useful in multilingual database applications where a
multilingual dictionary could be used by SPIRES to transform a
search term into an OR'd set of corresponding search terms for
each language. For example, a "FIND SUBJECT SOCIAL SCIENCES"
search in the MICROLOG (Canadian Research and Report Literature)
database would also retrieve all of the French records with the
term "SCIENCES SOCIALES."
4.0 Data Output
SPIRES data output, as with indexing and searching, has
associated with it a range of actions which enable you to
transform the data as per your requirements. SPIRES provides an
almost unlimited variety of ways to output your data, including
formatting reports with statistical calculations. Within the
SPIRES FOLIO environment, the patron simply specifies the type of
output by including a "format name" following the DISPLAY
command.
+ Page 56 +
SPIRES formats can do much more than simply provide brief, full,
or MARC output. If the patron's workstation on a network can
accommodate the display of diacritics, the user can specify a
format which includes these characters. A format can also look
up and display information from a database other than the one
being searched. This ability provides the framework for linking
journal holdings information to commercial databases. As part of
displaying a citation, the format looks up the journal title,
ISSN, or other key in a file containing a list of journals held
by the library and adds a holdings status message.
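The holdings lookup can be pictured as follows. The sketch is in
Python and makes several assumptions: the function name, the
layout of the citation, and the wording of the status messages are
all hypothetical, and ISSN is used as the lookup key although the
article notes that journal title or another key would also serve.

```python
# File of journals held by the library, keyed by ISSN (sample entry).
JOURNALS_HELD = {"1048-6542": "Library has v.1 (1990) to date"}


def display_citation(citation, journals_held):
    """Format a citation and add a holdings line from a second file."""
    lines = [
        '{author}. "{title}."'.format(**citation),
        "{journal} {volume} ({year}).".format(**citation),
    ]
    status = journals_held.get(citation.get("issn"))
    lines.append("Holdings: " + (status or "not held by this library"))
    return "\n".join(lines)


sample = {
    "author": "Parker, Bo",
    "title": "An Overview of SPIRES and the SPIRES Consortium",
    "journal": "The Public-Access Computer Systems Review",
    "volume": 1, "year": 1990, "issn": "1048-6542",
}
```

The citation file itself is never modified; the holdings line is
attached only at display time, so the list of journals held can be
updated independently of the commercial database.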
The SPIRES SAVE command allows you to write the formatted results
of a search to a file. The SAVE command enables a patron to
search a numeric database (e.g., COMPUSTAT) and output the data
for input to a statistical package. Similarly, it allows users
of a full-text database to output a true reproduction of an
article, in contrast to obtaining a copy of the article using the
screen dump procedure. Finally, it can be used to output
bibliographic records for input to a micro-based DBMS.
5.0 Conclusion
The SPIRES DBMS has served librarians for over a decade. It is
now used primarily to create local databases and to mount
commercial ones. Because of SPIRES's ability to handle MARC
records, institutions like Rensselaer Polytechnic Institute and
Memorial University of Newfoundland are developing fully
functional integrated library systems with linkages to commercial
databases. The functionality and versatility of SPIRES, as
illustrated in this article, ensure that SPIRES will continue to
meet the evolving needs of the library community.
+ Page 57 +
About the Author
Slavko Manojlovich
Assistant to the University Librarian for Systems and Planning
Memorial University of Newfoundland
St. John's, Newfoundland A1B 3Y1
Canada
BITNET Address: SLAVKO@KEAN.UCS.MUN.CA
-----------------------------------------------------------------
This article is Copyright (C) 1990 by Slavko Manojlovich. All
Rights Reserved.
----------------------------------------------------------------
+ Page 77 +
-----------------------------------------------------------------
Molholt, Pat. "The Libraries at Rensselaer Implement Access to
Information Beyond Their Walls." The Public-Access Computer
Systems Review 1, No. 3 (1990): 77-82.
-----------------------------------------------------------------
1.0 Introduction
Rensselaer Polytechnic Institute began automating its libraries
some ten years ago. The choice of SPIRES was driven both by its
functionality and its cost. With no increased funding available
for automation, the library administration sought a tool that
afforded maximum control over the development of systems while,
at the same time, having a manageable price tag.
Currently, our system, which has the trademarked name "InfoTrax,"
has nine subsystems. SPIRES has successfully handled every
challenge we have put to it in this complex system development
effort.
These accomplishments were shepherded through the design,
implementation, and evaluation processes by a design team of four
librarians and a programmer/analyst. One programmer/analyst has
been entirely responsible for the programming and maintenance of
our system. Three individuals have held that position over the
years with no loss to our progress in the transitions.
2.0 InfoTrax Subsystems
InfoTrax has the following subsystems: (1) Acquisitions, (2)
Catalog, (3) Circulation, (4) Commercial Index and Abstracts, (5)
Library News, (6) Message, (7) Reserves, (8) Serials Check-In,
and (9) Campus Information (this is described in section 3.0).
Although the general system is freely accessible and requires no
passwords, several of the files do require Rensselaer
affiliation. When users access a restricted file they are
prompted for an authorization code. The commercial index and
abstract files, IEEE and Current Contents, fall in this category.
+ Page 78 +
2.1 Acquisitions Subsystem
The Acquisitions subsystem includes fund accounting and an
interface to Rensselaer Polytechnic Institute's accounts payable
system. Orders are generated by the system and records for items
on order are listed in the catalog.
2.2 Catalog Subsystem
The Catalog Subsystem merges all MARC record types in one file.
This file can be searched with full Boolean logic applied to
numerous fields, including author, title, subject, publisher,
date, collection, call number, material type (e.g.,
journal, conference, and software), and status (in circulation or
available).
2.3 Circulation Subsystem
In the Circulation Subsystem, item level records are linked to
the catalog with real-time updating of circulation activity,
including relocating items to the reserve collection and the
transfer of whole call number ranges to a different library. In
addition, the floor and sub-collection are noted for each item in
the collection.
2.4 Commercial Abstract and Index Subsystem
The Commercial Abstract and Index subsystem contains citation
files that are linked by call number to the Catalog subsystem.
Patrons can use "Photocopy" and "Interlibrary Loan" commands to
electronically route their requests for materials found in
citation files to the appropriate library unit.
2.5 Library News Subsystem
The Library News subsystem contains the library's hours and
service announcements.
+ Page 79 +
2.6 Message Subsystem
The Message subsystem is used for acquisitions recommendations,
reference questions, and other types of patron requests. Users
from around the United States and several foreign countries have
used MESSAGE to offer critiques of the system or ask for user
assistance. Fortunately, no one has tried to use it for direct
borrowing requests. We'd have to say no to them at this point.
2.7 Reserves Subsystem
The Reserves subsystem records class lists of both library and
non-library materials that are searchable by course name or
number, course nickname, and instructor. Non-library materials
are organized by folders with the contents listed for easy
identification by users.
2.8 Serials Check-In Subsystem
The Serials Check-in subsystem interfaces between the MicroLinx
system and the catalog, providing issue level availability
information in the catalog. Each night the day's check-in
activity is automatically transferred between the networked
microcomputer and the mainframe-based InfoTrax system.
3.0 Campus Information
Campuses are rife with information that is critical to students,
faculty, and staff. Good access to that information has been a
long-standing problem for many of us. Campus-Wide Information
Systems (CWIS) are springing up in an effort to bring both
control and organization to a wide range of internal information.
Librarians have not typically taken a leadership role in these
efforts even though, among campus professionals, librarians are
singular in their training in the organization of information.
In this context, for the past eighteen months the library's
design team has turned its attention to the concept of a library
without walls by opening up the definition of "library
information." Specifically, the group has begun working with
several campus units to bring existing information of broad
campus interest into the InfoTrax system for dissemination.
+ Page 80 +
3.1 Telephone Directory File
Our first project was the campus student, faculty, and staff
telephone directory. Compiled from the Registrar's, Human
Resources', and Telecommunications' files, the Telephone
Directory File is searchable by name, department, building, and
rank or school year.
Individuals will be able to "update" their own records in this
file. In actuality, the requested changes will move
electronically, field by field, to the office responsible for
maintaining the authoritative file for that information segment.
The actual corrections will be fed back into a central file for
the campus to draw on as needed. No more changing your address
in six different places.
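The field-by-field routing might work along the lines sketched
below. The sketch is in Python and is illustrative only: the three
offices are those named above, but the assignment of particular
fields to offices, the field names, and the route_changes function
are assumptions.

```python
# Assumed mapping of directory fields to the office that owns them.
FIELD_OWNERS = {
    "home_address": "Registrar",
    "job_title": "Human Resources",
    "office_phone": "Telecommunications",
}


def route_changes(person_id, requested_changes):
    """Split one update request into per-office work queues."""
    queues = {}
    for field, new_value in requested_changes.items():
        office = FIELD_OWNERS[field]
        queues.setdefault(office, []).append((person_id, field, new_value))
    return queues


request = {"home_address": "12 Elm St", "office_phone": "555-0100"}
queues = route_changes("p1", request)
```

The person submits one request, and each office receives only the
fields for which it keeps the authoritative file.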
3.2 Undergraduate Research Program File
The next file we mounted takes its structure from the Telephone
Directory File. The Undergraduate Research Program File contains
the research interests of faculty who would like to have
undergraduates on their research teams. This file can be
searched by subject area, department, and faculty name.
3.3 Contracts and Grants File
The Contracts and Grants unit found it necessary to cease
publication of its newsletter, which announced funding
opportunities compiled from many sources. The library has
designed an electronic version in its place. The Contracts and
Grants File will be augmented with direct downloads from the
commercial Legi-Slate database, subscribed to by yet another
office on campus.
As with all of the cooperative files, the content of the file is
"owned" by the contributing unit, which is also responsible for
maintaining the file. The library provides some basic mechanisms
for the units to facilitate the updating and editing of their
files. Cooperation is becoming a watchword in the information
environment of Rensselaer.
+ Page 81 +
3.4 Office of News and Communications File
In the Fall of 1990, the Office of News and Communications file
will make available the full text of all Rensselaer Polytechnic
Institute's press releases. This file will be searchable by any
word in the text as well as by a standardized list of units,
departments, and schools within the university. It is
anticipated that some local newsrooms will choose to obtain their
press releases by accessing InfoTrax.
4.0 Conclusion
We also have plans to provide electronic access to the
undergraduate and graduate catalogs, the student handbook, class
hour course schedule, bookstore holdings, and other similar
files.
However, for the moment, we will be concentrating on mounting the
first campus-wide link ever installed by United Press
International. Our agreement with UPI provides all of their
national, international, business, finance, and sports news
simultaneously with their broadcast to newsrooms and other
commercial customers around the world. We are excited about
developing a SPIRES program allowing users to design their own
"newspapers." As with all of our files, UPI will be on the
campus mainframe and available throughout the campus on the
variety of networks supported at Rensselaer.
I mentioned that cooperation was a key word at Rensselaer. There
is another term that is important to the design team--fun. The
group truly enjoys the process of design and has yet to find a
challenge it cannot handle. We intend to keep looking!
+ Page 82 +
About the Author
Pat Molholt, Associate Director of Libraries
Folsom Library
Rensselaer Polytechnic Institute
Troy, NY 12180-3590
(518) 276-8300
Pat Molholt has been responsible for Rensselaer Libraries'
automation since 1978. In addition to her library duties, she is
a doctoral student in artificial intelligence and lexicography.
She is co-editor of the newly released work, Beyond the Book:
Extending MARC for Subject Access.
-----------------------------------------------------------------
This article is Copyright (C) 1990 by Pat Molholt. All
Rights Reserved.
----------------------------------------------------------------
+ Page 44 +
-----------------------------------------------------------------
Parker, Bo. "An Overview of SPIRES and the SPIRES Consortium."
The Public-Access Computer Systems Review 1, No. 3 (1990): 44-50.
-----------------------------------------------------------------
1.0 Introduction
SPIRES is the Stanford Public Information REtrieval System, a
sophisticated information retrieval and database management
system. It has been used at Stanford and over forty other
research centers and academic institutions within the SPIRES
Consortium for more than 15 years. Applications that have been
written in SPIRES range from library catalogs to electronic
messaging systems. It is the principal database management
system in use on the central computer system at Stanford for
research, instruction, and administration.
2.0 The SPIRES Consortium
Written and developed initially at Stanford University, SPIRES
has subsequently been licensed for use at over 40 other
university, research, and government institutions. Together with
Stanford, these institutions comprise the SPIRES Consortium, a
non-profit association created expressly for the maintenance and
development of the SPIRES software, consulting and installation
support, user forums, and training and instruction. Membership
in the Consortium provides access to SPIRES--a tool comparable in
power to database management systems costing over 10 times more
than the membership fee--and access to shared applications from
the members.
Sharing is one of the great success stories in the Consortium.
For example, Memorial University of Newfoundland is in the final
stages of creating an integrated library system, which
incorporates modules borrowed from Stanford University, Princeton
University, and Rensselaer Polytechnic Institute. Memorial was
up and running with an online catalog only six months after
joining the Consortium. A circulation module obtained from RPI
was added later; the modular nature of SPIRES made it easy for
Memorial to modify the OPAC to include circulation information.
+ Page 45 +
3.0 Capabilities of the SPIRES Software
SPIRES is a general-purpose DBMS. You may create flat,
hierarchical, relational, or network files. You may
retrieve information sequentially, or through an index. You may
store information in any form, and enter or display it in a
different form.
SPIRES is flexible. You may build into your files the
restrictions that different groups of users must follow. You may
define many different views (schemas) for your files, either for
convenience or security. The users of your files may change or
refine their view of the file at their convenience, within the
bounds of the restrictions you place on the file.
SPIRES is integrated. You define and create your database
interactively, without intervention from a database
administrator. You define the user views and user dialogs for
your files through SPIRES, not through COBOL or PL/I programs.
You create sophisticated reports from any of your files with the
SPIRES report writer, and refine them interactively. You set up
a full-screen dialog for data input and inquiry with the SPIRES
screen definer.
4.0 Creating Databases Using SPIRES
A file definition describes each SPIRES file. The definition
divides the database into different logical record types, and
names the elements in each record. (An element to SPIRES is a
field in a record to other systems.) Elements are known by their
names to SPIRES, not by position or cryptic mnemonic.
The definition for each element also includes its relationship
with other elements; the encoding, decoding, and validation to be
performed on its contents; and any restrictions on who may see,
search, and update the element. This sort of record is called a
goal record, since it is often the goal of a search.
+ Page 46 +
You can also use SPIRES to define index records. These records
have the same general form as goal records, but contain
information that SPIRES extracts from elements in the goal record
that you have designated. SPIRES can use these records to locate
goal records very efficiently, usually with as few as five
records read from disk to retrieve a goal record from a
seven-million-record file. Moreover, since index records have
the same form as goal records, you can treat them as such, and
examine and manipulate data in them. An index record can even be
another goal record in the file, allowing you to build
relationships between different files.
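The relationship between goal records and index records can be
sketched in a few lines of Python. This is an illustration only,
not SPIRES code: the record layout, the element names (title,
author), and the function names are all assumptions made for the
example.

```python
from collections import defaultdict

# Hypothetical goal records, keyed by record number.
goal_records = {
    1: {"title": "Database Design", "author": "Smith"},
    2: {"title": "Library Automation", "author": "Jones"},
    3: {"title": "Database Systems", "author": "Jones"},
}

def build_index(records, element):
    """Build index records for one element: value -> list of goal keys."""
    index = defaultdict(list)
    for key, record in sorted(records.items()):
        index[record[element].lower()].append(key)
    return dict(index)

def find(index, records, value):
    """Use the index to fetch goal records without scanning the file."""
    return [records[key] for key in index.get(value.lower(), [])]

author_index = build_index(goal_records, "author")
print(author_index)
print(find(author_index, goal_records, "Jones"))
```

Because the index is itself an ordinary mapping, it can be
examined and manipulated like any other data, which loosely
mirrors the point that index records have the same form as goal
records.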
SPIRES can set up simple databases with little more information
than the names of the elements. You can exercise complete
control over the level of detail contained in the file
definition. You need only learn as much of it as you need to fit
the complexity of your application. The entire process is
interactive; you can define, test, refine, and implement a simple
database in less than an hour. Once your database is loaded, you
can still make many changes to it. You can add additional
elements, change or add validation rules, or add or remove
indices, all without reloading the data.
5.0 Entry and Display of Information
You control the entry and display of information in SPIRES with
the FORMATS language. Formats give you flexible control over the
form of your input and output, and are used to provide or enforce
different user views of your file. Some sophisticated "system
formats" can be used with any file to give you this flexibility,
with little or no time invested in design and implementation.
Some examples are the SPIRES report writer, the prompting input
format, and the screen definer.
Your input can be in SPIRES standard format, columnar format, or
free form. Your input may come from disk or tape files, from a
line-by-line prompt at your terminal, or from a full-screen menu
on a display terminal. Special tools are available for building
extremely large databases quickly and efficiently in batch mode.
You may arrange your output in any form for a printed report, a
disk or tape file, a display on a full-screen terminal, or as
input to another processor (e.g., SCRIPT or SAS).
+ Page 47 +
6.0 User Interfaces
SPIRES provides four user-interface environments: (1) the native
SPIRES command language; (2) the Prism environment for
transaction processing, searching and report writing; (3) the
Folio environment for search, browse, and display of textual
data; and (4) Remote SPIRES for access to SPIRES databases over
networks such as BITNET or Internet.
The rich SPIRES native command language is made up of English
words, such as SELECT, FIND, SHOW, and EXPLAIN. The database
owner and the end user alike use these commands to: (1) select a
database and have its contents and organization explained; (2)
search a database, either using an index or sequentially;
(3) display records retrieved by a search; (4) choose among
input, output, and report formats; (5) create, update, or delete
records in the database; and (6) ask for online assistance with
HELP, EXPLAIN, and TUTORIAL commands.
Either the database owner or the end user can tailor a particular
application. A procedural language is provided so that a
packaged set of SPIRES commands can implement new, higher-level
commands for the user, or carry on a dialog with the user and
issue SPIRES commands to carry out the requests.
The Prism and Folio environments are designed to allow end-user
access to applications without heavy investments in training.
Both environments have rich built-in help facilities, options for
guided (inexperienced user) versus command (experienced user)
modes, and the ability to chain a series of commands together to
bypass screens.
6.1 Prism
Prism is a full-screen application support tool designed for
major transaction processing applications. Examples of how Prism
is used at Stanford include the following applications.
o NSI (Network for Student Information)
Users may look up course, classroom, and student information for
their own department. Selected course and classroom information
may also be entered into Prism.
+ Page 48 +
o SMAS (Salary Management Administrative System)
Authorized staff may look up salaries and other job-related
information, or enter proposed salary information and produce
salary setting reports.
o SNAP (Stanford Network for Accounting and Purchasing)
In SNAP files, users may enter purchase requisitions
electronically (rather than on paper), look up requisition and
payment status information, and look up vendor information.
o SUFIN (Stanford University Financial Information Network)
The SUFIN files provide a variety of reporting functions for
university accounts and expenditure data.
6.2 Folio
Folio is the backbone of the online public access catalog in the
Stanford University Libraries, where over two million volumes
have been added to the library holdings database.
Folio is also used to provide public access to general interest
applications like JOBS (job openings at Stanford), HOUSING
(available housing in the local community), ODYSSEY (research
opportunities for students), and special bibliographies like
TECHNICAL REPORTS and the MARTIN LUTHER KING BIBLIOGRAPHY.
Folio is simple enough for first-time users to walk up to public
terminals and successfully complete searches, yet comprehensive
enough to support downloading of data to workstations.
6.3 Remote SPIRES
Remote SPIRES is being used at various universities to make local
databases accessible to individuals at other institutions without
requiring logon to the local host.
For example, the HEP (High Energy Physics Preprints) database at
the Stanford Linear Accelerator is accessed by physicists from
over 100 institutions around the world. Simple, one-line mail
messages comprise the "dialog" between the remote user and the
Remote SPIRES database. Interactive messages and search results
are sent by e-mail to the user.
+ Page 49 +
7.0 Technical Information
SPIRES currently runs on IBM System/370 or plug-compatible
mainframe computers under VM/CMS (SP and XA), MVS/TSO, and the
less-well-known MVS/WYLBUR/ORVYL and MTS operating systems.
A project is currently in progress to convert SPIRES to the C
programming language. This effort will position SPIRES to
participate in the distributed, client/server environments of the
future, as well as expand the range of hardware platforms on
which SPIRES will run.
8.0 Conclusion
SPIRES is a powerful, flexible database management system that
libraries can use to build a wide variety of public-access
computer systems. In addition to its native command mode, it
provides system developers with three other user interface
tools--Prism, Folio, and Remote SPIRES.
For more information on the SPIRES Consortium, contact the SPIRES
Consortium Office at 415-725-1308, or HQ.CON@STANFORD.BITNET.
+ Page 50 +
About the Author
Bo Parker
Associate General Manager
SPIRES Consortium Office
Jordan Quadrangle
Stanford University
Stanford, CA 94305-4136
BITNET: GA.SBP@STANFORD.BITNET
-----------------------------------------------------------------
The Public-Access Computer Systems Review is an electronic
journal. It is sent free of charge to participants of the
Public-Access Computer Systems Forum (PACS-L), a computer
conference on BITNET. To join PACS-L, send an electronic mail
message to LISTSERV@UHUPVM1 that says: SUBSCRIBE PACS-L First
Name Last Name.
This article is Copyright (C) 1990 by Bo Parker. All
Rights Reserved.
The Public-Access Computer Systems Review is Copyright (C) 1990
by the University Libraries, University of Houston. All Rights
Reserved.
Copying is permitted for noncommercial use by computer
conferences, individual scholars, and libraries. Libraries are
authorized to add the journal to their collection, in electronic
or printed form, at no charge. This message must appear on all
copied material. All commercial use requires permission.
----------------------------------------------------------------
+ Page 83 +
----------------------------------------------------------------
Piovesan, Walter. "Mounting a Full-Text Database Using SPIRES."
The Public-Access Computer Systems Review 1, no. 3 (1990): 83-88.
----------------------------------------------------------------
1.0 Introduction
The demand for enhanced online services has led many libraries to
provide users with access to machine-readable indexes and other
products in addition to the online catalogue. The proliferation
of networks and the merging of two heretofore separate service
bureaus--the library and computer services--have facilitated the
emergence of new partnerships providing new, improved services.
This article describes how the Library and Computer Services of
Simon Fraser University worked together to select and mount the
GROLIER ACADEMIC AMERICAN ENCYCLOPEDIA database on a mainframe
using the SPIRES system.
2.0 Database Selection
In the summer of 1986, the Vice President for Research and
Information Services at Simon Fraser University, who was
responsible for both the Library and Computing Services, called
together staff from both units. The Vice President had just
returned from the 1986 Education Conference held at Carnegie
Mellon University, and he had been impressed with the emerging
new library information systems that were being demonstrated
there. He requested that a working group be formed to
investigate what new types of databases we could provide to the
campus, such as index, encyclopedia, dictionary, and directory
databases.
As Head of the Research Data Library, I was responsible for the
collection and maintenance of machine-readable data for the
campus community. Consequently, I was asked to head the project
and to report back with a list of databases that would be
feasible to load onto the campus mainframe. The databases that
were identified as being suitable for the initial phase of the
project were CURRENT CONTENTS, ERIC, GROLIER ACADEMIC AMERICAN
ENCYCLOPEDIA, MEDLINE, and PSYCINFO.
A working team consisting of Wolfgang Richter, a Database
Administrator from Computing Services, and me was formed. We
were asked to load the ERIC, GROLIER ACADEMIC AMERICAN
ENCYCLOPEDIA, and PSYCINFO
databases on the campus mainframe. All of these databases were
subsequently loaded.
+ Page 84 +
The Database Administrator had already designed a menu-driven
user interface to a number of applications on our central
mainframe: e-mail, word processing, CS Newsletter, and the exam
schedule. These services were part of EASYMTS (MTS being our
operating system). We decided that we would add an additional
level of menus--InfoServe--which would contain an array of
library-based services.
3.0 Selection of SPIRES
Prior to ordering the Grolier database, we contacted Nancy Evans
of Carnegie Mellon University, who provided some key bits of
information on how they had approached the task of loading the
Grolier database into their STAIRS system. The main point that
Ms. Evans stressed was the need for full-text indexing.
The Database Administrator and I then met to decide which of
the two campus database management systems--SPIRES or ORACLE--we
would choose to load the Grolier database into. After an
examination of the pros and cons of each system, we settled on
SPIRES. The main reasons for this decision were that SPIRES had:
(1) the ability to easily index on individual words; (2) high-
performance characteristics; (3) superior and flexible report
generation capabilities; (4) the ability to easily handle large
data files; and (5) superiority in handling multiple users on our
IBM mainframe computer.
4.0 Characteristics of the Grolier Database
The GROLIER ACADEMIC AMERICAN ENCYCLOPEDIA, which is
approximately 170 megabytes in size, comes in the form of a
single file on magnetic tape. The cost of subscribing to the
database is based on the size of the institution. There are
quarterly updates.
5.0 Pre-Load Activities
In late 1986, we ordered a sample copy of the GROLIER ACADEMIC
AMERICAN ENCYCLOPEDIA database. The Database Administrator
designed a SPIRES database definition (called a FILEDEF) and a
report definition (called a FORMATS definition) for displaying
search results.
The FILEDEF would allow for indexing on every word. We
realized that this would make for a lengthy process in loading
the full database, but we knew that if the product was to be
successful with users it had to be fully indexed.
+ Page 85 +
After giving a demonstration of the Grolier database, we received
approval to purchase the full database and proceed to make it
available via the expanded EASYMTS service as a part of the
InfoServe menu.
Once we started to load the full database, we had to make a minor
change to the existing FILEDEF. In the initial FILEDEF, each
item in the database corresponded to an article; however, this
proved problematic with large encyclopedia articles. The FILEDEF
was modified so that we would have smaller units of information:
paragraphs.
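Word-level indexing over paragraph-sized units can be sketched as
follows. This is a minimal Python illustration, not the FILEDEF
itself: the paragraph identifiers (article number plus paragraph
number), the sample text, and the tokenizing rule are assumptions,
and the stop list holds only the three common words the article
cites as examples.

```python
import re
from collections import defaultdict

# A tiny stop list for illustration; the article names only
# "as," "is," and "to" as examples of excluded common words.
STOP_WORDS = {"as", "is", "to"}

def index_words(paragraphs):
    """Map each non-stop word to the set of paragraph ids containing it."""
    index = defaultdict(set)
    for para_id, text in paragraphs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(para_id)
    return index

# Hypothetical paragraphs keyed by (article number, paragraph number).
paragraphs = {
    ("A001", 1): "The aardvark is a mammal native to Africa.",
    ("A001", 2): "It is known as the ant bear.",
}
word_index = index_words(paragraphs)
print(sorted(word_index["the"]))
print("is" in word_index)
```

Indexing by paragraph rather than by article means a search hit
points at a small unit of text, at the cost of more index entries
to build and maintain.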
The database was indexed on four principal fields: (1) article
number (this is mostly useful for the database manager and is
used for checking for duplicate articles); (2) article name;
(3) text type (e.g., bibliographic, tables, and see also
references); and (4) word (this is every word in the
encyclopedia, excluding common words like "as," "is," and
"to").
6.0 Loading the Database
To ensure that any database errors were identified prior to
loading the database into SPIRES, the Database Administrator
wrote a series of utility programs. The programs scan the data
on tape to ensure that: (1) all the fields are present, (2)
fields are properly delimited, (3) there are no duplicate
article numbers and all numbers are of the correct length, and
(4) the information is in the proper sequence as specified by the
vendor. (Interested SPIRES users can contact the author to obtain
copies of these utility programs, which tend to be specific to
the MTS operating system.)
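The four checks listed above can be sketched in Python. This is
not the authors' utility code, which is not reproduced in the
article. The record layout (one pipe-delimited line per article),
the seven-digit article number, and the ascending-number sequence
rule are all assumptions made so the example can run.

```python
REQUIRED_FIELDS = 4       # number, name, text type, text (assumed)
NUMBER_LENGTH = 7         # assumed fixed length of an article number

def check_records(lines):
    """Return a list of (line_no, message) problems found in the input."""
    problems = []
    seen = set()
    previous = None
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("|")
        # Checks 1 and 2: every field present and properly delimited.
        if len(fields) != REQUIRED_FIELDS or any(f == "" for f in fields):
            problems.append((line_no, "missing or badly delimited field"))
            continue
        number = fields[0]
        # Check 3: article numbers are unique and of the correct length.
        if len(number) != NUMBER_LENGTH or not number.isdigit():
            problems.append((line_no, "bad article number"))
            continue
        if number in seen:
            problems.append((line_no, "duplicate article number"))
        seen.add(number)
        # Check 4: records arrive in ascending article-number order.
        if previous is not None and number < previous:
            problems.append((line_no, "out of sequence"))
        previous = number
    return problems

sample = [
    "0000001|Aardvark|text|The aardvark is a mammal.",
    "0000003|Abacus|text|A counting frame.",
    "0000002|Aalto, Alvar|text|Finnish architect.",
    "0000003|Abacus|text|A counting frame.",
    "0000004|Abbey|text",
]
for problem in check_records(sample):
    print(problem)
```

Collecting every problem in one pass, rather than stopping at the
first, suits a pre-load check where all errors are corrected
before the load begins.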
There were some initial problems with the database, such as
errors in format and improperly delimited fields. We were able
to easily identify the errors and correct them prior to loading.
Processing the database through our error-checking programs adds
a couple of extra steps to the process, but we have found that
the extra time is well spent, as it saves us time in the
long run. Although we found errors during the initial database
load, the database has been very stable for the past two years.
+ Page 86 +
7.0 Processing Quarterly Updates
The quarterly updates for the Grolier database are processed as
follows.
First, we copy the tape data to disk and run the above-mentioned
checking programs, which alert us to errors that need correcting.
This checking is done via utility programs specific to our MTS
operating system.
Second, we correct any errors and run a FORTRAN program to
convert the data into the SPIRES batch-load format. This "tags"
the database for loading into SPIRES, somewhat like adding MARC
tags for loading bibliographic data into an OPAC.
Third, we batch load the data into a test subfile using the
SPIBILD program. We briefly check the data with SPIRES for
glaring errors, such as duplicate article numbers.
Fourth, we run a utility program that: (1) dumps out the data
from the test subfile, (2) checks the main database for articles
with the same name (the Grolier people do not flag updated
material as such--we have to deduce it), and (3) automatically
generates the appropriate set of SPIRES REMOVE and ADD commands
for SPIBILD.
Finally, we run an overnight job so that SPIBILD can process the
REMOVE and ADD commands generated in the previous step. We
process half of the Grolier database at one time in order to
reduce down time as much as possible. It takes approximately 3
hours of CPU time on our 3091 IBM mainframe to process half of
the database (the elapsed clock time comes to about 14 hours).
SPIRES spends most of the processing time updating the article
text index, which is based on individual words used in articles.
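The fourth step, deducing which incoming articles replace
existing ones, can be sketched in Python. The function and
variable names are hypothetical, and the command strings are
illustrative placeholders only; they are not actual SPIBILD
syntax, which the article does not show.

```python
def plan_update(main_names, update_articles):
    """Return illustrative REMOVE/ADD command strings for an update.

    main_names maps article name -> article number in the main file.
    update_articles is a list of (number, name) pairs from the test
    subfile. An article whose name already exists is treated as a
    replacement, since the vendor does not flag updates.
    """
    commands = []
    for number, name in update_articles:
        old_number = main_names.get(name)
        if old_number is not None:
            commands.append(f"REMOVE {old_number}")
        commands.append(f"ADD {number}")
    return commands

# Hypothetical data: one updated article and one new article.
main_names = {"Aardvark": "0000001", "Abacus": "0000003"}
update = [("0009001", "Abacus"), ("0009002", "Abbey")]
for command in plan_update(main_names, update):
    print(command)
```

Matching on article name is what allows an update to be detected
at all here; two distinct articles sharing a name would need
extra handling that this sketch leaves out.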
At the time that we update the database, we insert an edition
statement so that when users select the database they will know
how current the information in it is.
+ Page 87 +
8.0 Reactions to the Grolier Database
During our initial investigation of the products that we wanted
to offer on the InfoServe service, there was some skepticism on
the part of librarians who felt that students would not be able
to properly search the databases and that the GROLIER ACADEMIC
AMERICAN ENCYCLOPEDIA would not meet the needs of university
students. After three years of using the service and hearing
from students that they really find the encyclopedia useful and
use it regularly, the librarians have come to appreciate the need
for self-serve reference information and are encouraging us to
find other products to load, such as dictionaries. There are on
average 1,200 searches per month on the GROLIER ACADEMIC
AMERICAN ENCYCLOPEDIA database.
It has also proved to be very successful with the Education
department, which uses the encyclopedia in their courses on
computers and information that they give to high school students.
These students have no problem in using the service.
9.0 Conclusion
Using the SPIRES software, Simon Fraser University has
successfully mounted the full-text GROLIER ACADEMIC AMERICAN
ENCYCLOPEDIA and other databases. The encyclopedia database has
received a warm reception from the university community, and it
has proven itself to be a valuable information resource.
+ Page 88 +
About the Author
Walter Piovesan
Head, Research Data Library
W.A.C. Bennett Library
Simon Fraser University
Burnaby, British Columbia, CANADA
BITNET: USERVINO@SFU.BITNET
Internet: walter_piovesan@cc.sfu.ca
-----------------------------------------------------------------
This article is Copyright (C) 1990 by Walter Piovesan. All
Rights Reserved.
----------------------------------------------------------------
+ Page 89 +
----------------------------------------------------------------
Ritchie, Mark. "The WatMedia Project." The Public-Access
Computer Systems Review 1, No. 3 (1990): 89-95.
-----------------------------------------------------------------
1.0 Introduction
The WatMedia Project utilizes the SPIRES software to provide
users with access to information about nonprint materials in the
collections of 22 members of the Interfilm Group. The WatMedia
system is available to authorized users on BITNET and other
networks.
2.0 The Need for the WatMedia Project
The WatMedia Project was begun by the Media Library of the
University of Waterloo in 1974 in order to improve both patron
and staff access to information about non-print resources. A
brief analysis of the access problem showed that the prime area
of difficulty was in the information retrieval interface, both
between user and library and between library staff and source
collections. Media resources tend to be invisible in the sense
that they have no index or table of contents through which the
potential user may browse. The obvious answer was a catalogue of
some sort.
A detailed analysis of user statistics revealed that most
materials were being used for purposes far different from those
for which the materials were originally produced. The wide scope
of these uses was such that it persuaded us that any cataloguing
system adopted must consider and cater to these uses as well as
more traditional uses. The problem of browsing became a prime
consideration during our investigations, as the standard
bibliographic information was judged to be inadequate by our
users and the costs in time and labour for users to view many
different titles in order to find the one title applicable to
their needs were exorbitant. We decided that the system should,
in effect, create a table of contents and an index for each item,
in addition to the information found in the standard
bibliographic reference.
A cataloguing system needed to be developed to act as an access
point to the collection. In deciding on a system, several
factors needed to be considered: (1) ease
of use for faculty and students; (2) impact on the organization
of the collection; (3) impact on staffing levels in the media
library; (4) currency of information and ease of updating; (5)
comprehensiveness of entries and indexes; and (6) cost.
+ Page 90 +
Our conclusions were that an analytical catalogue should be
devised and that this new catalogue be based largely on an
existing cataloguing system if possible. The system which was
finally chosen was the one in use, at that time, by the British
National Film Archives. The Waterloo Media Cataloguing System is
based on this system, with extensive modifications to permit
efficient use in a computerized environment. However, the basic
philosophy behind the two systems is the same; only the means of
recording the data and the means of accessing it are essentially
different.
3.0 Selection of SPIRES
The next step was to choose an appropriate method of
retrieving the information from the computer. The existing
retrieval methods available on our campus were primarily of the
sequential search variety. We did not want to use this method
due to the high costs involved when searching large databases.
Therefore, we endeavored to find a system using some form of
tree-structured indexing. We also determined at an early stage
that the primary access point to the system would be online and
that hardcopy catalogues would be of secondary importance. We
came to this conclusion because of the unusually high
availability of computer access at the University of Waterloo.
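The cost argument for tree-structured indexing over sequential
search can be made concrete with a little arithmetic. The sketch
below is generic, not a description of SPIRES internals; the file
size and the number of entries per index block (the fanout) are
hypothetical values chosen only for illustration.

```python
def tree_levels(record_count, fanout):
    """Levels needed for a tree index with the given fanout to cover
    record_count entries (integer arithmetic avoids rounding error)."""
    levels, capacity = 1, fanout
    while capacity < record_count:
        capacity *= fanout
        levels += 1
    return levels

def average_sequential_reads(record_count):
    """Expected reads to find one record by scanning, assuming the
    target is equally likely to be anywhere in the file."""
    return (record_count + 1) / 2

records = 100_000          # hypothetical file size
fanout = 50                # hypothetical entries per index block
print(tree_levels(records, fanout))
print(average_sequential_reads(records))
```

Under these assumptions a tree lookup touches a handful of index
blocks while a scan reads, on average, about half the file, and
the gap widens as the file grows.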
The results of our search indicated that the best choice was the
Stanford Public Information REtrieval System, or SPIRES for
short. SPIRES was initially chosen by the University of Waterloo
for the WatMedia project because nothing else was available that
had the potential to handle our projected requirements. One of
the main factors which influenced our decision was the ability of
SPIRES to handle large multiple indexes efficiently, something
that competing systems could not do. SPIRES also allowed the
system designer to modify the file definition for the database
without necessarily having to rebuild the whole database.
Despite this "Hobson's Choice" we have never regretted the
decision. Only now are some of SPIRES's features being
implemented by other systems, and some features, like remote
access capability, have yet to be implemented by these systems.
+ Page 91 +
4.0 The WatMedia Database
The original WatMedia database has been expanded to become a
union catalogue for the twenty-two universities, colleges and
institutes of the Interfilm Group. It also contains extensive
listings of the holdings of commercial distributors and
libraries.
The catalogue is basically a title main entry format with a
number of classed and alphabetical indexes: (1) title catalogue;
(2) subject indexes; and (3) biographic index and analytic index.
These form the permanent catalogue, but preliminary catalogue
data are also maintained.
In the preliminary catalogue, whatever information about an item
can be readily obtained--accurate or inaccurate--is immediately
entered. As soon as possible the item is viewed, further
information is obtained, the existing information verified, and
the record is modified and placed in the permanent catalogue.
The rules governing entry are the same for both catalogues. In a
sense an entry is never complete. As more information on a
particular item or person becomes necessary, records that may
not have been touched for years sometimes need to be updated.
This is particularly true when persons who were involved in a
production in a minor role become important as their careers
develop; the indexes must then be adjusted so that as complete
a filmography as possible can be retrieved for that person.
The fundamental difference between most other published rules and
the rules we use is the way title entries are handled. Since
nonprint materials are most commonly identified by title, we feel
that they should be entered under title. The preliminary rules
of the Library of Congress and UNESCO recommended that each
language version of an item should be entered under the title of
the version in hand, which follows the recognized procedure for
book cataloguing. However, it is felt that the title and credit
frames of nonprint items (film and video in particular) cannot be
treated with the respect traditionally accorded to the title page
of a book, since they may be in any language and subject to no
recognized principles of accuracy. Therefore we enter all
materials under the original title of release and, so far as
possible, in the language of origin. This principle has also
been adopted by the Aslib committee and is recognized by the
International Federation of Film Archives (FIAF).
+ Page 92 +
In order to have a system which is largely compatible with a
recognized international standard, these rules have been
developed from those used by the British Film Institute's
National Film Archives, which have been adopted by many other
national film archives around the world. Philosophically, our
rules remain substantially unaltered from the original British
rules, but they still cannot be considered as definitive.
Further revision may be necessary as new technical developments
appear.
Since the first prototype version was produced in 1975, many
procedures that were originally designed for manual systems have
been rethought for the computer's online environment.
Discussions with librarians and archivists in some 23 countries
have resulted in a major change in the handling of items. Title
main entries are still used; however, instead of making separate
entries for each copy of each title or version of a title, we
make a generic entry covering the original version, with separate
collations for each copy of each version in the same record.
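The generic-entry arrangement, one record per title holding
every version and every copy, can be sketched as a nested data
structure in Python. The class and field names below are
hypothetical and do not reflect the actual WatMedia file
definition; the holdings shown are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Copy:
    holder: str        # institution holding this copy (hypothetical)
    medium: str        # e.g. "16mm film" or "videocassette"

@dataclass
class Version:
    title: str         # title of this version
    language: str
    copies: list = field(default_factory=list)

@dataclass
class GenericEntry:
    original_title: str            # title main entry
    versions: list = field(default_factory=list)

    def all_copies(self):
        """Yield (version title, copy) for every copy in the record."""
        for version in self.versions:
            for copy in version.copies:
                yield version.title, copy

entry = GenericEntry(
    original_title="La Grande Illusion",
    versions=[
        Version("La Grande Illusion", "French",
                [Copy("Library A", "16mm film")]),
        Version("Grand Illusion", "English",
                [Copy("Library A", "videocassette"),
                 Copy("Library B", "16mm film")]),
    ],
)
print(len(list(entry.all_copies())))
```

Keeping all versions under the original title of release means a
single lookup returns every language version and every holding,
rather than one catalogue entry per copy.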
It should be noted that WatMedia was designed for a university
academic and research library situation and, as such, is much
more elaborate and comprehensive than any such system required by
a public library or lower school application.
5.0 Searching WatMedia
To search the WatMedia database, the user sends a "find" command
to the system. The basic syntax of this command is:
find [index name] [value]
For example, to find the film Citizen Kane, the user would enter:
find title Citizen Kane
+ Page 93 +
Table 1 shows the basic indexes that are available.
-----------------------------------------------------------------
Table 1. Some Selected Indexes
-----------------------------------------------------------------
Index            Valid Index Names
Title            T, TI, TIT, TITL, TITLE
Subject          SB, SUB, SUBJ, SUBJECT (Synonym)
Dewey Decimal    DC, DCL, DCLASS
Person           NAME, PERSON
Place            COUNTRY, PL, PLACE
Distributor      D, DIST, DISTR, DISTRIBUTOR
Sponsor          SP, SPON, SPONSOR
Audience         AUDIENCE, LEVEL, TARGET
Language         LANG, LANGUAGE
-----------------------------------------------------------------
The system has a rich assortment of other searching capabilities;
however, this topic is beyond the scope of the current article.
Users can send the "manual" command to retrieve WatMedia's user's
guide.
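How a "find" command might be split into an index and a value,
with the alternative index names from Table 1 resolved to a
single index, can be sketched in Python. The index names are
copied from Table 1; the parsing rules, including the
case-insensitive matching, are assumptions and not a description
of how SPIRES itself interprets the command.

```python
# Index names copied from Table 1; the parsing rules are assumed.
INDEX_NAMES = {
    "title": ["T", "TI", "TIT", "TITL", "TITLE"],
    "subject": ["SB", "SUB", "SUBJ", "SUBJECT"],
    "dewey decimal": ["DC", "DCL", "DCLASS"],
    "person": ["NAME", "PERSON"],
    "place": ["COUNTRY", "PL", "PLACE"],
    "distributor": ["D", "DIST", "DISTR", "DISTRIBUTOR"],
    "sponsor": ["SP", "SPON", "SPONSOR"],
    "audience": ["AUDIENCE", "LEVEL", "TARGET"],
    "language": ["LANG", "LANGUAGE"],
}
ALIASES = {alias: index for index, names in INDEX_NAMES.items()
           for alias in names}

def parse_find(command):
    """Split 'find [index name] [value]' into (index, value)."""
    parts = command.split(None, 2)
    if len(parts) != 3 or parts[0].lower() != "find":
        raise ValueError("expected: find [index name] [value]")
    index = ALIASES.get(parts[1].upper())
    if index is None:
        raise ValueError(f"unknown index name: {parts[1]}")
    return index, parts[2].strip()

print(parse_find("find title Citizen Kane"))
print(parse_find("FIND ti Citizen Kane"))
```

Both calls resolve to the same index, which illustrates why
several valid names per index make the command forgiving for
occasional users.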
6.0 Access to WatMedia via Remote SPIRES
A relatively recent addition to SPIRES is a utility called Remote
SPIRES. This tool was originally a set of CMS execs and XEDIT
macros; however, once its viability had been demonstrated, it was
recoded in SPIRES' own procedural language.
From a developer's perspective, Remote SPIRES is relatively
straightforward to implement. Any application can be installed
as a remotely accessible database. The inquiry language used for
remote queries is the same as the inquiry language used for local
queries, so anyone familiar with SPIRES should be able to use a
remote application with little or no training. Those not already
familiar with SPIRES should be pleased to see that the inquiry
language is not terribly arcane.
Remote SPIRES implementers are encouraged to support at least the
two standard views of the data, brief and full. If this suggested
standard is adhered to (as WatMedia does), a user familiar with
one Remote SPIRES application can easily use a different
application.
+ Page 94 +
For a SPIRES application to become remotely accessible, it is
necessary that it be installed in a server. WatMedia currently
has a dedicated server named, appropriately enough, "WatMedia."
While it is possible for the system administrator to authorize
all users at all nodes, we do not do that. Such authorization
disables the transaction accounting features of the remote server
and, at this point in time, we wish to have these statistics.
Using wildcards, everyone at a given network node can be
authorized, but this limitation does mean that anyone seeking
access must first have their machine made known to the server.
Anyone needing Remote SPIRES access to WatMedia can contact the
author for authorization.
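Wildcard authorization by network node can be sketched in Python
with the standard fnmatch module. The user@node address form, the
node names, and the pattern list are hypothetical; the article
does not describe the actual format of the Remote SPIRES
authorization list.

```python
from fnmatch import fnmatchcase

# Hypothetical authorization list in user@node form; "*" is a wildcard.
AUTHORIZED = ["*@NODEA", "JSMITH@NODEB"]

def is_authorized(address, patterns=AUTHORIZED):
    """True if the address matches any authorized pattern."""
    address = address.upper()
    return any(fnmatchcase(address, pattern) for pattern in patterns)

print(is_authorized("anyone@nodea"))
print(is_authorized("jsmith@nodeb"))
print(is_authorized("other@nodeb"))
```

Listing nodes individually, rather than authorizing every user
at every node with a single catch-all pattern, is the choice that
keeps per-node accounting meaningful.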
Authorized users with an account on a computer connected to
BITNET can search WatMedia either with interactive messages or e-
mail messages. Users on other networks can use e-mail messages.
If the user sends an e-mail message, the message should only
contain a one-line command, such as "find title Blade Runner."
It should be noted that the server's default method of returning
information to the requester is via an e-mail message.
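The one-line-command convention for e-mail queries can be
sketched in Python with the standard email module. This is an
assumption about how a server might check an incoming message,
not the Remote SPIRES implementation; the addresses are
placeholders.

```python
from email import message_from_string

def extract_command(raw_message):
    """Return the single command line from a plain-text e-mail message.

    Raises ValueError unless the body holds exactly one non-blank line.
    """
    message = message_from_string(raw_message)
    body = message.get_payload()
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    if len(lines) != 1:
        raise ValueError("message must contain exactly one command line")
    return lines[0]

# Hypothetical message; the addresses are placeholders.
raw = (
    "From: user@example.edu\n"
    "To: server@example.edu\n"
    "Subject: query\n"
    "\n"
    "find title Blade Runner\n"
)
print(extract_command(raw))
```

Rejecting messages with extra lines, such as a signature block,
is one way to keep the server from treating stray text as a
command.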
7.0 Conclusion
The WatMedia system provides its users with dramatically improved
access to information about the nonprint holdings of the members
of the Interfilm Group. The system was developed using the
SPIRES software, and this software has proven itself capable of
meeting the evolving software development needs of the WatMedia
Project.
+ Page 95 +
About the Author
Mark Ritchie
University of Waterloo Library
200 University Ave. W
Waterloo, Ontario N2L 3G1
Canada
(519) 888-4070
BITNET: avfilm@watdcs.UWaterloo.ca
-----------------------------------------------------------------
This article is Copyright (C) 1990 by Mark Ritchie. All
Rights Reserved.
----------------------------------------------------------------
+ Page 4 +
-----------------------------------------------------------------
Troll, Denise A. "Library Information System II: Progress Report
and Technical Plan." The Public-Access Computer Systems Review
1, No. 3 (1990): 4-29.
-----------------------------------------------------------------
-----------------------------------------------------------------
Note from the Editor:
This article has been condensed from a Carnegie Mellon University
Libraries technical report--Library Information System II:
Progress Report and Technical Plan, Mercury Technical Report
Series, Number 3. To obtain a copy of the full printed report,
send a check for $5 to: Mercury Documents Coordinator,
Administrative Offices, Carnegie Mellon University Libraries,
Frew Street, Pittsburgh, PA 15213.
-----------------------------------------------------------------
Abstract
This article describes the work at Carnegie Mellon University in
library automation and information retrieval systems. Specific
projects include: broadening the range of electronic
bibliographic resources by adding databases and expanding the
range of stand-alone CD-ROM databases; deepening access to book
resources by enhancing catalog records, and adding contents
information for scientific and technical proceedings and book
reviews to the online catalog; designing a new library
information system (LIS II) on a hardware and software platform
that demonstrates the feasibility of distributed library systems
running on UNIX workstations; and building image databases for
the delivery of full-text documents.
The Library Information System II provides for retrieval from
several DEC VAX servers using Z39.50 layered on TCP/IP, a search
engine from OCLC called Newton, a pilot user interface in OSF
X.11 Motif, and an authentication system based on Kerberos and
Hesiod developed at MIT. The system is being built to existing
and proposed standards, and it is designed to be machine
independent. A system which distributes databases over a number
of file servers will thus be affordable to a wide range of
libraries.
This article addresses a number of technical and design issues and
concludes with an outline of the research and development agenda
for the coming year.
+ Page 5 +
1.0 Background
In 1988, Carnegie Mellon proposed building the Library
Information System II, a state-of-the-art electronic library
capable of delivering a broad range of bibliographic and textual
information to students and scholars. LIS II would be the
second-generation successor to the highly successful Library
Information System currently in place in the University Libraries.
In addition to support from the Pew Memorial Trust, the LIS II
project also receives support from the Digital Equipment
Corporation, the American Association for Artificial
Intelligence, the Online Computer Library Center (OCLC), and
Carnegie Mellon University.
1.1 General Goals
The four major goals for LIS II are: (1) to expand the breadth
and depth of library information available over the campus
network, focusing first on expanded coverage of bibliographic
information and later on the delivery of the full text of
documents; (2) to provide more information about the contents of
books by indexing and retrieving the table of contents; (3) to
use the capabilities of advanced workstations to improve
retrieval and interfaces and to reduce the cost of a large-scale
retrieval system; and (4) to document and disseminate the results
of our work so that if we are successful, our innovations can be
diffused within academia.
This report discusses progress toward each of these goals.
1.2 General Architecture
Moving information retrieval from a mainframe computer to
multiple server machines requires considerable planning and
changes in hardware and software. A special computer will be
used to build LIS II databases, and special machines will be used
as database or retrieval servers. All computers on the campus
network or with access to the campus network will have access to
LIS II. Workstations and X Windows terminals in the University
Libraries and workstations in offices and public computing
clusters on campus will run the graphical interface currently
being built. Users of other personal computers, like the IBM PC
and Apple Macintosh, will run a terminal interface similar to the
current LIS I interface.
+ Page 6 +
2.0 Improving Electronic Resources
As the new hardware and software platform is being designed and
developed, we are making significant improvements in our
electronic resources. We are expanding resources in the existing
Library Information System, adding stand-alone databases on CD-
ROM, and providing more information about the contents of books
we acquire.
To expand the breadth of our electronic collection, we have
purchased databases from commercial vendors, and are exploring
the production of databases from local resources. We are also
negotiating with publishers to acquire machine-readable journals
and technical reports. To expand the depth of our collection, we
have designed and implemented several projects to enhance our
catalog records for books and technical reports. Each of these
developments is discussed briefly below. Additions to the
collection are made available to campus as quickly as possible
through the current Library Information System, LIS I, so that
usage and impact can be monitored and the findings can contribute
to the design of LIS II.
2.1 Expanding the Breadth of the Electronic Collection
We have broadened the scope of our electronic collection by
purchasing commercial databases, by acquiring machine-readable
text to be mounted locally as databases, and by designing a
system architecture that will facilitate the integration of
locally produced databases, e.g., Carnegie Mellon administrative
databases, into LIS II.
+ Page 7 +
2.1.1 Commercial Databases
To make the best use of our human resources, while developing the
distributed retrieval architecture detailed in Section 3, we
limited the addition of commercial databases available through
the Library Information System to those needed for user tests and
planning. We purchased INSPEC (Information Services for Physics,
Electronics, and Computing), 1987-present, on magnetic tape and
released it to campus (LIS I) in November 1989. INSPEC
corresponds to four printed publications: Physics Abstracts,
Electrical and Electronic Abstracts, Computer and Control
Abstracts, and Update on Information Technology (IT Focus).
INSPEC was well received by the physics and engineering
communities at Carnegie Mellon. More than 1,700 searches were
conducted in this database in May 1990, with an average of 1,900
searches per month since January. Transaction logs of INSPEC
searches were used to construct a model of how users search a
large, complex database (see Section 3.1.1.4 "Search Complexity
and Performance" for details).
In the interest of immediate improvements in resource
availability and recognizing that not all databases need to be
online on the campus network, we expanded our electronic
resources by acquiring a number of CD-ROM products. Eventually
we want to provide network access to CD-ROM databases, with the
delivery mechanism transparent to the user. The following CD-
ROMs have been added to the University Libraries' collection
since July 1988.
+ Page 8 +
-----------------------------------------------------------------
Table 1. CD-ROM Databases Added Since July 1988
-----------------------------------------------------------------
CIRR (May 1990)
Bibliographic citations and abstracts of company and industry
research reports provided by securities and investment firms.
Art Index (April 1990)
Bibliographic citations of journal articles, yearbooks, and
museum bulletins in all areas of art.
Compact Disclosure (April 1990)
Financial and management information on public companies.
COMPENDEX (April 1990)
Citations of articles, conference papers, and monographs in all
aspects of engineering and related areas.
PAIS (April 1990)
Bibliographic citations of journal articles, books, and
government documents in public affairs.
COMPUSTAT (March 1990)
Financial and statistical information on public companies.
CD-MARC (October 1989)
Library of Congress subject authority file and subject headings.
MathSci (October 1989)
Reviews and citations of the world's research literature in
mathematics and related areas.
NTIS (September 1989)
Bibliographic citations and abstracts of government-sponsored
research and development reports.
-----------------------------------------------------------------
+ Page 9 +
The following CD-ROMs are also available.
-----------------------------------------------------------------
Table 2. Other CD-ROM Databases
-----------------------------------------------------------------
CIS Masterfile (Test Copy)
Bibliographic citations and abstracts of congressional
publications.
Statistical Masterfile (Test Copy)
Bibliographic citations and abstracts of statistical information
from various publishers.
Social Science Citation Index (Test Copy)
Bibliographic citations of journal articles in the social
sciences.
PsycLit (March 1988)
Journal article citations and abstracts in all areas of
psychology.
ABI/Inform (January 1988)
Journal article citations and abstracts on business.
Dissertation Abstracts OnDisc (August 1987)
Bibliographic citations and abstracts of dissertations in all
subject areas.
Books In Print Plus (July 1987)
Bibliographic citations of books (in print and forthcoming) in
all subject areas.
ERIC (July 1987)
Bibliographic citations and abstracts of journal articles and
research reports in education.
-----------------------------------------------------------------
+ Page 10 +
2.1.2 Machine-Readable Text
In preparation for experiments with the delivery of full-text
documents, we are acquiring machine-readable journals and
technical reports in the subject field of computer science. We
have negotiated with several leading publishers to include their
materials online. Elsevier, Pergamon, and the Association for
Computing Machinery (ACM) are willing to give us access to their
materials. The ACM has committed to providing machine-readable
versions of four of its publications: Computing Reviews (10
years), Collected Algorithms (25 years), Communications (2
years), and Guide to Computing Literature (10 years). We have
been approached by the Institute of Electrical and Electronics
Engineers (IEEE) to provide storage and access to their entire
collection of journal page images, over 30 CD-ROMs per year,
indexed through INSPEC, and are working on electronic publishing
with the American Association for Artificial Intelligence (AAAI).
In addition, we are working with MIT, Stanford University,
University of Illinois, and the University of California to
collect machine-readable computer science technical reports (see
Section 3.4 "Developing Standards and Sharing Resources" for
details). These materials will be mounted locally as databases.
2.1.3 Local Databases
The success of the Library Information System (LIS I) has
stimulated the demand for more online access to campus
information. In response to this need, the University Libraries
have set the goal of becoming a general electronic publisher for
Carnegie Mellon. We intend to provide online full-text databases
of campus information and online ordering of specific services
(e.g., ordering textbooks or audio-visual equipment and putting
books on reserve) to create an infrastructure for improving
support for instruction in the University. As a first step in
this direction, we mounted the Faculty/Staff Directory and the C-
Book (the student directory) as a database called Who's Who at
CMU and released it to campus (LIS I) in February 1989; Who's Who
accounts for approximately 8-11% of all searches in LIS, or
5,000 to 8,000 searches per month, during the academic year. Plans
to mount additional full-text databases are discussed in Section
3.2.2 "Full-Text Databases."
+ Page 11 +
2.2 Expanding the Depth of the Electronic Record
Bibliographic records, originally designed for card catalog use,
continue to be the primary access to book collections for users
of online catalogs. However, research indicates that the new
technology has changed information-seeking behavior, with the
result that users are essentially using new search strategies
with old information structures. For example, users do more
subject searching in online catalogs than they did in card
catalogs, and are finding the information in bibliographic
records inadequate to their needs--it is often insufficient to
retrieve the record or to judge the book's relevance even if the
record is retrieved. According to Richard Van Orden, enriching
catalog records with information about the content of books may
be the next major improvement in information retrieval. Enhanced
information can expedite both the remote selection of material
and document delivery. The ultimate purpose of catalog
enhancements is "the timely provision of selected full-text
materials to individuals when and where they need them." [1]
Adding information about the content of books to our online
catalog will increase the number of records retrieved and allow
users to make better judgments about the value of a book for
their particular query. University Libraries have several
projects underway to expand the depth of content information
available in the online catalog. Some record enhancements have
been done entirely in-house and released to campus in LIS I. Two
other enhancements have been acquired from commercial vendors and
implemented but not yet released to campus: book reviews from
Choice, and analytics for books and conference proceedings from
ISI (the Institute for Scientific Information).
2.2.1 In-House Catalog Enhancements
Barbara Richards, Alice Bright, and Terry Hurlbert implemented
the Online Catalog Enhancements Project in the spring of 1989.
The first stage of the project thoroughly examined sample
contents pages to determine which kinds of material and how many
of each kind should be included in an enhancement project, and to
assess the problems that might occur. Based on this review, the
cataloging staff established criteria for enhancing books using
definitions of works to be included and works to be excluded; the
criteria are discussed below. The review suggested that,
provided scientific and technical conference proceedings were
excluded, only 25-30% of the new books purchased would qualify
for adding table of contents information.
+ Page 12 +
2.2.1.1 Criteria for Enhancement of Catalog Records
o If the contents of a book can be cited separately, then the
record is enhanced. Anthologies of plays, collections of
critical essays written by different authors, and separately
authored chapter titles are three categories of enhanced books.
However, proceedings of scientific and technical conferences are
excluded from this enhancement for two reasons. First, the
length of the tables of contents may exceed a hundred titles,
requiring extensive inputting of data, and second, alternative
electronic sources, like INSPEC, can provide this information.
However, we are placing a flag in conference proceedings catalog
records to indicate that the items could be enhanced.
o If the chapter titles within a book provide valuable
information about the contents that is not already provided by
keywords in the title or subject headings, then the record is
enhanced. This category includes chapter titles that delineate
historical time periods. Books for which words in the title and
supplied subject headings already provide appropriate and
sufficient access are excluded. If no unique keywords exist in
the contents to improve the description of the monograph beyond
the standard cataloging information, then the record is not
enhanced; this decision is made by the cataloger.
o If a monograph is an exhibition catalog, then the record is
enhanced for each exhibitor whose work is included in the
exhibition, with the exception that any exhibition catalog
containing more than 25 artists is not enhanced. We are placing
a flag in records of exhibition catalogs with more than 25
artists to indicate that the items could be enhanced.
o If a Carnegie Mellon computer science or EDRC (Engineering
Design Research Center) technical report has an author-supplied
abstract less than one page in length, then the record is
enhanced by adding the abstract. If the abstract is longer than
one page, then the record is not enhanced.
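
The four criteria above can be read as a small decision
procedure. The sketch below is one possible encoding, not the
cataloging staff's actual workflow: the Item fields, the kind
labels, and the three outcomes ("enhance", "flag", "skip") are
names invented for illustration, and the cataloger's judgment
about chapter titles is reduced to a single yes/no input.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # One of: "monograph", "sci_tech_proceedings",
    # "exhibition_catalog", "tech_report" (CMU computer science or EDRC).
    kind: str
    separately_citable: bool = False     # contents can be cited separately
    chapters_add_keywords: bool = False  # cataloger's judgment
    artist_count: int = 0
    abstract_pages: float = 0.0          # 0 means no author-supplied abstract


def enhancement_decision(item):
    """Return "enhance", "flag", or "skip" for one item."""
    if item.kind == "sci_tech_proceedings":
        return "flag"  # excluded, but flagged as a candidate
    if item.kind == "exhibition_catalog":
        return "enhance" if item.artist_count <= 25 else "flag"
    if item.kind == "tech_report":
        # Assumption: an abstract of exactly one page is accepted.
        return "enhance" if 0 < item.abstract_pages <= 1 else "skip"
    if item.separately_citable or item.chapters_add_keywords:
        return "enhance"
    return "skip"


print(enhancement_decision(Item("exhibition_catalog", artist_count=26)))
```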
+ Page 13 +
2.2.1.2 Catalog Enhancement Projects
Three enhancement projects were undertaken in-house. The first
project, the only review of existing catalog records, is a
special service for the Drama and English departments at Carnegie
Mellon, which have a great demand for plays. This project is
adding contents notes (MARC field 505) or added entries (MARC
fields 700 and 740) for plays in collections with different
authors or the same author. The project was begun by reviewing
catalog records for American and English drama; 3,857 catalog
records were reviewed and 635 works of collected plays were
enhanced. The project is continuing with review of Scandinavian,
Italian, Latin, Spanish and French drama.
The second project is adding contents notes (MARC field 505) to
the records of newly acquired books with separately authored
chapters or chapter titles with valuable keyword information (not
provided in the title or subject headings), and to art exhibition
catalogs with 25 or fewer artists. To date, 1,187 records have
been enhanced. We are flagging records that should be enhanced
but are currently not being enhanced, e.g., art exhibition
catalogs with more than 25 artists, conference proceedings, and
unanalyzed series. Enhancing recently purchased books that meet
the criteria for enhancement is an ongoing project.
The third enhancement project is adding abstracts (MARC field
520) to CMU computer science and EDRC (Engineering Design
Research Center) technical reports. To date, 1,649 of the total
1,832 technical reports cataloged have been enhanced. The
technical reports that were cataloged but not enhanced either had
no abstract or the abstract exceeded one printed page in length.
The Online Catalog Enhancements Project has enhanced a total of
3,471 catalog records since October of 1989. Though the project
is ongoing, a sufficient number of records have been enhanced and
made available online in LIS I to begin studying the effects of
these records on retrieval and browsing, i.e., on users' access
to information and their ability to discriminate between relevant
and irrelevant information. We are collaborating with OCLC to
investigate the effects of these catalog enhancements (see
Section 4.3 "Research Plans" for a brief overview of our plans).
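
The project total can be checked against the three per-project
counts reported above. The short calculation below uses only
figures stated in this section; the variable names are
illustrative.

```python
# Records enhanced by each of the three in-house projects.
enhanced = {
    "collected plays": 635,
    "new books and exhibition catalogs": 1187,
    "technical reports": 1649,
}
total_enhanced = sum(enhanced.values())

# Share of cataloged technical reports that received an abstract.
tech_reports_cataloged = 1832
tech_report_share = enhanced["technical reports"] / tech_reports_cataloged

print(total_enhanced)               # 3471
print(f"{tech_report_share:.1%}")   # 90.0%
```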
+ Page 14 +
2.2.1.3 Sharing Enhanced Catalog Records
At the present time, the contents information input by Carnegie
Mellon Library staff is only useful to our clientele. The
enhanced records created in this project, although created on the
OCLC system, are not available to other libraries. Discussions
led by Tom Michalak at the February and May 1990 Users Council
meetings at OCLC suggest that while many libraries are interested
in the potential of enhanced catalog records, support for
including records "enhanced" with contents information in the
OCLC cataloging system is not yet widespread. However, it seems
reasonable that OCLC should allow the contents information input
by member libraries to be made available to other libraries who
may wish to add such information to their catalog records.
Unquestionably there will be technical problems which will have
to be solved if libraries are to share enhanced records, and
Carnegie Mellon will continue to raise the issue of sharing
enhanced records in national databases.
2.2.2 Commercial Catalog Enhancements
Though our in-house record enhancement projects address certain
information needs, technical and financial constraints limit what
we can do in-house. For example, works with several hundred
author-title entries, like conference proceedings, are too costly
for an individual library to catalog and the resulting records
with contents notes are too large for current systems to handle.
One alternative is to purchase analytic records for these items
from a commercial vendor and merge these with the Library
Catalog. We have two projects of this type underway; the effects
of these enhancements will be evaluated along with the in-house
record enhancements (see Section 4.3).
+ Page 15 +
2.2.2.1 CHOICE Catalog Enhancements
Choice is a basic book reviewing service for academic and public
libraries, emphasizing scholarly titles in its reviews. Choice
reviews are available in machine-readable form. Current plans
are to make selected records from the Choice database,
specifically those that review books in our collection,
searchable in our Library Catalog. These records will be
searchable along with the catalog records, so that a search for a
book title, for example, will retrieve two records--the catalog
record for the book and the Choice record with the book review.
We modified the Choice records slightly for inclusion in the
Catalog. For example, we removed the prices for hardback and
paperback purchases, and appended the HOLDINGS field from the
Library Catalog record for the book being reviewed to the Choice
record reviewing that book. At present, we estimate the addition
of 4,000 records to the Catalog using this enhancement for the
past three years of the Choice database.
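
The record changes described above can be sketched as follows.
Records are modeled as plain dictionaries. HOLDINGS is the field
named in the text; every other field name, the sample data, and
both helper functions are assumptions made for illustration and
do not reflect the actual Choice or Library Catalog formats.

```python
def prepare_review_record(review, catalog_record):
    """Return a copy of a review record adapted for the catalog."""
    # Drop the price fields (field names are invented here).
    prepared = {field: value for field, value in review.items()
                if field not in ("PRICE-HARDBACK", "PRICE-PAPERBACK")}
    # Append the holdings of the book being reviewed.
    prepared["HOLDINGS"] = catalog_record["HOLDINGS"]
    return prepared


def title_search(records, word):
    """Return every record whose title contains the word."""
    return [r for r in records if word.lower() in r["TITLE"].lower()]


# Invented sample data.
catalog = {"TITLE": "An Example Monograph", "HOLDINGS": "Main Library"}
review = {"TITLE": "An Example Monograph", "REVIEW": "Sample review.",
          "PRICE-HARDBACK": "$30.00", "PRICE-PAPERBACK": "$15.00"}

prepared = prepare_review_record(review, catalog)
hits = title_search([catalog, prepared], "example")
print(len(hits))  # 2: the catalog record and the review record
```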
The decision to provide searchable book review records, rather
than a hypertext link between the Catalog and Choice records that
could be traversed once the bibliographic record was displayed,
was a conscious one; its impact on retrieval will have to be
measured. We assume that searching book review records with
catalog records will facilitate recall of materials, but we do
not know if it will facilitate precision and relevance judgments.
We will do a cost-benefit analysis after releasing the Choice
records to campus. Perhaps later, as an additional test of
usage, we will release the entire Choice database as a separate
database in LIS II.
2.2.2.2 ISI Catalog Enhancements
Similar to the Choice enhancement project, we plan to include
selected ISI (Institute for Scientific Information) analytic and
full records, for books and conference proceedings in science and
engineering, in the Library Catalog. Again, we appended the
HOLDINGS field from the Library Catalog record for the item
indexed in ISI to the ISI record for that item. The analytic
records will be searchable, and have a hypertext link to the
associated full record with table of contents, which will be
displayable from any analytic record. In contrast to the Choice
project, where the review record was searchable along with the
catalog records, we chose not to make the ISI full table of
contents records searchable because all of the information they
contain is available in the individual analytic records. We
estimate the addition of 15,000 analytic records to the Library
Catalog using this enhancement, indexing approximately 1,000
scientific and technical conference proceedings.
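
The arrangement described above, in which analytic records are
searchable and each one links to a full record that is displayed
but not indexed, can be sketched as follows. The sample data,
the field names other than HOLDINGS, and the function names are
assumptions made for illustration.

```python
# Invented sample data: one full record and two analytic records.
full_records = {
    "P1": {"TITLE": "Proceedings of an Example Conference",
           "CONTENTS": ["Paper on Indexing", "Paper on Retrieval"],
           "HOLDINGS": "Main Library"},
}
analytic_records = [
    {"TITLE": "Paper on Indexing", "AUTHOR": "Author A", "FULL": "P1"},
    {"TITLE": "Paper on Retrieval", "AUTHOR": "Author B", "FULL": "P1"},
]


def search_analytics(word):
    """Search only the analytic records; full records are not indexed."""
    return [a for a in analytic_records
            if word.lower() in a["TITLE"].lower()]


def follow_link(analytic):
    """Fetch the full record linked from an analytic record."""
    return full_records[analytic["FULL"]]


hits = search_analytics("retrieval")
print(follow_link(hits[0])["CONTENTS"])
```

By the estimates given above, 15,000 analytic records for about
1,000 proceedings works out to roughly 15 analytic records per
full record.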
+ Page 16 +
3.0 Retrieval System Development
The technical goal of LIS II is to produce an affordable library
information system for networked campuses, which are evolving
across the nation and the world. Realistically, if libraries are
to deliver documents to scholars at their desks, the storage,
retrieval, and delivery of information must be cost effective.
Furthermore, if libraries are to share electronic resources like
enhanced records, we need a communication protocol that supports
shared access to information. The goal is to build, not an
experimental system, but a hardware and software platform that
demonstrates the affordability and usability of the system for
campuses of any size. Success depends on establishing standards.
The LIS II development team is committed to using established
standards, and when development mandates changing or extending
standards, to do so within the proper forum for implementing such
standards. See Section 3.4 "Developing Standards and Sharing
Resources" for details.
LIS II is based on the Andrew system at Carnegie Mellon,
developed in a partnership with IBM. Named for both Andrew
Carnegie and Andrew Mellon, Andrew encompasses the campus
network--in reality a network of more than fifty local area
networks, a distributed file system with hundreds of file
servers, and thousands of high-function workstations.
Workstations facilitate working with multiple applications by
providing a window for each application and a window manager to
manipulate the application windows, which can be tiled or stacked
to produce a two- or three-dimensional workspace, or iconified
(shrunk to a graphic) to clear the electronic desktop. These
features provide a common user interface to network services,
including electronic mail and bulletin boards, printing, and
access to the Library Information System. Users can also access
the Internet from Andrew, extending their research and
collaborative efforts beyond Carnegie Mellon.
+ Page 17 +
The Open Software Foundation (OSF), a non-profit research and
development company sponsored by many of the world's major
computer firms, recently incorporated the Andrew File System
(AFS) into its Distributed Computing Environment (DCE),
indicating the acceptance of AFS as a distributed file system
standard. OSF distributes a software toolkit and interface style
guide that, packaged with the mwm (X.11) window manager, comprise
the graphical user interface standard called Motif. Motif has
achieved wide acceptance as a standard among hardware and
software vendors, and the body of applications implemented with
the Motif toolkit, running under mwm, and conforming to the Motif
style specifications is growing. Carnegie Mellon has adopted
Motif as the campus standard, and the Motif Window Manager (mwm)
will be the default window manager for workstations in Fall
1990. The LIS II development team has adopted Motif as the
library standard for user interface design. The result will be a
single interface that brings together local applications and
services with new third-party software, running across a wide
range of machines.
The following two paragraphs provide an overview of our current
status and future plans. The rationale and details of each phase
of the project are discussed in the sections that follow.
To date, we have created a reasonable model for libraries to
share resources under a common interface and demonstrated that
the OSI Z39.50 protocol can work across separate servers. The
Z39.50 information retrieval protocol allows an application on
one computer to query a database on another computer; it
specifies procedures and structures for submitting searches,
transmitting database records, and access and resource control.
An alpha version of basic software components for LIS II was
demonstrated at the EDUCOM conference in October 1989. This
demonstration included retrieval from several servers across the
NSFnet using Z39.50 layered on TCP/IP, a new retrieval system
from OCLC called Newton, and a pilot user interface for
workstations written in DECwindows. Since then we have added a
generalized authentication scheme based on the Kerberos system,
converted the user interface to OSF X.11/Motif, and begun name
service using Hesiod.
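
The division of labor described above, in which a client submits
a search to a server and then asks for records from the result,
can be illustrated with a toy in-process model. This is not the
Z39.50 wire protocol and it involves no networking; the ToyServer
class, its method names, and the sample data are invented for
illustration only.

```python
class ToyServer:
    """Stands in for one database server; not a real Z39.50 target."""

    def __init__(self, name, records):
        self.name = name
        self.records = records
        self.result_sets = {}

    def search(self, set_name, term):
        """Run a search, keep the result set, and return the hit count."""
        hits = [r for r in self.records if term.lower() in r.lower()]
        self.result_sets[set_name] = hits
        return len(hits)

    def present(self, set_name, start, count):
        """Return `count` records from a stored set, from 1-based `start`."""
        return self.result_sets[set_name][start - 1:start - 1 + count]


# One client-side loop queries two separate servers the same way.
servers = [ToyServer("catalog", ["Motif Programming", "UNIX Networking"]),
           ToyServer("reports", ["Networking Standards Report"])]
counts = {s.name: s.search("default", "networking") for s in servers}
first = servers[0].present("default", 1, 1)
print(counts, first)
```

The point of the sketch is that the client code does not change
from one server to the next, which is what allows databases to be
distributed over several machines behind a common interface.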
+ Page 18 +
Meanwhile, work has continued on the next phase of the project.
By the 1990 EDUCOM conference, we hope to be able to demonstrate
storage, retrieval and display of bitmapped images using Fax
Group 4 formats. The first work with compound documents, using
SGML (Standard Generalized Markup Language) and CDA (Compound
Document Architecture), will follow shortly thereafter. During
the next year, we will implement LIS II on a new generation of
small RISC servers supported by major vendors. This will bring
the price of a minimal campus retrieval system to below $100,000,
which is considerably less than the cost of running information
retrieval on a mainframe. The same technology can be extended to
CD-ROM if CD-ROM producers accept networking standards. Though
many vendors are still reluctant to support standards, and
licensing restrictions limit networking, we expect to integrate
some of our CD-ROM databases into campus networking by the end of
1991. Future work also includes the development of a simple user
interface for other personal computers, and a method of
statistically monitoring usage.
3.1 User Interface Design
A quality user interface is critical to the success of LIS II.
Quality storage, indexing, and retrieval will only enable users
to access the breadth and depth of our electronic collection if
the user interface supports the tasks they want to do. This
phase of LIS II development focuses on building a single
workstation interface following OSF's Motif Style Guide.
Developing a graphical interface for workstations using the Motif
toolkit enables us to overcome some of the problems in interface
design encountered with LIS I. For example, users sometimes lost
their context when they were working with the VT100 display of
LIS I--the only interface available, which responded to each user
action by displaying a panel that replaced the panel that
prompted the action. Motif offers multiple windows, one for each
conceptual task, enabling users to keep their context and build a
better conceptual model of information search and retrieval
online.
+ Page 19 +
3.1.1 User Studies
In conjunction with implementing a dynamic user interface in
Motif, we have analyzed transaction logs, done protocol studies,
and conducted lengthy interviews in a wide range of research
areas to understand the human factors involved in online
information retrieval. The remainder of this section discusses
several of these projects, specifically the requirements for
journal information, the sequence in which information fields are
displayed, the problem of library jargon, and search complexity
and performance. Plans for future user studies are included in
Section 4, "Research and Development Agenda."
3.1.1.1 Requirements for Journal Information
We have spent considerable time exploring the special
requirements of journal and conference information. Journals and
conference proceedings often have long titles that, when
truncated for the one-line-per-record display, become
meaningless, e.g., "International Journal of." Furthermore,
journal titles often change over time as the journal is re-named
to better identify its contents in a changing discipline or to
reflect a merger with another publication; these name changes
create considerable problems for users and are difficult to track
in systems without cross referencing and linked records.
Indications of journal holdings are also problematic because
subscriptions are sometimes intermittent, issues are sometimes
missing, and information about the most recent issue is often not
entered into the system in a timely way. Additionally, since our
journals are for the most part shelved alphabetically by main
entry (which is not necessarily the same as the title) rather
than by assigned call number, users often have trouble locating
the journal even when they know we have it in our collection. To
complicate matters still further, LIS I transaction logs indicate
that users clearly want to search the contents of journals for
author, title, and subject information, not just search a
database of journal records to see if we have a journal.
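
The truncation problem mentioned above is easy to reproduce. The
sketch below cuts an invented journal title to a fixed width and
contrasts that with eliding the middle of the title instead. The
second function is only one possible alternative, offered as an
assumption; it is not a design described in this report.

```python
def truncate_left(title, width):
    """Keep only the first `width` characters of the title."""
    return title[:width]


def truncate_middle(title, width, marker="..."):
    """Keep the start and the end of the title, eliding the middle."""
    if width <= len(marker):
        raise ValueError("width must be greater than the marker length")
    if len(title) <= width:
        return title
    keep = width - len(marker)
    head = keep // 2
    tail = keep - head  # always at least 1 because keep is at least 1
    return title[:head] + marker + title[-tail:]


# Invented title used only for illustration.
title = "International Journal of Example Studies"
print(truncate_left(title, 24))    # International Journal of
print(truncate_middle(title, 24))
```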
+ Page 20 +
The results of our research on journals to date indicate that LIS
II should provide the following:
o a one-line-per-record display that includes meaningful
(usable) information
o a brief record display that includes variant journal titles
and Carnegie Mellon holdings
o a full record display
o an item- or issue-level display that includes real-time
updates of latest issues
o a table of contents display accessible from the issue-level
display
o a simple way to track journal title changes
o a display for browsing variations of journal titles
o links between records in other databases (e.g., INSPEC) and
associated journal records
o a simple way to request a photocopy or FAX or to submit an
interlibrary loan request
3.1.1.2 Sequence of Displayed Fields
Since many database records are several screens long (in LIS I)
and research indicates that users often do not display more than
the first screen, the sequence in which information fields are
displayed is very important to user satisfaction. Traditionally,
our catalog records have displayed information in a sequence
suitable for librarians or system designers, but not necessarily
suitable for patrons of the electronic library. For example,
esoteric information fields like 008, CODES, ACQNUM, and DOCNUM
are displayed at the top of the record, while the information
fields that users typically consult are displayed farther down the
record, often interspersed with more esoteric or less important
fields, e.g., LC-CARD and LANGUAGE (usually English). This
sequence results in users having to scan the full records for
relevant information, which may be displayed on subsequent
screens. Our goal is to reorganize the sequence of fields so
that those typically used by library patrons are at the top of
the record and thus appear on the first screen when the full
record is displayed.
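The reordering described above can be sketched as a stable sort
against a patron-priority list. This is an illustration only: the
tags TITLE, AUTHOR, SUBJECT, and CALL-NUMBER, their order, and the
sample values are assumptions, not the LIS II design. The remaining
tags are those named in the text.

```python
# Hypothetical patron-first ordering; this priority list is an assumption.
PATRON_PRIORITY = ["TITLE", "AUTHOR", "SUBJECT", "CALL-NUMBER"]

def reorder_fields(record):
    """Return (tag, value) pairs with patron-oriented fields first.

    Fields not in PATRON_PRIORITY keep their original relative order
    (Python's sort is stable) and follow the prioritized fields.
    """
    rank = {tag: i for i, tag in enumerate(PATRON_PRIORITY)}
    return sorted(record, key=lambda field: rank.get(field[0], len(rank)))

# Made-up sample record for illustration only.
record = [
    ("008", "900101s1990"),
    ("CODES", "am"),
    ("ACQNUM", "A-1234"),
    ("AUTHOR", "Example, Author"),
    ("LC-CARD", "90-00000"),
    ("TITLE", "An Example Title"),
    ("LANGUAGE", "English"),
    ("SUBJECT", "Information retrieval"),
]

for tag, value in reorder_fields(record):
    print(f"{tag:<12}{value}")
```

Because the sort is stable, the less important fields are pushed down
the record without being dropped or shuffled among themselves.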
+ Page 21 +
3.1.1.3 Library Jargon
Another study examined jargon in library handouts and reference
interviews (in preparation for online searching). The results of
the study reveal that patrons misunderstand library terms
approximately half of the time. The implications for LIS II are
far reaching, not only in terms of the language to be used in the
online help and on the buttons and menus, but in terms of what
tags or labels to attach to the different information fields in
the records themselves. For example, in a multiple choice test,
only 35 out of 100 test subjects (CMU freshmen) selected the
correct definition for the term "citation"; most subjects drew on
their knowledge of parking or speeding violations and defined
"citation" in the library context as a notice of overdue books.
At present, "citation" is a tag we use to identify a field in our
Library Catalog records; obviously this tag does not communicate
effectively to everyone.
3.1.1.4 Search Complexity and Performance
Using transaction logs for INSPEC, we created a model of user
searches to use as a base line for preparing LIS II. We examined
the logs from one of the busiest afternoons of the academic year
to determine the following:
o the number of searches issued per minute
o the number of users on the system simultaneously
o the complexity of user searches, defined as a function of the
number of terms per search; the use of Boolean and proximity
operators, field restrictors and truncation; and instances of
browsing or scanning the index
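The first two measurements can be derived from a transaction log in a
few lines. The sketch below assumes a hypothetical log of (user id,
seconds since start) pairs and treats "simultaneous users" as the
number of distinct users active in the same minute. Neither the real
INSPEC log format nor the definition of simultaneity is given in this
article, so both are assumptions.

```python
from collections import defaultdict

def load_profile(log):
    """Summarize a search log given as (user_id, seconds_since_start) pairs.

    Returns (peak searches per minute, peak distinct users per minute).
    Counting distinct users per minute is an assumed stand-in for
    "simultaneous users"; the source does not define the term.
    """
    searches = defaultdict(int)
    users = defaultdict(set)
    for user_id, seconds in log:
        minute = int(seconds // 60)
        searches[minute] += 1
        users[minute].add(user_id)
    peak_searches = max(searches.values(), default=0)
    peak_users = max((len(u) for u in users.values()), default=0)
    return peak_searches, peak_users

# Made-up sample entries for illustration only.
sample_log = [("u1", 5), ("u2", 20), ("u1", 50), ("u3", 70), ("u3", 110)]
print(load_profile(sample_log))
```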
LIS II will handle 25 simultaneous users generating searches at
the same rate and complexity found in LIS I. The goal is to
provide performance that exceeds current LIS I performance on 70%
of real searches. Users entering searches that exceed
performance guidelines by 50% will be given a resource control
option to cancel or proceed; if the search exceeds the guidelines
by 100%, the user will be given an option to cancel or browse the
index to narrow the search. Resource control is discussed in
Section 3.3 "Distributed Retrieval Architecture."
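The two thresholds above can be expressed as a small decision
function. This is a minimal sketch: the function name, the unit of
cost, and the treatment of the boundary values as inclusive are
assumptions. Only the 50% and 100% figures and the options offered
come from the text.

```python
def resource_control_action(estimated_cost, guideline):
    """Choose the options offered to the user for a search.

    Thresholds follow the text: at least 50% over the guideline offers
    cancel or proceed; at least 100% over offers cancel or browse.
    Treating the boundaries as inclusive is an assumption.
    """
    if guideline <= 0:
        raise ValueError("guideline must be positive")
    overrun = (estimated_cost - guideline) / guideline
    if overrun >= 1.0:
        return ("cancel", "browse index")
    if overrun >= 0.5:
        return ("cancel", "proceed")
    return ("proceed",)

# Hypothetical costs against a guideline of 10 units.
for cost in (10, 15, 20):
    print(cost, resource_control_action(cost, 10))
```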
+ Page 22 +
3.2 Database and Document Types
One of the goals of LIS II is the delivery of complex documents
over the network. The current implementation of LIS II supports
only ASCII text, both the full text of documents and structured
information such as bibliographic records. Over the next few
years, the formats and sources of data available in LIS II will
increase. The focus of our research in this area is on
image databases, full-text databases, and personal databases.
3.2.1 Image Databases
Our first priority is to extend the architecture of our entire
computing environment so that it supports bitmapped images as
well as ASCII text. Information from paper sources, such as
journal articles, will be made available in bitmap format (see
Section 2.1.2 "Machine-Readable Text"). We will use CCITT Fax
Group 4 format to store the compressed images and will provide
software decompression and display tools on the individual
workstation. This area calls for a wide range of research on
storing and displaying images with high resolution, gray scales,
and color. Reasonable display performance of bitmaps depends on
the speed of the decompression algorithm, the caching of data,
and the ability of the decompression algorithm to work ahead of
the user interface. The retrieval of bitmap data has
implications for the retrieval protocol and requires changes in
Z39.50. For example, the application level flow control in
Z39.50 is record oriented, but the size of records containing
bitmapped images may exceed 50 KB, making it necessary either to
retrieve partial records or to retrieve bitmapped images from a
secondary server. The format of the data likewise requires
special handling by the user interface.
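Retrieving a partial record amounts to delivering a large record in
bounded segments. The sketch below shows only that idea. It is not
Z39.50 syntax, and using the 50 KB figure above as the segment limit
is an assumption.

```python
SEGMENT_LIMIT = 50 * 1024  # assumed limit, taken from the 50 KB figure above

def segments(record, limit=SEGMENT_LIMIT):
    """Yield successive slices of a record, each at most `limit` bytes."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for offset in range(0, len(record), limit):
        yield record[offset:offset + limit]

page_image = bytes(120 * 1024)   # stand-in for a compressed page image
parts = list(segments(page_image))
print([len(p) for p in parts])
```

A client that receives the segments in order can rebuild the image by
concatenating them, or begin decompressing before the last segment
arrives.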
+ Page 23 +
3.2.2 Full-Text Databases
In the future, full-text databases with very different indexing
schemes from bibliographic databases will be added to LIS II. As
electronic publishers for Carnegie Mellon, the University
Libraries intend to provide online full-text databases of the
following campus information:
o software licensing and availability information
o career resources information
o Carnegie Mellon policies and procedures manual
o the undergraduate catalog
o Macintosh and Andrew system user help files
o faculty and staff publications and research profiles
o indexes to student and faculty newspapers--perhaps with
full text
Additional full-text databases will include research materials as
well as standard office reference materials, such as phone books,
encyclopedias and dictionaries. Because unpublished working
papers and postings on bulletin boards are of vital importance in
some disciplines, e.g., computer science, LIS II will merge
published and unpublished information. We will provide indexed
access to Carnegie Mellon working papers, and make use of work
that is being carried out on automatic indexing of Arpanet
bulletin boards, so that selected bulletin board postings can
also be added to the retrieval servers.
+ Page 24 +
3.2.3 Personal Databases
The original conception of a library information system was to
bring a search index to a user as a single isolated tool. Our
investigations and interviews with users led to a new conception
based on the knowledge that documents are rarely used alone. The
new understanding is that retrieval technology is an adjunct to
desktop management; therefore, a library information system must
be integrated into the larger work environment. There is a
growing tendency among users to want to leave the library
connection active all day rather than log in and out of the
application repeatedly; this trend will have a significant impact
on established system designs, which commit actual hardware to
each connection. With this in mind, we intend to use emerging
standards to link LIS II documents to word processors, databases,
electronic mail, and similar applications. We will provide
toolkits for individual users to make databases available through
LIS II. Using the toolkits, personal database creators will be
able to access their databases through LIS II, or provide their
colleagues with access to their databases through LIS II.
The next challenge in handling document types is storing the
source of a document, e.g., author-contributed text in machine-
readable form. We are acquiring source documents for future
research and development. We will use SGML and CDA to describe
the intellectual structure and content of the document and to
guide the format of the display. An example of the problems to
be solved in this area is the relationship between spreadsheets
and tables for display and page layout. A major area for future
research, but beyond the current plans for LIS II, is the
handling of dynamic documents. PostScript is another format for
non-revisable documents, and we are planning support for it.
+ Page 25 +
3.3 Distributed Retrieval Architecture
The distributed architecture of LIS II requires a range of
support services. The first is a mechanism to identify and
describe databases on the network. Our total Database
Information Service requires a number of features in addition to
those traditionally provided. The long term goal of the Database
Information Service is for users to be able to find information
without knowing which database to search. In conjunction with
this service, the system requires authentication, access control,
and resource control. We need a password and security
(authentication) system for multiple reasons. The primary reason
is to control access to licensed databases, but we must also
limit access to sensitive data within databases, for example, to
social security numbers in the Who's Who at CMU database.
Additionally, authentication of individual users enables us to
collect meaningful statistics about the behavior of different
classes of users, e.g., in different disciplines. The Kerberos
authentication scheme is used as the basis for this service.
Resource control is the final major service required by LIS II.
A distributed architecture designed to be used across
institutions must include a mechanism for limiting the amount of
resources that can be consumed by a remote user. This protects
against abuse, makes it possible to provide subscription services
for licensed databases, and protects users from potentially
costly mistakes by notifying them of expensive requests.
3.4 Developing Standards and Sharing Resources
We expect that LIS II, as an affordable platform for sharing
library information, will be expanded in the future. To this end,
we are working with other groups to develop standards that all
libraries can use. This section briefly discusses several
projects in this area. See also the earlier discussion of
sharing enhanced catalog records in Section 2.2.1.3.
+ Page 26 +
Members of the LIS II development team participate in the Z39.50
Implementors Group. We are lobbying for extensions to the
protocol based on our work with LIS II, where we found it
necessary to extend the protocol by devising local conventions:
o for representing Boolean queries
o for using Z39.50 element set names to provide alternate views
of retrieved records
o for sorting retrieved records on both the retrieval server and
the user's workstation
o for browsing indexes
Further extensions to the protocol may also be necessary, e.g.,
to handle retrieving image data.
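One common way to represent a Boolean query for transmission is as an
expression tree flattened to postfix order. The sketch below
illustrates that general technique only. It is not the local
convention devised for LIS II, which this article does not describe,
and the class names and the "field=value" notation are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Term:
    field: str
    value: str

@dataclass
class Op:
    name: str          # "AND", "OR", or "NOT" (binary and-not assumed)
    left: "Query"
    right: "Query"

Query = Union[Term, Op]

def to_postfix(query):
    """Flatten a query tree to a list in postfix (operands first) order."""
    if isinstance(query, Term):
        return [f"{query.field}={query.value}"]
    return to_postfix(query.left) + to_postfix(query.right) + [query.name]

# (title=retrieval AND subject=libraries) OR author=example
query = Op("OR",
           Op("AND", Term("title", "retrieval"), Term("subject", "libraries")),
           Term("author", "example"))
print(to_postfix(query))
```

Postfix order needs no parentheses, so a server can evaluate the list
with a simple stack.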
Two other projects for testing shared resources are in the
planning stages. The first project, with MIT, Stanford
University, the University of Illinois, and the University of
California, is to build a distributed collection of computer
science technical reports and working papers; the result will be
a full-text database with the items held at separate locations
but with a shared index. Searchable bibliographic records with
abstracts will be provided at each site, with the full text
stored as page images in an image database at the home site. The
second project, with the University of California and
Pennsylvania State University, will test extensions of the Z39.50
protocol by sharing library catalog records; this project is
sponsored by Digital Equipment Corporation.
Additionally, we are working with Andrew system administrators to
implement standards for Motif applications and window management
at Carnegie Mellon. This involves collaboration on user testing
and document preparation so that interactions and terminology are
identical across applications.
4.0 Research and Development Agenda
In conclusion, our LIS II plans for the next year include work in
development, implementation, and research. Each of these is
discussed briefly below, with the items in each section listed in
order of priority.
+ Page 27 +
4.1 Development Plans
o Test the graphical user interface--the number, placement and
design of the windows; the text of error messages, buttons,
menus, online help; the interactions between searching and
browsing; the number and type of indexes to provide for each
database; the information to include in the one-line-per-record
displays; and the sequence of displayed fields in database
records. Several research methods will be used, including
protocol analysis, structured interviews, and user
questionnaires. The results of these studies will affect the
design of the user interface.
o Build a terminal interface for personal computers like the IBM
PC and Apple Macintosh. Because of the popularity of the
Macintosh at Carnegie Mellon, long-term plans include building a
Macintosh interface to LIS II.
o Instrument the system to monitor user behavior based on a
profile of significant characteristics--like college, department
and status (e.g., Fine Arts, Drama, undergraduate); location
where search was issued (e.g., office, public cluster, or
library); database selection; search terms (including operators
and restrictors); browse terms; instances of opening and closing
windows; the number of short (one line per record) and full
records viewed, and the number and sequence of page images
viewed, etc.
o Handle complex documents--using SGML and CDA to describe their
form and content.
4.2 Implementation Plans
o Implement LIS II on distributed file servers and release to
campus.
o Provide training and documentation for library staff and
patrons--to facilitate the shift from a terminal emulation
interface (LIS I) to a workstation interface (LIS II).
o Broaden the range of bibliographic databases available in LIS
II.
+ Page 28 +
o Provide full-text databases--both searchable ASCII text of
campus information and reference works, as discussed in Section
3.2.2 "Full-Text Databases," and displayable page images, as
discussed in Section 3.2.1 "Image Databases." We are focusing on
image databases and will continue experiments with different
scanning, scaling, and compression-decompression algorithms.
4.3 Research Plans
o Evaluate the effects of catalog enhancements on recall and
precision--preliminary results from a pilot study of the current
system (LIS I), planned for Fall 1990, will be used to design a
more rigorous evaluation of catalog enhancements in the new
system (LIS II). We want to assess the number of additional
access points made available in the enhancements, the effects on
retrieval, the effects on relevance judgments, the impact on the
size of the catalog, and the cost per enhancement. The results
of this evaluation should facilitate sharing enhanced catalog
records.
o Evaluate and document the transition from LIS I to LIS II.
o Evaluate user behavior and preferences with LIS II--how skills
develop over time; how acceptance is influenced by user
characteristics, such as social group (student, faculty, staff,
alumni) and discipline (engineering vs. social sciences), and by
various features of the system itself, e.g., multiple windows,
databases, indexes. Results from studies of user characteristics
and skill levels will contribute to the ongoing design of the
system.
o Study how users use full-text databases--for example, given
page images of journal articles or technical reports, do users
read the pages sequentially or skip around in the text? This
study will entail instrumenting the system to monitor user
behavior and running user protocols to better understand why
users do what they do. The results of the study will help us
develop suitable navigational tools and caching procedures for
full-text databases.
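For the first item above, recall and precision have standard
definitions: precision is the share of retrieved records that are
relevant, and recall is the share of relevant records that are
retrieved. A minimal sketch follows. The record identifiers are made
up, and returning 0.0 for an empty set is an assumed convention.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of record ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Made-up example: 4 records retrieved, 3 relevant, 2 in common.
p, r = precision_recall({"r1", "r2", "r3", "r4"}, {"r2", "r4", "r7"})
print(f"precision={p:.2f} recall={r:.2f}")
```

Comparing these two figures for the same searches with and without
the enhanced records would show the effect on retrieval.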
+ Page 29 +
References
Van Orden, Richard. "Content Enriched Access to Electronic
Information: Summaries of Selected Research," Library Hi Tech 8,
No. 3 (1990): 28.
About the Author
Denise Troll
Carnegie Mellon University Libraries
Frew Street
Pittsburgh, PA 15213.
BITNET: troll+@andrew.cmu.edu
-----------------------------------------------------------------
The Public-Access Computer Systems Review is an electronic
journal. It is sent free of charge to participants of the
Public-Access Computer Systems Forum (PACS-L), a computer
conference on BITNET. To join PACS-L, send an electronic mail
message to LISTSERV@UHUPVM1 that says: SUBSCRIBE PACS-L First
Name Last Name.
This article is Copyright (C) 1990 by Carnegie Mellon University.
All Rights Reserved.
The Public-Access Computer Systems Review is Copyright (C) 1990
by the University Libraries, University of Houston. All Rights
Reserved.
Copying is permitted for noncommercial use by computer
conferences, individual scholars, and libraries. Libraries are
authorized to add the journal to their collection, in electronic
or printed form, at no charge. This message must appear on all
copied material. All commercial use requires permission.
----------------------------------------------------------------