David C. Martin, Cheong S. Ang, Marc E. Solomon, and
Michael D. Doyle, Ph.D.
Library and Center for Knowledge Management
University of California, San Francisco
As one of the nation's leading medical libraries and a pioneer in digital information dissemination, the Library and Center for Knowledge Management (CKM) at the University of California, San Francisco (UCSF), has collaborated with industry partners and developed prototype systems to begin realizing a knowledge management environment (KME): an environment that exploits the power of distributed hypermedia documents and distributed computational processing, integrating access to the entire universe of digital information. Two specific research efforts are underway: the Red Sage Project and GALEN II prototyping.
In 1992, three partners agreed to proceed with an electronic journal project at the Red Sage restaurant in Washington, D.C. (hence the project name): AT&T providing the software, Springer-Verlag the content, and CKM the test-bed. The primary goal of the Red Sage Project is to explore the technical, legal, economic, business, and social issues surrounding the delivery of scientific, technical, and medical information in a network environment. To achieve this goal, a critical mass of content is required to make the system as appealing as possible to the target user community: the faculty, staff and students of UCSF and its affiliated institutions. The content database has been expanded to include additional publishers, among them John Wiley and Sons, the Massachusetts Medical Society, and the American Medical Association, but is limited to three categories: radiology, molecular biology, and general high-impact journals (including the New England Journal of Medicine, JAMA, and Lancet). These journals provide content of interest to the researcher, clinician and general reader. The database consists of over sixty titles from more than twenty publishers, and content will generally be available one to eight weeks before the printed equivalents arrive, which is critical when key information may drastically influence experimental and clinical programs.
The Red Sage Project has been assembled from a variety of sources. The RightPages software from AT&T, developed at Bell Laboratories, provided the initial technology: a client/server model with a central server repository. The client software is a graphical user-interface (GUI) application for UNIX, Macintosh and DOS computers, running under the Motif(TM), Macintosh Finder and Microsoft Windows GUI environments respectively. The RightPages server delivers journal page images at the request of the client, manages content receipt, maintains user profiles to determine applicable content, and alerts users via electronic mail. Journal content is accessed via a proprietary connection-based protocol that provides mechanisms to authenticate users, navigate the content hierarchy, query the text database, and select pages for viewing and printing.
A central super-minicomputer provides the data store for the journal content, computational support for the server processes that build, maintain and search content on behalf of client requests, and host processing for a number of directly supported access terminals. The current configuration is a four-processor UNIX computer with forty-eight (48) G-bytes of magnetic disk and a 340 x 1.3 G-byte magneto-optical (MO) jukebox. A hierarchical file system (HFS) keeps the most recently used content on the fastest media.
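The idea of keeping the most recently used content on the fastest media can be illustrated with a minimal sketch. The class name TieredStore, the two-tier layout, and the promote-on-access policy are assumptions made for illustration only; the actual behavior of the HFS is not described here.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a small fast tier (standing in for magnetic disk)
    in front of a large slow tier (standing in for the MO jukebox).
    Hypothetical illustration, not the HFS used by the project."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # ordered from least to most recently used
        self.slow = {}

    def put(self, key, data):
        # Assumption: new content lands on the slow tier and is promoted on access.
        self.slow[key] = data

    def get(self, key):
        """Return (data, tier_served_from) and update recency."""
        if key in self.fast:
            self.fast.move_to_end(key)  # mark as most recently used
            return self.fast[key], "fast"
        data = self.slow.pop(key)  # raises KeyError for unknown content
        self.fast[key] = data  # promote to the fast tier
        if len(self.fast) > self.fast_capacity:
            # Demote the least recently used item back to the slow tier.
            old_key, old_data = self.fast.popitem(last=False)
            self.slow[old_key] = old_data
        return data, "slow"
```

With a fast tier that holds two items, a third distinct request pushes the least recently used item back to the slow tier, so a later request for it is served from slow media again.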
The main features of the system provide the user with a traditional library metaphor. Scanned cover icons guide the user from titles to volumes to issues and finally to articles. Articles are selected from the table of contents page via geographically defined active regions around each article listing. Users page through each article, scrolling vertically and horizontally as necessary to view the entire page. Pages are available in 72, 100 and 300 dots-per-inch versions, allowing users to choose their reading comfort level, albeit with some increase in network transmission time at the higher resolutions. Users may also search for specific articles, specifying title and date restrictions in addition to general Boolean full-text queries. This same searching functionality may be used for alerting as well, allowing the system to identify articles of interest from newly arrived content. Finally, users may optionally specify a list of subscribed titles, requesting notification of new issues. Users are alerted to new content, both subscriptions and alerts, via electronic mail.
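The reuse of search for alerting can be sketched as a saved query that is run against newly arrived articles. Everything named here (Article, Query, alerts_for, and the field names) is hypothetical and chosen for illustration; the RightPages query syntax and profile format are not described in this paper.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    title: str
    published: date
    text: str


@dataclass
class Query:
    """A Boolean full-text query with optional title and date restrictions."""
    all_terms: tuple = ()    # every term must appear (AND)
    any_terms: tuple = ()    # at least one term must appear (OR), if given
    none_terms: tuple = ()   # no term may appear (NOT)
    title_contains: str = ""
    since: date = date.min

    def matches(self, article):
        # Naive tokenization on whitespace; a real system would use a text index.
        words = set(article.text.lower().split())
        if any(term not in words for term in self.all_terms):
            return False
        if self.any_terms and not any(term in words for term in self.any_terms):
            return False
        if any(term in words for term in self.none_terms):
            return False
        if self.title_contains.lower() not in article.title.lower():
            return False
        return article.published >= self.since


def alerts_for(new_articles, saved_queries):
    """Map each user to the titles of newly arrived articles matching
    that user's saved query. Users with no matches are omitted."""
    alerts = {}
    for user, query in saved_queries.items():
        hits = [a.title for a in new_articles if query.matches(a)]
        if hits:
            alerts[user] = hits
    return alerts
```

The same Query object serves interactive search and alerting; only the set of articles it is applied to differs. Subscription notices and mail delivery are left out of the sketch.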
The Library and Center for Knowledge Management is evaluating the RightPages system and the user community's reaction. Statistics are being accumulated on usage and use patterns: e.g. content alerts, search results, articles requested, subscription lists, pages viewed and pages printed. In addition, software engineers within the Center for Knowledge Management are developing prototype client applications for both the Red Sage content and the burgeoning data available on the Internet.
The Library and Center have worked in concert to develop alternatives to existing tools and to design innovative new tools for network information discovery and retrieval. The Center has used wide-area information servers (WAIS), the World-Wide Web (WWW), and other freely available software to develop several interfaces, including an alternative to the RightPages client used in the Red Sage Project. These prototype systems are being developed as demonstrations of the capabilities available to the campus faculty, staff and students, and will be available during the planning phase of the GALEN II Project, which involves representatives from throughout the campus.
The prototypes under active development at this time include an alternative RightPages client, a distributed visualization server (VIS), a Brookhaven Protein Data Bank (PDB) database with integrated three-dimensional imaging, and a graphical interface to OPACs. The prototype systems make use of WAIS to search full text, POSTGRES to maintain bibliographic information, and the World-Wide Web and Mosaic to provide the integration and user interface. The systems are searchable by title, author, abstract and medical subject headings (MeSH), the standard medical terminology defined by the National Library of Medicine (NLM).
The Center is also exploring the technical feasibility of using Mosaic and WWW to integrate access to the published journal literature from the Red Sage Project with access to the developing monolithic datasets of biomedically relevant structure information. Such datasets share the common characteristic of requiring sophisticated visualization tools for browsing, searching and analysis. The Visible Human Project, the Visible Embryo Project, the Brain Mapping Project, the Human Genome Project and other massive research efforts are producing increasingly monolithic databases that defy routine data exploration and interpretation.
The task of integrating access to such massive information and computational resources is nontrivial. One embryo, of the more than 650 serially sectioned specimens in the Visible Embryo Project, yields as much as a terabyte of anatomical volume data (n.b. a 20 mm specimen, sectioned at 5 microns and digitized at 8000x8000 pixels/section and 36 RGB bits/pixel, yields 1.073 T-bytes). Clearly no single workstation or supercomputer can manipulate, process and analyze such a dataset as a single unit, much less perform computational operations on a database of embryos. The Center has developed tools that allow integrated, ubiquitous access (through NCSA's Mosaic) to network visualization servers that can distribute the computational load across a loosely coupled network of both general-purpose and specialized graphics workstations and supercomputers. As a result, text, images, audio, video, and real-time interactive visualizations may be embedded within hyperdocuments located anywhere on the network. We are also exploring the use of this technology for delivering interactive virtual reality (VR) across the Internet. The success of these efforts will allow widespread access to highly accurate and complex simulations of human development, non-invasive surgery, and other VR applications via inexpensive workstations and personal computers.
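The per-embryo storage estimate can be reproduced with a few lines of arithmetic. The sketch below uses only the parameters quoted above; the variable names are illustrative. The reading that the quoted 1.073 figure comes from dividing by 2^30 bytes and then by 1000 is an inference from the numbers, not something the text states.

```python
# Parameters quoted in the text for one Visible Embryo Project specimen.
specimen_length_um = 20_000   # 20 mm expressed in microns
section_thickness_um = 5
pixels_per_section = 8000 * 8000
bits_per_pixel = 36           # RGB, 12 bits per channel

sections = specimen_length_um // section_thickness_um
total_bits = sections * pixels_per_section * bits_per_pixel
total_bytes = total_bits // 8

print(sections)                               # number of serial sections
print(total_bytes)                            # raw size in bytes
print(round(total_bytes / 2**30 / 1000, 3))   # G-bytes (2**30) divided by 1000
print(round(total_bytes / 2**40, 3))          # binary terabytes (2**40)
print(round(total_bytes / 10**12, 3))         # decimal terabytes (10**12)
```

The specimen comes to 4000 sections and 1,152,000,000,000 bytes. That is 1.073 when expressed as thousands of 2^30-byte G-bytes, about 1.048 in units of 2^40 bytes, and 1.152 in units of 10^12 bytes, so the order of magnitude (roughly a terabyte per embryo) holds under any of the three conventions.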