Access Techniques for Document Image Databases.
In the most general sense, “access” evokes the paradigm of a seeker of information asking a question of a machine, which searches for and retrieves an answer. In more practical terms, this means accessing a bibliographic database by entering a query made up of key words or phrases, either free text or terms from a controlled vocabulary, and receiving citations to the literature. In a database consisting of images, such as bitmapped digital images of documents stored on high-density media like optical disc, automated access may be achieved in several ways. One way is for the user first to search a bibliographic database, after which the system retrieves citations and links them to the corresponding document images on optical disc. Another way is to browse a list of stored document titles, select one, and continue the search through a second list at a lower level (e.g., a table of contents in a monograph or a list of articles in a journal issue); on making a selection from this second list, the user is presented with the document image retrieved from electronic storage. A third way is to perform a “full-text” search of the machine-readable areas of the stored documents and then have the system retrieve and integrate the text and graphic regions to form composite images that appear similar to the original paper documents. This article describes the access and retrieval techniques implemented as part of a research and development program in electronic imaging (EI) applied to document storage and retrieval at the National Library of Medicine (NLM).
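The three access paths described above can be illustrated with a minimal sketch. It is not the NLM system's implementation: the `Document` record, its fields, and the three function names are hypothetical, and plain in-memory lists stand in for the bibliographic database and the optical disc store.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical record linking bibliographic data to a stored image."""
    doc_id: str
    title: str
    keywords: set
    text: str          # machine-readable text regions (assumed available)
    image_path: str    # assumed location of the bitmapped image
    children: list = field(default_factory=list)  # lower-level list, e.g. articles in an issue


def search_citations(docs, query_terms):
    """Path 1: match query terms against keywords, return linked image paths."""
    terms = {t.lower() for t in query_terms}
    return [d.image_path for d in docs
            if terms & {k.lower() for k in d.keywords}]


def browse(top_level, title_choice, child_choice):
    """Path 2: pick a title, then an entry from its lower-level list."""
    for parent in top_level:
        if parent.title == title_choice:
            for child in parent.children:
                if child.title == child_choice:
                    return child.image_path
    return None  # no matching selection


def full_text_search(docs, phrase):
    """Path 3: case-insensitive phrase match on the machine-readable text."""
    needle = phrase.lower()
    return [d.image_path for d in docs if needle in d.text.lower()]
```

In this sketch each path ends with an image location rather than the image itself; composing text and graphic regions into a page image, as the third path requires, is left out.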