AutoNote2: Network Mediated Natural Language Communication In A Personal Information Retrieval System

Natural language combines nouns and adjectives into noun phrases, and links phrases by means of prepositions to form complex descriptions of objects and topics. AUTONOTE2, a file-oriented retrieval system, allows the user to employ such descriptions to characterize the items of information he wishes to store and retrieve. In addition, the system also constructs a network representation of the user's subject matter, using syntactic analysis to derive dependency structures from the descriptions. The dependency information, expressed as subordinate and coordinate linkages among the phrases, is represented by a tree of nodes, with simple phrases at the terminal branches. The PARSER uses the network to disambiguate descriptions, querying the user only about residual ambiguities. Associated with the PARSER is a network LOCATOR, which determines whether a current user description refers to an existing topic at some level in the network. The LOCATOR also builds a table specifying the changes, if any, to be made in a network in order to represent the topic inferred from the current input description. For example, if the user's description contains one or more simple phrases (thereafter referred to as active) directly describing at least one existing node in the network, the description as a whole quite likely references an existing network topic. To locate it, the PARSER first determines the focus phrase, the active phrase at the highest dependency level. The nodes directly described by the focus phrase are used to generate candidate topics. These then are matched against the remaining active phrases obtained from the description to determine the most likely referent. Many of the procedures employed in description and representation also are used in network-mediated retrieval. The user may initiate retrieval with a FIND command, supplying a description as argument.
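The focus-phrase matching described above, which retrieval also relies on, can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the `Node` class, the `locate` function, the phrase-table format (phrase, dependency level, with 0 as the highest level), and the scoring rule (count the remaining active phrases found in a candidate's subtree) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    number: int
    phrases: set                      # simple phrases directly describing this node
    children: list = field(default_factory=list)

    def subtree(self):
        """Yield this node and every node below it."""
        yield self
        for child in self.children:
            yield from child.subtree()


def locate(roots, phrase_table):
    """Return the number of the most likely referent node, or None.

    phrase_table is a list of (simple_phrase, dependency_level) pairs;
    level 0 is assumed to be the highest dependency level.
    """
    all_nodes = [n for root in roots for n in root.subtree()]

    # Active phrases: those directly describing at least one existing node.
    active = {}
    for phrase, level in phrase_table:
        hits = [n for n in all_nodes if phrase in n.phrases]
        if hits:
            active[phrase] = (level, hits)
    if not active:
        return None                   # nothing in the network matches

    # Focus phrase: the active phrase at the highest dependency level.
    focus = min(active, key=lambda p: active[p][0])
    candidates = active[focus][1]
    remaining = [p for p in active if p != focus]

    # Assumed scoring: remaining active phrases found at or below a candidate.
    def score(node):
        below = set().union(*(n.phrases for n in node.subtree()))
        return sum(p in below for p in remaining)

    return max(candidates, key=score).number


# Hypothetical network with two EXPERIMENT topics.
roots = [
    Node(1, {"EXPERIMENT"}, [
        Node(2, {"SMITH"}),
        Node(3, {"SHORT TERM MEMORY"}, [Node(4, {"WHITE RATS"})]),
    ]),
    Node(5, {"EXPERIMENT"}, [
        Node(6, {"JONES"}),
        Node(7, {"VISUAL PERCEPTION"}),
    ]),
]

description = [("RESULTS", 0), ("EXPERIMENT", 1), ("WHITE RATS", 2)]
print(locate(roots, description))     # prints 1
```

In this sketch RESULTS is not active because no node carries it, so EXPERIMENT becomes the focus phrase, and WHITE RATS decides between the two candidate topics.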
The resultant phrase table is passed along to the network LOCATOR, which returns a node number to the FIND processor. The FIND processor constructs a set of item numbers by extracting the textual references from the node. The system then checks for upward pointers from the node. If there are structurally related topics, the FIND processor so informs the user. Note that by virtue of network mediation of retrieval, if a user description is imprecise or incorrect, the system may be able to direct the user to relevant related topics. When the system queries the user about a topic, for example to determine the intent of a description, the topic node number is passed to a SPEAKER component. A phrasal description of the node is returned. To minimize redundant communication, a level indicator may be set according to the level of detail in the user's description. For example, if the user describes an item as RESULTS OF THE EXPERIMENT and the system must ask if he is referring to SMITH'S EXPERIMENT ON SHORT TERM MEMORY OF WHITE RATS, the resulting query would be: ARE YOU REFERRING TO SMITH'S EXPERIMENT ON MEMORY? Construction of a description from the network takes place in two stages. The first stage steps through the network recursively, collecting the simple phrases that directly or indirectly describe the specified node. The level indicator blocks collection of simple phrases below the specified level. The second stage is carried out by a recursive algorithm that operates on the tabled simple phrases and their interrelations to construct the phrasal description. The last major component of the system handles network modification and reorganization. This enables the user to add or remove references and phrases, and to modify, delete, or reorganize his topic structure.
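The two-stage construction of a description can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Topic` class, the `collect`, `render`, and `speak` functions, the table layout, and in particular the level numbering (a head at depth d is at level d, and its modifiers at level d + 1) are invented here and are not taken from the system itself.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    head: str                                      # head word of the simple phrase
    modifiers: list = field(default_factory=list)  # words placed before the head
    links: list = field(default_factory=list)      # (preposition, Topic) pairs


def collect(topic, max_level, level=0, parent=None, prep=None, table=None):
    """Stage 1: walk the network recursively, tabling phrases within range."""
    if table is None:
        table = []
    if level > max_level:
        return table                  # the level indicator blocks deeper phrases
    row = len(table)
    # Assumption: modifiers sit one level below the head they modify.
    mods = topic.modifiers if level + 1 <= max_level else []
    table.append({"parent": parent, "prep": prep, "words": [*mods, topic.head]})
    for preposition, sub in topic.links:
        collect(sub, max_level, level + 1, row, preposition, table)
    return table


def render(table, row=0):
    """Stage 2: rebuild the phrasal description recursively from the table."""
    parts = list(table[row]["words"])
    for i, entry in enumerate(table):
        if entry["parent"] == row:
            parts += [entry["prep"], render(table, i)]
    return " ".join(parts)


def speak(topic, max_level):
    return render(collect(topic, max_level))


# Hypothetical encoding of the example topic.
rats = Topic("RATS", ["WHITE"])
memory = Topic("MEMORY", ["SHORT TERM"], [("OF", rats)])
experiment = Topic("EXPERIMENT", ["SMITH'S"], [("ON", memory)])

print(speak(experiment, 1))   # SMITH'S EXPERIMENT ON MEMORY
print(speak(experiment, 3))   # SMITH'S EXPERIMENT ON SHORT TERM MEMORY OF WHITE RATS
```

Keeping the collected phrases in a flat table with parent indices separates the network traversal from the string assembly, so the level cutoff only has to be applied in the first stage.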
A detailed case study comparing AUTONOTE2 with a good keyword-based retrieval system showed that for a coherent body of material, the communicative efficiency of AUTONOTE2, as measured by the ratio of the number of words conveyed to the number of words entered, was more than double that of the keyword-based system. Retrieval capability was enhanced considerably, and the representation network effectively distinguished among the many topics partially indexed by the same words. Furthermore, SPEAKER output of topics from the representational network proved a useful retrieval intermediary, greatly reducing the need for perusal of item texts.