Semiautomatic Disabbreviation of Technical Text

Abstract: Abbreviations adversely affect information retrieval and text comprehensibility. We describe a software tool that deciphers abbreviations by finding their whole-word equivalents, or “disabbreviations”. It uses a large English dictionary and a rule-based system to guess the most likely candidates, with users having final approval. The rule-based system draws on several kinds of knowledge to limit its search, including phonetics, known methods of constructing multiword abbreviations, and analogies to previous abbreviations. The tool is especially helpful for retrieval from computer programs, a form of technical text in which abbreviations are notoriously common; disabbreviating programs can make them more reusable, which benefits software engineering. It also helps decipher the often-specialized abbreviations in technical captions. Experimental results confirm that the prototype tool is easy to use, finds many correct disabbreviations, and improves text comprehensibility.
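The candidate-generation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the tool's actual rule system: it assumes a single illustrative rule (an abbreviation keeps the first letter of a word and drops some later letters, preserving order), a tiny stand-in word list, and a hypothetical `approve` callback that plays the role of the user's final approval. The function names and the length-based ranking are likewise assumptions made for illustration.

```python
from typing import Callable, List, Optional


def is_subsequence(abbrev: str, word: str) -> bool:
    """True if the letters of abbrev appear in word in the same order."""
    remaining = iter(word)
    # `ch in remaining` consumes the iterator, so order is enforced.
    return all(ch in remaining for ch in abbrev)


def candidates(abbrev: str, dictionary: List[str], limit: int = 5) -> List[str]:
    """Rank dictionary words that the abbreviation could stand for.

    Assumed rule: same first letter, word longer than the abbreviation,
    and the abbreviation's letters occur in order inside the word.
    Assumed ranking: shorter words first, then alphabetical.
    """
    abbrev = abbrev.lower()
    if not abbrev:
        return []
    matches = [
        w for w in dictionary
        if len(w) > len(abbrev) and w[0] == abbrev[0] and is_subsequence(abbrev, w)
    ]
    return sorted(matches, key=lambda w: (len(w), w))[:limit]


def disabbreviate(
    abbrev: str,
    dictionary: List[str],
    approve: Callable[[str, List[str]], Optional[str]],
) -> Optional[str]:
    """Propose candidates and let the user-supplied callback pick one."""
    ranked = candidates(abbrev, dictionary)
    choice = approve(abbrev, ranked)
    # Accept only a choice that was actually proposed; otherwise leave as-is.
    return choice if choice in ranked else None


# Stand-in word list (lowercase); a real system would use a large dictionary.
WORDS = ["message", "massage", "mess", "manager", "buffer", "count", "counter", "number"]

print(candidates("msg", WORDS))  # ['massage', 'message']
print(disabbreviate("cnt", WORDS, lambda a, ranked: ranked[0] if ranked else None))  # count
```

The `msg` example shows why the abstract leaves final approval to the user: a purely mechanical rule ranks "massage" alongside "message", and only context or a person can tell which one is meant.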
