What is technical text?

Abstract: Beyond labeling it easier to process than other types, few researchers who use technical text in their work try to define what it is. This paper describes a study that investigates the character of texts typically considered technical. We identify 42 features of a text considered likely to correlate with its degree of technicality. These include both objectively verifiable measures, like marked presence of interrogative or imperative sentences, which are akin to the criteria used by Biber in Variation Across Speech and Writing, and subjective measures, such as presence of hierarchical organization. All are less ambiguous than technicality, so our inventory may be suited to use in a procedure that classifies text as technical or non-technical. An inventory organizing and describing these lexical, syntactic, semantic and discourse features was used to rate nine varied sample texts. Analysis of 22 ratings of each text indicated that 31 features in the inventory were meaningful predictors of text technicality when considered independently. The inventory has been revised and a formula to compute technicality has been developed in the light of these findings.
