Applying Machine Learning Toward an Automatic Classification of It

In the majority of cases, the pronoun 'it' illustrates nominal anaphora, referring back to another noun phrase in the text. However, in a significant minority of cases, the pronoun is used in exceptional ways that do not demonstrate strict nominal anaphora. Identifying these uses of 'it' is important in all fields where pronoun resolution has an impact. After a survey of previous treatments of the pronoun 'it' in the literature, features of instances of 'it' are proposed that can be used in a novel memory-based learning method to classify those instances automatically. On evaluation, the implemented system performs comparably to a rule-based system. With an extended training set, the accuracy of the system is expected to improve, offering greater coverage than rule-based methods.
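
To illustrate what memory-based classification of instances of 'it' can look like, here is a minimal sketch of a nearest-neighbour classifier using a simple overlap distance. It is not the system evaluated here: the three features (following word, its part-of-speech tag, position in the sentence), the labels `nominal` and `non-nominal`, and the stored examples are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(instance, memory, k=1):
    """Return the majority label among the k stored instances closest to `instance`."""
    ranked = sorted(memory, key=lambda item: overlap_distance(instance, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training memory: each entry is (features, label), where the
# features are (following word, its POS tag, position of 'it' in the sentence).
MEMORY = [
    (("is", "VBZ", "initial"), "non-nominal"),     # e.g. "It is raining."
    (("seems", "VBZ", "initial"), "non-nominal"),  # e.g. "It seems that ..."
    (("back", "RB", "medial"), "nominal"),         # e.g. "... and put it back."
    (("to", "TO", "medial"), "nominal"),           # e.g. "... gave it to her."
]

if __name__ == "__main__":
    # An unseen instance is labelled by its nearest stored neighbours.
    print(classify(("appears", "VBZ", "initial"), MEMORY, k=3))
```

Because the classifier simply stores examples and compares new instances against them, adding annotated instances to `MEMORY` extends its coverage without any rules having to be rewritten.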
