Defining and automatically identifying words in Chinese

I certify that I have read this dissertation and that in my opinion it meets the academic and professional standard required by the University as a dissertation for the degree of Doctor of Philosophy. Professor in charge of dissertation

ACKNOWLEDGMENTS

First of all, I would like to thank my committee members. Many thanks go to my advisor Peter Cole, without whose encouragement and support this dissertation would not have been possible. I owe Peter more than just the dissertation: I first learned syntactic argumentation from Peter and benefited immensely from his insistence on making one's arguments as clear as possible. I would also like to thank Martha Palmer, who not only gave me the opportunity to work on the Chinese Treebank Project at the University of Pennsylvania and got me started thinking about the issues discussed in this dissertation, but also guided me on the computational aspect of this dissertation. Thanks also go to Rolf Noyer, whose comments on the previous draft of this dissertation led to substantial improvements.
His comments also corrected my misinterpretations of certain parts of Distributed Morphology, the theoretical framework adopted in this dissertation. I learned a great deal from the advanced syntax seminars taught by Gaby Hermon. I benefited from Bill Idsardi's linguistic expertise in general, and I would like to thank him for his patience with me in my earlier years in the program, before I settled on a research program.

Besides my committee members, many people have contributed to my education here in the United States. I would like to thank …
