When the PRC was founded on mainland China and the KMT retreated to Taiwan in 1949, the relationship between mainland China and Taiwan became a classic Cold War case. Neither travel, visits, nor correspondence was allowed between the two populations until 1987, when the governments on both sides started to allow a small number of people from Taiwan with relatives in China to return for visits through a third location. Although the thaw eventually led to the frequent exchanges, direct travel links, and close commercial ties that exist between Taiwan and mainland China today, 38 years of total isolation allowed language use to develop into different varieties, which have become a popular topic for mainly lexical studies (e.g., Xu, 1995; Zeng, 1995; Wang & Li, 1996). Grammatical differences between these two variants, however, have not been well studied beyond anecdotal observation, partly because of the near identity of their grammatical systems. This paper focuses on light verb variations in the Mainland and Taiwan variants and finds that the light verbs of these two variants do show different distributional tendencies. Light verbs are chosen for two reasons: first, they are semantically bleached and hence more susceptible to change and variation; second, the classification of light verbs is a challenging topic in NLP. We hope our study will contribute to the study of light verbs in Chinese in general. The data for this study come from a comparable corpus extracted from the Chinese Gigaword Corpus and manually annotated with contextual features that may contribute to light verb variations. A multivariate analysis shows that for each light verb there is at least one context in which the two variants differ in tendency (usually the presence or absence of a tendency rather than contrasting tendencies) and can be differentiated. In addition, we carried out a K-Means clustering analysis of the variations, and the results are consistent with the multivariate analysis, i.e., the light verbs in the Mainland and Taiwan variants do vary, and the variations can be successfully differentiated.
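As a rough illustration of the clustering step, the sketch below runs a plain K-Means over light verb tokens encoded as binary contextual feature vectors. It is a minimal sketch under stated assumptions, not the setup used in the study: the three feature names, the six toy tokens, and the choice of initial centroids are all invented for illustration, and the abstract does not specify the actual feature set or toolkit.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def k_means(
    points: List[Vector],
    initial_centroids: List[Vector],
    max_iter: int = 100,
) -> Tuple[List[int], List[List[float]]]:
    """Plain K-Means with caller-supplied initial centroids (deterministic)."""
    centroids = [list(c) for c in initial_centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each token goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels, centroids


# Hypothetical binary features per token (names are assumptions):
# [has_aspect_marker, takes_event_noun_complement, takes_vp_complement]
tokens = [
    [1, 0, 0], [1, 0, 1], [1, 0, 0],  # invented "Mainland-like" tokens
    [0, 1, 1], [0, 1, 0], [0, 1, 1],  # invented "Taiwan-like" tokens
]
variants = ["ML", "ML", "ML", "TW", "TW", "TW"]

# Seed one centroid from each block of the toy data.
labels, centroids = k_means(tokens, [tokens[0], tokens[3]])

for variant, label in zip(variants, labels):
    print(variant, "-> cluster", label)
```

If the clusters found this way line up with the known variant labels, the contextual features carry enough signal to tell the variants apart, which is the sense in which clustering can corroborate a multivariate analysis.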