Sixth International Joint Conference on Natural Language Processing

This paper addresses the fast bootstrapping of a grapheme-to-phoneme (G2P) conversion system, a key module for both automatic speech recognition (ASR) and text-to-speech synthesis (TTS). The idea is to exploit language contact between a locally dominant language (Malay) and a very under-resourced language (Iban, spoken in Sarawak and in several parts of the island of Borneo) for which almost no resources or linguistic knowledge are available. More precisely, a pre-existing Malay G2P system is used to produce phoneme sequences for Iban words. These phoneme sequences are then manually post-edited (corrected) by a native Iban speaker. The resulting resource, produced in a semi-supervised fashion, is then used to train the first G2P system for the Iban language. As a by-product of this methodology, the analysis of the “pronunciation distance” between Malay and Iban sheds light on the phonological and orthographic relations between the two languages. The experiments show that a fairly effective Iban G2P system can be obtained after only two hours of post-editing (correction) of the output of the Malay G2P system applied to Iban words.
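The abstract does not say how the “pronunciation distance” is computed. One common way to quantify the gap between a G2P output and its post-edited correction is a phoneme-level edit distance, normalized into a phoneme error rate. The sketch below assumes that measure; the function names and the placeholder phoneme symbols are illustrative only and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # phoneme in ref missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_error_rate(pairs: List[Tuple[Sequence[str], Sequence[str]]]) -> float:
    """Total edits divided by total reference phonemes over (ref, hyp) pairs."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total = sum(len(ref) for ref, _ in pairs)
    return errors / total if total else 0.0


if __name__ == "__main__":
    # Placeholder symbols, not real Malay or Iban transcriptions.
    # ref = post-edited pronunciation, hyp = source-language G2P output.
    toy_pairs = [
        (["a", "b", "c"], ["a", "x", "c"]),  # one substitution
        (["a", "b"], ["a", "b", "c"]),       # one insertion
    ]
    print(phoneme_error_rate(toy_pairs))
```

Under this assumed measure, a lower rate on a word list would mean that the source-language G2P output needed fewer corrections from the post-editor.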
