Applying dynamic Bayesian networks in transliteration detection and generation

Peter Nabende is receiving his doctorate for research on methods that can improve machine translation programs. He investigated two systems for generating and comparing transliterations: a DBN (dynamic Bayesian network) model in which pair hidden Markov models are implemented, and a DBN model based on transduction. Nabende examined the effect of various DBN parameters on the quality of the transliterations produced. To evaluate the DBN models he used standard datasets for eleven language pairs: English-Arabic, English-Bengali, English-Chinese, English-German, English-French, English-Hindi, English-Kannada, English-Dutch, English-Russian, English-Tamil and English-Thai. During the research he also tried combining different models, which turned out to give good results.
