Machine Translation Using Semantic Web Technologies: A Survey

A large number of machine translation approaches have recently been developed to facilitate the fluid migration of content across languages. However, the literature suggests that many obstacles must still be addressed to achieve better automatic translations. One of these obstacles is lexical and syntactic ambiguity. A promising way of overcoming this problem is to use Semantic Web technologies. This article presents the results of a systematic review of machine translation approaches that rely on Semantic Web technologies for translating texts. Overall, our survey suggests that while Semantic Web technologies can enhance the quality of machine translation outputs for various problems, the combination of the two is still in its infancy.
