Introduction to the Special Issue on Cross-Language Algorithms and Applications

With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language, so that multilingual technologies can be created rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to a science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the included papers. The selected papers cover a broad range of cross-lingual technologies, including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval, and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in cross-lingual natural language processing.
