Morphosyntactic analysis and named entity classification in a Big Data environment

This work was funded by the projects HPCPLN - Ref:EM13/041 (Programa Emergentes, Xunta de Galicia), Celtic - Ref:2012-CE138, and Plastic - Ref:2013-CE298 (Programa Feder-Innterconecta).
