Making Virtue of Necessity: A Verb Lexicon

We describe the verb lexicon of OpenWordNet-PT, a wordnet-like resource for (mostly Brazilian) Portuguese, and a series of experiments that we designed to extend its coverage. These experiments include checking online lists of the most common verbs, checking freely available corpora such as Bosque-UD (the Bosque corpus annotated with Universal Dependencies), and especially checking a dictionary of Brazilian politicians' biographies (the DHBB), which we consider an ideal corpus for the kind of information extraction we are after. We succeeded in extending the coverage of the verb lexicon; however, it remains to be seen whether this new coverage is enough for the original application.
