Semantic Annotation and Retrieval of Parliamentary Content: A Case Study on the Spanish Congress of Deputies

In this paper, we present an ontology-based annotation and retrieval approach for parliamentary content, such as debate transcripts and law proposals. Drawing on several domain ontologies, semantic web technologies and information retrieval techniques, our approach extracts the topics, concepts and named entities (e.g., names of politicians and political parties) that appear in input documents. The domain ontologies were designed to support multilinguality and were built from the United Nations taxonomy of sustainable development goals. The approach was instantiated with a text corpus extracted from the Spanish Congress of Deputies and is being integrated into an e-government platform.
