Extraction of Career Profiles from Wikipedia

In this paper, we describe a system that gathers the work experience of a person from her or his Wikipedia page. We first extract an ontology of profession names from the Wikidata graph. We then parse the Wikipedia pages with a dependency parser and connect persons to professions by analyzing the parts of speech and dependency relations extracted from the text. Setting aside the dates, we computed recall and precision on a very limited, preliminary test set and reached a recall of 74% and a precision of 95%, which suggests that the approach is promising.
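
The pipeline summarized above can be illustrated with a minimal sketch. It is not the system's actual implementation: the `Token` structure, the Universal Dependencies style labels (`nsubj`, `cop`, `conj`, `flat`), the three-word `PROFESSIONS` set standing in for the Wikidata ontology, and the hand-annotated example sentence are all assumptions made for illustration. The sketch links a person to a profession when a profession noun is the predicate of a copular clause whose subject is a proper noun, and then scores the extracted pairs against a gold set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int      # 1-based position in the sentence
    form: str     # surface form
    lemma: str    # lemma, matched against the profession list
    pos: str      # coarse part-of-speech tag (assumed UD-style)
    head: int     # index of the governor, 0 for the root
    deprel: str   # dependency label (assumed UD-style)


# Hypothetical stand-in for the profession names extracted from Wikidata.
PROFESSIONS = {"physicist", "chemist", "engineer"}


def children(tokens, head_idx, deprel):
    """Return the dependents of head_idx that carry the given label."""
    return [t for t in tokens if t.head == head_idx and t.deprel == deprel]


def extract_professions(tokens, professions=PROFESSIONS):
    """Return (person, profession) pairs found in one parsed sentence."""
    by_idx = {t.idx: t for t in tokens}
    pairs = set()
    for tok in tokens:
        if tok.pos != "NOUN" or tok.lemma not in professions:
            continue
        # A coordinated noun ("... and chemist") shares the subject
        # of the first conjunct, so look at that conjunct instead.
        pred = by_idx[tok.head] if tok.deprel == "conj" else tok
        if not children(tokens, pred.idx, "cop"):
            continue  # keep only copular clauses ("X was a Y")
        for subj in children(tokens, pred.idx, "nsubj"):
            if subj.pos != "PROPN":
                continue
            # Rebuild a multiword name from the subject and its "flat" parts.
            parts = [subj] + children(tokens, subj.idx, "flat")
            name = " ".join(t.form for t in sorted(parts, key=lambda t: t.idx))
            pairs.add((name, tok.lemma))
    return pairs


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted pairs against gold pairs."""
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


# Hand-annotated toy parse of "Marie Curie was a physicist and chemist."
SENTENCE = [
    Token(1, "Marie", "Marie", "PROPN", 5, "nsubj"),
    Token(2, "Curie", "Curie", "PROPN", 1, "flat"),
    Token(3, "was", "be", "AUX", 5, "cop"),
    Token(4, "a", "a", "DET", 5, "det"),
    Token(5, "physicist", "physicist", "NOUN", 0, "root"),
    Token(6, "and", "and", "CCONJ", 7, "cc"),
    Token(7, "chemist", "chemist", "NOUN", 5, "conj"),
    Token(8, ".", ".", "PUNCT", 5, "punct"),
]

if __name__ == "__main__":
    found = extract_professions(SENTENCE)
    print(sorted(found))
```

A single copular pattern like this one favors precision over recall, since professions expressed through other constructions are missed. In practice the token list would come from a tagger and dependency parser rather than being written by hand, and the profession set would be loaded from the extracted ontology.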
