Creating Personal Histories from the Web Using Namesake Disambiguation and Event Extraction

We have developed a system for gathering information from the Web, using it to create a personal history, and presenting it as a chronological table. It simplifies the task of sorting out the information for various namesakes and dealing with information in widely scattered sources. The system comprises five components: namesake disambiguation, date expression extraction, date expression normalization and completion, relevant information extraction, and chronological table generation.

[1]  Xiaojun Wan,et al.  Person resolution in person search results: WebHawk , 2005, CIKM '05.

[2]  George Karypis,et al.  Evaluation of hierarchical clustering algorithms for document datasets , 2002, CIKM '02.

[3]  Pat Langley,et al.  Estimating Continuous Distributions in Bayesian Classifiers , 1995, UAI.

[4]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[5]  J. Ross Quinlan,et al.  C4.5: Programs for Machine Learning , 1992 .

[6]  David W. Embley,et al.  Grouping search-engine returned citations for person-name queries , 2004, WIDM '04.

[7]  Nitesh V. Chawla,et al.  SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[8]  Masao Fuketa,et al.  A method for understanding time expressions , 1998, SMC'98 Conference Proceedings. 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.98CH36218).