A probabilistic relational model for keyword extraction

A large portion of real-world data is stored in relational form. In contrast, most statistical learning methods work only with “flat” data representations. Probabilistic relational models (PRMs) are one class of models that takes the characteristics of relational data into account. PRMs allow the properties of an object to depend probabilistically both on other properties of that object and on properties of related objects. This paper addresses keyword extraction. Keywords are not only essential for academic papers but also important for web page retrieval, text mining, and document classification. We present an algorithm, implemented in C++, that generates and extracts keywords, and we apply it to a poetry book to obtain experimental results. The results indicate that “Love”, “Heart”, and “Eyes” are the most important keywords of the selected book.
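
The abstract does not describe the algorithm itself, so the following is only a minimal sketch of what keyword ranking over a text can look like in C++. It is a plain word-frequency baseline, not the PRM-based method of the paper. The function names `tokenize` and `top_keywords`, and the idea of passing a caller-supplied stop-word set, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split text into lowercase alphabetic tokens.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            current.push_back(static_cast<char>(std::tolower(ch)));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Hypothetical baseline: rank words by raw frequency, skipping stop words.
// This is NOT a probabilistic relational model; it only counts occurrences.
std::vector<std::pair<std::string, int>> top_keywords(
        const std::string& text,
        const std::set<std::string>& stop_words,
        std::size_t k) {
    std::map<std::string, int> counts;
    for (const std::string& word : tokenize(text)) {
        if (stop_words.count(word) == 0) ++counts[word];
    }
    std::vector<std::pair<std::string, int>> ranked(counts.begin(), counts.end());
    // Stable sort keeps alphabetical order among words with equal counts.
    std::stable_sort(ranked.begin(), ranked.end(),
        [](const std::pair<std::string, int>& a,
           const std::pair<std::string, int>& b) {
            return a.second > b.second;
        });
    if (ranked.size() > k) ranked.resize(k);
    return ranked;
}
```

Calling `top_keywords(book_text, stop_words, 3)` on the full text of a book would return the three most frequent non-stop words with their counts. A relational model would go further by letting a word's importance depend on properties of related objects, such as the poems or sections in which it occurs.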
