A Novel Approach to Keyword Extraction for Contextual Advertising

Online advertising has become one of the major revenue sources for today's Internet companies. Among the different advertising channels, contextual advertising accounts for a large share. Many studies have addressed the keyword extraction problem in contextual advertising for English; however, little work has been done for Chinese, which differs substantially from English linguistically. In this paper, we focus on the problem of Chinese advertising keyword extraction and propose a novel approach that treats it as a classification task. We adopt C4.5 as the classifier and select features that take Chinese linguistic characteristics into account. The experimental results indicate that our approach is promising.
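
To make the classification view concrete, the following is a minimal sketch of the gain ratio criterion that C4.5 uses to choose which feature to split on, applied to candidate keywords labeled as keyword (1) or non-keyword (0). The feature names `in_title` and `is_noun` and the toy data are hypothetical illustrations, not the feature set or data used in the paper, and the sketch omits C4.5's handling of continuous attributes and pruning.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """Gain ratio of a discrete feature: information gain / split information."""
    total = len(rows)
    # Partition the labels by the value the feature takes in each row.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Expected entropy remaining after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes features with many small partitions.
    split_info = -sum(len(g) / total * math.log2(len(g) / total)
                      for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical candidate terms described by two assumed binary features.
candidates = [
    {"in_title": True,  "is_noun": True},
    {"in_title": True,  "is_noun": True},
    {"in_title": True,  "is_noun": False},
    {"in_title": False, "is_noun": True},
    {"in_title": False, "is_noun": False},
    {"in_title": False, "is_noun": True},
]
is_keyword = [1, 1, 1, 0, 0, 0]  # assumed labels: 1 = advertising keyword

scores = {f: gain_ratio(candidates, is_keyword, f)
          for f in ("in_title", "is_noun")}
best_feature = max(scores, key=scores.get)
```

In this toy data `in_title` separates the two classes perfectly, so it would be selected as the root split; a full decision tree learner would then recurse on each partition.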
