Extracting query modifications from nonlinear SVMs

When searching the Web, users often want results restricted to a particular document category. Ideally, a user could filter results with a text classifier to minimize false positives; however, current search engines allow only simple query modifications. To automate the generation of effective query modifications, we introduce a method, based on sensitivity analysis, for extracting rules from nonlinear support vector machines. The proposed method lets the user specify a desired precision while attempting to maximize recall. It performs several levels of dimensionality reduction and is far faster than exhaustively searching the space of feature combinations; moreover, it is effective on real-world data.
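The general idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the finite-difference sensitivity measure, the hand-picked (untrained) RBF "support vectors", the toy documents, the term names, and every function name below are assumptions made purely for illustration. The sketch ranks binary term features by how much switching them on raises the classifier's decision value, keeps only the top few (a crude form of dimensionality reduction), and then picks the conjunctive query with the highest recall among those that meet a precision target.

```python
import math
from itertools import combinations

# Hypothetical vocabulary and toy labeled documents (binary term presence).
TERMS = ["home", "page", "welcome", "the"]
DOCS = [
    [1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 1, 0],   # in-category documents
    [0, 1, 0, 1], [0, 0, 0, 1], [1, 0, 0, 1],   # out-of-category documents
]
LABELS = [1, 1, 1, 0, 0, 0]

# Stand-in for a trained nonlinear SVM: hand-picked coefficients, not fitted.
SUPPORT_VECTORS = DOCS
COEFFS = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
BIAS = 0.0
GAMMA = 0.5


def decide(x):
    """RBF-kernel decision value: sum_i coeff_i * exp(-gamma * ||sv_i - x||^2) + bias."""
    total = BIAS
    for sv, coeff in zip(SUPPORT_VECTORS, COEFFS):
        sq_dist = sum((a - b) ** 2 for a, b in zip(sv, x))
        total += coeff * math.exp(-GAMMA * sq_dist)
    return total


def sensitivity(feature, docs, decision_fn):
    """Mean change in the decision value when `feature` goes from absent to present."""
    total = 0.0
    for x in docs:
        present, absent = list(x), list(x)
        present[feature], absent[feature] = 1, 0
        total += decision_fn(present) - decision_fn(absent)
    return total / len(docs)


def precision_recall(rule, docs, labels):
    """Score `rule` (a tuple of feature indices) as a conjunctive query."""
    matched = [y for x, y in zip(docs, labels) if all(x[f] for f in rule)]
    hits, positives = sum(matched), sum(labels)
    precision = hits / len(matched) if matched else 0.0
    recall = hits / positives if positives else 0.0
    return precision, recall


def extract_rule(docs, labels, decision_fn, min_precision, top_k=3, max_terms=2):
    """Return the highest-recall conjunction of sensitive terms meeting the precision target."""
    features = range(len(docs[0]))
    ranked = sorted(features, key=lambda f: sensitivity(f, docs, decision_fn),
                    reverse=True)[:top_k]
    best, best_recall = None, -1.0
    for size in range(1, max_terms + 1):
        for rule in combinations(ranked, size):
            precision, recall = precision_recall(rule, docs, labels)
            if precision >= min_precision and recall > best_recall:
                best, best_recall = rule, recall
    return best


if __name__ == "__main__":
    for target in (0.75, 0.9):
        rule = extract_rule(DOCS, LABELS, decide, min_precision=target)
        query = " AND ".join(TERMS[f] for f in rule) if rule else "(no rule found)"
        print(f"precision >= {target}: {query}")
```

On this toy data the sketch selects the single term `welcome` at a precision target of 0.9 and the broader term `home` at 0.75, which shows the trade-off the abstract describes: relaxing the precision requirement admits rules with higher recall. Restricting the search to the `top_k` most sensitive features is what keeps the number of candidate conjunctions small compared with enumerating all feature combinations.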
