Binary Feature Selection and Integration in Specialized Search Engines

We present a methodology for the rapid implementation of specialized search engines. To catalog data, these search engines interpret and classify the content of web material to identify different representations of common domain-related elements. While designers can typically develop multiple partial solutions for interpreting the data, acceptable relevance determination requires the appropriate integration of all of these solutions. We present a method for automatically integrating such partial solutions in a Bayesian framework. The Bayesian framework produces a search engine where each user can control the false alarm rate in an intuitive yet rigorous fashion. We discuss the use of this technique in the construction of DEADLINER, a search engine that catalogs conference and seminar material found on the web.
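
To make the idea of Bayesian integration with a user-controlled false alarm rate concrete, the following is a minimal sketch and not the method used in DEADLINER. It assumes that each partial solution is a binary detector, that detectors are conditionally independent given relevance (a naive Bayes simplification), and that their firing rates are known. The detector rates, the names DETECTORS, log_likelihood_ratio, operating_points, and threshold_for are all hypothetical and chosen for illustration only.

```python
import math
from itertools import product

# Hypothetical detector statistics (illustrative values, not from the paper):
# each pair is (P(fires | relevant), P(fires | irrelevant)).
DETECTORS = [(0.9, 0.2), (0.7, 0.1), (0.6, 0.3)]


def log_likelihood_ratio(bits, detectors=DETECTORS):
    """Combine binary detector outputs into one score.

    Assumes conditional independence, so the joint log-likelihood ratio
    is the sum of the per-detector log-likelihood ratios.
    """
    llr = 0.0
    for bit, (p_rel, p_irr) in zip(bits, detectors):
        if bit:
            llr += math.log(p_rel / p_irr)
        else:
            llr += math.log((1 - p_rel) / (1 - p_irr))
    return llr


def operating_points(detectors=DETECTORS):
    """List (threshold, false_alarm_rate, detection_rate) triples.

    Every output pattern is enumerated and sorted by decreasing score.
    Accepting all patterns at or above a given score yields the
    cumulative rates reported at that entry.
    """
    patterns = []
    for bits in product([0, 1], repeat=len(detectors)):
        p_rel = p_irr = 1.0
        for bit, (a, b) in zip(bits, detectors):
            p_rel *= a if bit else 1 - a
            p_irr *= b if bit else 1 - b
        patterns.append((log_likelihood_ratio(bits, detectors), p_rel, p_irr))
    patterns.sort(key=lambda t: t[0], reverse=True)

    points, detection, false_alarm = [], 0.0, 0.0
    for llr, p_rel, p_irr in patterns:
        detection += p_rel
        false_alarm += p_irr
        points.append((llr, false_alarm, detection))
    return points


def threshold_for(max_false_alarm, detectors=DETECTORS):
    """Pick the most permissive threshold within the false alarm budget.

    Returns (threshold, false_alarm_rate, detection_rate), or None if
    even the strictest threshold exceeds the budget.
    """
    best = None
    for point in operating_points(detectors):
        if point[1] <= max_false_alarm + 1e-12:
            best = point
        else:
            break
    return best


if __name__ == "__main__":
    choice = threshold_for(0.05)
    if choice is not None:
        threshold, fa, det = choice
        print(f"threshold={threshold:.3f} false_alarm={fa:.3f} detection={det:.3f}")
```

In this sketch a document is accepted when its combined score is at or above the chosen threshold, so a user who tolerates more false alarms simply moves to a lower threshold and gains detections. The enumeration is exponential in the number of detectors and is only meant for a handful of them.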