A probabilistic approach for mapping free-text queries to complex web forms

Web applications with complex interfaces consisting of multiple input fields should understand free-text queries. We propose a probabilistic approach to map parts of a free-text query to the fields of a complex web form. Our method uses token models rather than only static dictionaries to create this mapping, offering greater flexibility and requiring less domain knowledge than existing systems. We evaluate different implementations of our mapping model and show that our system effectively maps free-text queries without using a dictionary. If a dictionary is available, the performance increases and is significantly better than a rule-based baseline.
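The idea of mapping query parts to form fields with token models can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a "token model" is a smoothed distribution over coarse token shapes per field, that each token is mapped independently, and that fields have uniform priors. The field names (`city`, `time`, `date`), the training values, and all function names are hypothetical.

```python
import re
from collections import Counter, defaultdict


def shape(token):
    """Collapse a token to a coarse shape, e.g. 'Utrecht' -> 'Aa', '10:30' -> '9:9'."""
    token = re.sub(r"[A-Z]+", "A", token)
    token = re.sub(r"[a-z]+", "a", token)
    return re.sub(r"[0-9]+", "9", token)


def train(examples):
    """Count token shapes per field from (field, example value) pairs."""
    counts = defaultdict(Counter)
    for field, value in examples:
        for token in value.split():
            counts[field][shape(token)] += 1
    return dict(counts)


def token_prob(model, field, token, alpha=0.1):
    """Additively smoothed estimate of P(shape(token) | field)."""
    shapes = {s for counter in model.values() for s in counter}
    total = sum(model[field].values())
    # One extra slot in the denominator reserves mass for unseen shapes.
    return (model[field][shape(token)] + alpha) / (total + alpha * (len(shapes) + 1))


def map_query(model, query):
    """Assign each query token to its most probable field (uniform field prior)."""
    mapping = {}
    for token in query.split():
        best = max(model, key=lambda f: token_prob(model, f, token))
        mapping.setdefault(best, []).append(token)
    return mapping


# Hypothetical example values; no dictionary of known cities is consulted.
EXAMPLES = [
    ("city", "Amsterdam"), ("city", "Utrecht"), ("city", "Rotterdam"),
    ("time", "10:30"), ("time", "08:15"),
    ("date", "2011-05-01"), ("date", "2011-12-24"),
]

model = train(EXAMPLES)
print(map_query(model, "Enschede 12:45 2011-06-06"))
```

Because the model scores token shapes rather than looking up known values, "Enschede" is assigned to the city field even though it never appears in the example values. A dictionary, where available, could be added as a further source of evidence per field.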
