Searching in the real world

Many consider "searching" a mostly solved problem, and for text processing this belief is well founded. Most real-world search applications, however, involve "complex documents", and search over such documents is far from solved. Complex documents, or less formally "real-world documents", comprise a mixture of images, text, signatures, tables, logos, watermarks, stamps, etc., and are often available only as scanned hard copies. Search systems for such document collections are currently unavailable.
