Information retrieval from historical newspaper collections in highly inflectional languages: A query expansion approach
Kimmo Kettunen | Eero Sormunen | Anni Järvelin | Heikki Keskustalo | Miamaria Saastamoinen
[1] Klaus U. Schulz,et al. Information Access to Historical Documents from the Early New High German Period , 2006, Digital Historical Corpora.
[2] Lon-Mu Liu,et al. Adaptive post-processing of OCR text via knowledge acquisition , 1991, CSC '91.
[3] Norbert Fuhr,et al. Retrieval in text collections with historic spelling using linguistic and spelling variants , 2007, JCDL '07.
[4] Gonzalo Navarro,et al. A guided tour to approximate string matching , 2001, CSUR.
[5] Rose Holley,et al. How Good Can It Get? Analysing and Improving OCR Accuracy in Large Scale Historic Newspaper Digitisation Programs , 2009, D Lib Mag.
[6] James Mayfield,et al. Character N-Gram Tokenization for European Language Text Retrieval , 2004, Information Retrieval.
[7] Kalervo Järvelin,et al. Frequency-based identification of correct translation equivalents (FITE) obtained through transformation rules , 2007, TOIS.
[8] Peter Willett,et al. Applications of n-grams in textual information systems , 1998, J. Documentation.
[9] Dawn Archer,et al. Travelling through time with corpus annotation software , 2008 .
[10] Peter Willett,et al. A Comparison of Spelling-Correction Methods for the Identification of Word Forms in Historical Text Databases , 1993 .
[11] Jaana Kekäläinen,et al. The Co-Effects of Query Structure and Expansion on Retrieval Performance in Probabilistic Text Retrieval , 2004, Information Retrieval.
[12] Eero Sormunen,et al. Liberal relevance criteria of TREC: counting on negligible documents? , 2002, SIGIR '02.
[13] Kalervo Järvelin,et al. s-grams: Defining generalized n-grams for information retrieval , 2007, Inf. Process. Manag.
[14] M. de Rijke,et al. A Cross-Language Approach to Historic Document Retrieval , 2006, ECIR.
[15] Anni Järvelin,et al. Comparison of s-gram Proximity Measures in Out-of-Vocabulary Word Translation , 2008, SPIRE.
[16] Julian R. Ullmann,et al. A Binary n-Gram Technique for Automatic Correction of Substitution, Deletion, Insertion and Reversal Errors in Words , 1977, Comput. J.
[17] W. Bruce Croft,et al. Probabilistic Retrieval of OCR Degraded Text Using N-Grams , 1997, ECDL.
[18] Kimmo Kettunen. Managing word form variation of text retrieval in practice – Why language technology is not the only cure for better IR performance? , 2013 .
[19] Ismo Raitanen. "Etsikäät hywää ja älläät pahaa." Tiedonhakumenetelmien tuloksellisuuden vertailu merkkivirheitä sisältävässä historiallisessa sanomalehtikokoelmassa , 2012 .
[20] Ellen M. Voorhees,et al. The TREC-5 Confusion Track: Comparing Retrieval Methods for Scanned Text , 2000, Information Retrieval.
[21] Paul McNamee,et al. Using Syllables As Indexing Terms in Full-Text Information Retrieval , 2010, Baltic HLT.
[22] J. Mollon,et al. Comparison at a Distance , 2003, Perception.
[23] Kalervo Järvelin,et al. Non-adjacent Digrams Improve Matching of Cross-Lingual Spelling Variants , 2003, SPIRE.
[24] Hartmut Walravens. A Nordic Digital Newspaper Library , 2006 .
[25] Turid Hedlund,et al. Dictionary-Based Cross-Language Information Retrieval: Learning Experiences from CLEF 2000–2002 , 2004, Information Retrieval.
[26] Gary Marchionini,et al. Examining the effectiveness of real-time query expansion , 2007, Inf. Process. Manag.
[27] Majlis Bremer-Laamanen. The nordic digital newspaper library , 2001 .
[28] Kalervo Järvelin,et al. Targeted s-gram matching: a novel n-gram matching technique for cross- and mono-lingual word form variants , 2002, Inf. Res.
[29] Kimmo Kettunen,et al. Is a Morphologically Complex Language Really that Complex in Full-Text Retrieval? , 2006, FinTAL.
[30] Norbert Fuhr,et al. Generating Search Term Variants for Text Collections with Historic Spellings , 2006, ECIR.
[31] Kalervo Järvelin,et al. A Dictionary- and Corpus-Independent Statistical Lemmatizer for Information Retrieval in Low Resource Languages , 2010, CLEF.
[32] Ida G. Sprinkhuizen-Kuyper,et al. Information Retrieval from Historical Corpora , 2002 .
[33] Eric C. Jensen,et al. A Survey of Retrieval Strategies for OCR Text Collections , 2002 .
[34] Alexander M. Robertson,et al. Word Variant Identification in Old French , 1997, Inf. Res.
[35] Ari Pirkola,et al. The effects of query structure and dictionary setups in dictionary-based cross-language information retrieval , 1998, SIGIR '98.
[36] Falk Scholer,et al. Metric and Relevance Mismatch in Retrieval Evaluation , 2009, AIRS.
[37] James Mayfield,et al. Addressing morphological variation in alphabetic languages , 2009, SIGIR.
[38] Kimmo Kettunen. Reductive and generative approaches to management of morphological variation of keywords in monolingual information retrieval: An overview , 2009, J. Documentation.
[39] Kimmo Koskenniemi,et al. A General Computational Model for Word-Form Recognition and Production , 1984 .
[40] Kalervo Järvelin,et al. Restricted inflectional form generation in management of morphological keyword variation , 2007, Information Retrieval.
[41] Klaus U. Schulz,et al. Towards information retrieval on historical document collections: the role of matching procedures and special lexica , 2010, International Journal on Document Analysis and Recognition (IJDAR).
[42] Jacques Savoy,et al. Comparative information retrieval evaluation for scanned documents , 2011 .
[43] Kimmo Kettunen,et al. Does dictionary based bilingual retrieval work in a non-normalized index? , 2009, Inf. Process. Manag.
[44] Anni Järvelin,et al. Dictionary-independent translation in CLIR between closely related languages , 2006 .
[45] Dawn Archer,et al. The Identification of Spelling Variants in English and German Historical Texts: Manual or Automatic? , 2008, Lit. Linguistic Comput.
[46] Michele Flammini,et al. Improved Stable Retrieval in Noisy Collections , 2011, ICTIR.
[47] Jaana Kekäläinen,et al. Cumulated gain-based evaluation of IR techniques , 2002, TOIS.
[48] Kalervo Järvelin,et al. Frequent Case Generation in Ad Hoc Retrieval of Three Indian Languages - Bengali, Gujarati and Marathi , 2011, FIRE.
[49] Wolfram Luther,et al. Comparison of distance measures for historical spelling variants , 2006, IFIP AI.
[50] Kazem Taghva,et al. Results of applying probabilistic IR to OCR text , 1994, SIGIR '94.
[51] Riitta Alkula. From Plain Character Strings to Meaningful Words: Producing Better Full Text Databases for Inflectional and Compounding Languages with Morphological Analysis Software , 2004, Information Retrieval.
[52] Eero Sormunen,et al. A Method for Measuring Wide Range Performance of Boolean Queries in Full-Text Databases , 2000 .
[53] Mandar Mitra,et al. Information Retrieval from Documents: A Survey , 2000, Information Retrieval.
[54] James Allan,et al. Automatic Query Expansion Using SMART: TREC 3 , 1994, TREC.
[55] Norbert Fuhr,et al. Rule-based Search in Text Databases with Nonstandard Orthography , 2006, Lit. Linguistic Comput.