Automatic indexing of scanned documents: a layout-based approach
Alexander Schill | Daniel Schuster | Michael Berger | Daniel Esser | Klemens Muthmann
[1] Tansel Özyer, et al. Employing Clustering Techniques for Automatic Information Extraction From HTML Documents, 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[2] Jan-Ming Ho, et al. Discovering informative content blocks from Web documents, 2002, KDD.
[3] Eric Saund. Scientific challenges underlying production document processing, 2011, Electronic Imaging.
[4] Hwee Tou Ng, et al. Closing the Gap: Learning-Based Information Extraction Rivaling Knowledge-Engineering Methods, 2003, ACL.
[5] Sriram Raghavan, et al. Avatar Information Extraction System, 2006, IEEE Data Eng. Bull.
[6] Wei-Ying Ma, et al. Extracting Content Structure for Web Pages Based on Visual Representation, 2003, APWeb.
[7] Valter Crescenzi, et al. RoadRunner: Towards Automatic Data Extraction from Large Web Sites, 2001, VLDB.
[8] Fabrizio Sebastiani, et al. Machine learning in automated text categorization, 2001, CSUR.
[9] Rohit J. Kate, et al. Comparative experiments on learning information extractors for proteins and their interactions, 2005, Artif. Intell. Medicine.
[10] Jianying Hu, et al. Comparison and Classification of Documents Based on Layout Similarity, 2000, Information Retrieval.
[11] Anastasia Ailamaki, et al. Challenges in building a DBMS Resource Advisor, 2006, IEEE Data Eng. Bull.
[12] Li Zhang, et al. Focused named entity recognition using machine learning, 2004, SIGIR '04.