Analysis of Documents Born Digital
Ying Liu | Jianying Hu
[1] Barry Smyth,et al. Fact or Fiction: Content Classification for Digital Libraries , 2001, DELOS.
[2] Kun Bai,et al. TableSeer: automatic table metadata extraction and searching in digital libraries , 2007, JCDL '07.
[3] Deepayan Chakrabarti,et al. A graph-theoretic approach to webpage segmentation , 2008, WWW.
[4] Tamir Hassan,et al. Object-level document analysis of PDF files , 2009, DocEng '09.
[5] David F. Brailsford,et al. Document analysis of PDF files: methods, results and implications , 1995 .
[6] David F. Brailsford,et al. Towards structured, block-based PDF , 1995 .
[7] Matti Pietikäinen,et al. Page Segmentation and Zone Classification: The State of the Art , 1999 .
[8] Din-Chang Tseng,et al. Overlapped-character separation and reconstruction for table-form documents , 1996, Proceedings of 3rd IEEE International Conference on Image Processing.
[9] David W. Embley,et al. Conceptual-Model-Based Data Extraction from Multiple-Record Web Pages , 1999, Data Knowl. Eng..
[10] Anil K. Jain,et al. Automatic text location in images and video frames , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).
[11] Apostolos Antonacopoulos,et al. Colour text segmentation in web images based on human perception , 2007, Image Vis. Comput..
[12] Wolfgang Gatterbauer,et al. Table Extraction Using Spatial Reasoning on the CSS2 Visual Box Model , 2006, AAAI.
[13] Stephen Soderland,et al. Learning Information Extraction Rules for Semi-Structured and Free Text , 1999, Machine Learning.
[14] Yalin Wang,et al. Detecting Tables in HTML Documents , 2002, Document Analysis Systems.
[15] Simone Marinai,et al. Metadata Extraction from PDF Papers for Digital Library Ingest , 2009, 2009 10th International Conference on Document Analysis and Recognition.
[16] Berthier A. Ribeiro-Neto,et al. A brief survey of web data extraction tools , 2002, SGMD.
[17] Gunther Wyszecki,et al. Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd Edition , 2000 .
[18] Brad Adelberg,et al. NoDoSE—a tool for semi-automatically extracting structured and semistructured data from text documents , 1998, SIGMOD '98.
[19] Anthony G. Cohn,et al. Qualitative Spatial Representation and Reasoning Techniques , 1997, KI.
[20] Kathleen R. McKeown,et al. Columbia multi-document summarization: Approach and evaluation , 2001 .
[21] Andreas Dengel,et al. Document Analysis Systems VI , 2004, Lecture Notes in Computer Science.
[22] Wei-Ying Ma,et al. Extracting Content Structure for Web Pages Based on Visual Representation , 2003, APWeb.
[23] Jian Fan,et al. Layout and Content Extraction for PDF Documents , 2004, Document Analysis Systems.
[24] G. Grisetti,et al. Further Reading , 1984, IEEE Spectrum.
[25] Yalin Wang,et al. Statistical-based approach to word segmentation , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.
[26] Wolfgang Gatterbauer,et al. Using visual cues for extraction of tabular data from arbitrary HTML documents , 2005, WWW '05.
[27] Neha Gupta,et al. A Heuristic Approach for Web Content Extraction , 2011 .
[28] Tamir Hassan. User-Guided Wrapping of PDF Documents Using Graph Matching Techniques , 2009, 2009 10th International Conference on Document Analysis and Recognition.
[29] Jun Kong,et al. Spatial graph grammars for graphical user interfaces , 2006, TCHI.
[30] Wei-Ying Ma,et al. 2D Conditional Random Fields for Web information extraction , 2005, ICML.
[31] Ping Luo,et al. Web article extraction for web printing: a DOM+visual based approach , 2009, DocEng '09.
[32] Xing Xie,et al. Adapting Web pages for small-screen devices , 2005, IEEE Internet Computing.
[33] Katharina Kaiser,et al. pdf2table: A Method to Extract Table Information from PDF Files , 2005, IICAI.
[34] Maurizio Rigamonti,et al. Xed: a new tool for extracting hidden structures from electronic documents , 2004, First International Workshop on Document Image Analysis for Libraries, 2004. Proceedings..
[35] Massimo Ruffolo,et al. PDF-TREX: An Approach for Recognizing and Extracting Tables from PDF Documents , 2009, 2009 10th International Conference on Document Analysis and Recognition.
[36] Altigran S. da Silva,et al. A brief survey of web data extraction tools , 2002 .
[37] Altigran S. da Silva,et al. DEByE - Data extraction by example , 2002 .
[38] Bing Liu,et al. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data , 2006, Data-Centric Systems and Applications.
[39] Pinar Duygulu Sahin,et al. A hierarchical representation of form documents for identification and retrieval , 2002, International Journal on Document Analysis and Recognition.
[40] Anjo Anjewierden. AIDAS: incremental logical structure discovery in PDF documents , 2001, Proceedings of Sixth International Conference on Document Analysis and Recognition.
[41] Raymond J. Mooney,et al. Relational Learning of Pattern-Match Rules for Information Extraction , 1999, CoNLL.
[42] Bon K. Sy,et al. A Theoretical Foundation and a Method for Document Table Structure Extraction and Decomposition , 2002, Document Analysis Systems.
[43] Matthew Hurst,et al. Layout and Language: Challenges for Table Understanding on the Web , 2001 .
[44] W D Wright,et al. Color Science, Concepts and Methods. Quantitative Data and Formulas , 1967 .
[45] Nicholas Kushmerick,et al. Wrapper induction: Efficiency and expressiveness , 2000, Artif. Intell..
[46] Robert P. Futrelle,et al. Extraction, layout analysis and classification of diagrams in PDF documents , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..
[47] Dimosthenis A. Karatzas,et al. Text segmentation in web images using colour perception and topological features , 2003 .
[48] Apostolos Antonacopoulos,et al. Web Document Analysis: Challenges and Opportunities , 2003 .
[49] Daniel P. Lopresti,et al. Locating and Recognizing Text in WWW Images , 2000, Information Retrieval.
[50] J. Cordy,et al. A Survey of Table Recognition: Models, Observations, Transformations, and Inferences , 2003 .
[51] Nir Ailon,et al. Aggregating inconsistent information: Ranking and clustering , 2008 .