Machine Learning Techniques for Document Processing and Web Security
[1] Herbert Bos,et al. Ruler: high-speed packet matching and rewriting on NPUs , 2007, ANCS '07.
[2] Niels Provos,et al. All Your iFRAMEs Point to Us , 2008, USENIX Security Symposium.
[3] Kalyanmoy Deb,et al. A fast and elitist multiobjective genetic algorithm: NSGA-II , 2002, IEEE Trans. Evol. Comput..
[4] Massimo Ruffolo,et al. XONTO: An Ontology-Based System for Semantic Information Extraction from PDF Documents , 2008, 2008 20th IEEE International Conference on Tools with Artificial Intelligence.
[5] Jan P. Allebach,et al. Document visual similarity measure for document search , 2011, DocEng '11.
[6] Wei Liu,et al. ViDE: A Vision-Based Approach for Deep Web Data Extraction , 2010, IEEE Transactions on Knowledge and Data Engineering.
[7] Elio Masciari,et al. A Fuzzy Logic Approach to Wrapping PDF Documents , 2011, IEEE Transactions on Knowledge and Data Engineering.
[8] Jayant Madhavan,et al. Harvesting Relational Tables from Lists on the Web , 2009, Proc. VLDB Endow..
[9] Juliana Freire,et al. Organizing Hidden-Web Databases by Clustering Visible Web Documents , 2007, 2007 IEEE 23rd International Conference on Data Engineering.
[10] Karen Kukich,et al. Techniques for automatically correcting words in text , 1992, CSUR.
[11] Richard M. Schwartz,et al. Named Entity Extraction from Noisy Input: Speech and OCR , 2000, ANLP.
[12] Shui-Lung Chuang,et al. Context-Aware Wrapping: Synchronized Data Extraction , 2007, VLDB.
[13] Michael Wick,et al. Context-Sensitive Error Correction: Using Topic Models to Improve OCR , 2007 .
[14] Sriram Raghavan,et al. Regular Expression Learning for Information Extraction , 2008, EMNLP.
[15] Eric Medvet,et al. The Reaction Time to Web Site Defacements , 2009, IEEE Internet Computing.
[16] Andreas Dengel,et al. Seizing the Treasure: Transferring Knowledge in Invoice Analysis , 2009, 2009 10th International Conference on Document Analysis and Recognition.
[17] Tyler Moore,et al. Measuring and Analyzing Search-Redirection Attacks in the Illicit Online Prescription Drug Trade , 2011, USENIX Security Symposium.
[18] Chih-Jen Lin,et al. Probability Estimates for Multi-class Classification by Pairwise Coupling , 2003, J. Mach. Learn. Res..
[19] María Dolores Rodríguez-Moreno,et al. Automatic Web Data Extraction Based on Genetic Algorithms and Regular Expressions , 2009, Data Mining and Multi-agent Integration.
[20] Brian J. Ross,et al. Probabilistic Pattern Matching and the Evolution of Stochastic Regular Expressions , 2000, Applied Intelligence.
[21] Alon Y. Halevy,et al. Data Integration for the Relational Web , 2009, Proc. VLDB Endow..
[22] C. M. Sperberg-McQueen,et al. Extensible Markup Language (XML) , 1997, World Wide Web J..
[23] Sargur N. Srihari,et al. Experiments in Text Recognition with Binary n-Gram and Viterbi Algorithms , 1982, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] Jeffrey E. F. Friedl. Mastering Regular Expressions , 1997 .
[25] Tae-Hoon Kim,et al. Automatic generation of XForms code using DTD , 2005, Fourth Annual ACIS International Conference on Computer and Information Science (ICIS'05).
[26] Hanchuan Peng,et al. Document Image Recognition Based on Template Matching of Component Block Projections , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[27] Boris Chidlovskii. Schema Extraction from XML: A Grammatical Inference Approach , 2001, KRDB.
[28] Jan-Ming Ho,et al. BibPro: A Citation Parser Based on Sequence Alignment , 2012, IEEE Trans. Knowl. Data Eng..
[29] Daniela Florescu. Managing Semi-Structured Data , 2005, ACM Queue.
[30] Aristides Gionis,et al. XTRACT: a system for extracting document type descriptors from XML documents , 2000, SIGMOD 2000.
[31] Somesh Jha,et al. Automatic Generation of Remediation Procedures for Malware Infections , 2010, USENIX Security Symposium.
[32] Stamatis Vassiliadis,et al. Regular Expression Matching in Reconfigurable Hardware , 2008, J. Signal Process. Syst..
[33] Lawrence K. Saul,et al. Identifying suspicious URLs: an application of large-scale online learning , 2009, ICML '09.
[34] Giovanni Vigna,et al. Prophiler: a fast filter for the large-scale detection of malicious web pages , 2011, WWW.
[35] Felix Naumann,et al. XStruct: Efficient Schema Extraction from Multiple and Large XML Documents , 2006, 22nd International Conference on Data Engineering Workshops (ICDEW'06).
[36] Michalis Faloutsos,et al. PhishDef: URL names say it all , 2010, 2011 Proceedings IEEE INFOCOM.
[37] Bertin Klein,et al. Results of a Study on Invoice-Reading Systems in Germany , 2004, Document Analysis Systems.
[38] Tamir Hassan. User-Guided Wrapping of PDF Documents Using Graph Matching Techniques , 2009, 2009 10th International Conference on Document Analysis and Recognition.
[39] Francesca Cesarini,et al. Analysis and understanding of multi-class invoices , 2003, Document Analysis and Recognition.
[40] Huajun Huang,et al. A SVM-based Technique to Detect Phishing URLs , 2012 .
[41] Eric Medvet,et al. A probabilistic approach to printed document understanding , 2011, International Journal on Document Analysis and Recognition (IJDAR).
[42] Eric Medvet,et al. Semisupervised Wrapper Choice and Generation for Print-Oriented Documents , 2014, IEEE Transactions on Knowledge and Data Engineering.
[43] Masakazu Suzuki,et al. Syntactic Detection and Correction of Misrecognitions in Mathematical OCR , 2009, 2009 10th International Conference on Document Analysis and Recognition.
[44] Thamar Solorio,et al. Lexical feature based phishing URL detection using online learning , 2010, AISec '10.
[45] Guofei Gu,et al. WebPatrol: automated collection and replay of web-based malware scenarios , 2011, ASIACCS '11.
[46] Frank Neven,et al. Learning deterministic regular expressions for the inference of schemas from XML data , 2010, ACM Trans. Web.
[47] Ken Thompson,et al. Programming Techniques: Regular expression search algorithm , 1968, Commun. ACM.
[48] Yuan An,et al. Understanding deep web search interfaces: a survey , 2010, SGMD.
[49] Justin Tung Ma,et al. Learning to detect malicious URLs , 2011, TIST.
[50] Ee-Peng Lim,et al. DTD-Miner: a tool for mining DTD from XML documents , 2000, Proceedings Second International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems. WECWIS 2000.
[51] William W. Cohen,et al. Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text , 2005, HLT.
[52] Eric Medvet,et al. A look at hidden web pages in Italian public administrations , 2012, 2012 Fourth International Conference on Computational Aspects of Social Networks (CASoN).
[53] Valter Crescenzi,et al. Wrapper Generation for Overlapping Web Sources , 2011, 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology.
[54] Herbert Shiu,et al. Recovering data semantics from XML documents into DTD graph with SAX , 2006 .
[55] Eric Medvet,et al. A Framework for Large-Scale Detection of Web Site Defacements , 2010, TOIT.
[56] Ahmet Cetinkaya. Regular expression generation through grammatical evolution , 2007, GECCO '07.
[57] Ramana Rao Kompella,et al. PhishNet: Predictive Blacklisting to Detect Phishing Attacks , 2010, 2010 Proceedings IEEE INFOCOM.
[58] Hector Garcia-Molina,et al. Web Spam Taxonomy , 2005, AIRWeb.
[59] Boaz Ophir,et al. A Generic Form Processing Approach for Large Variant Templates , 2009, 2009 10th International Conference on Document Analysis and Recognition.
[60] Khaled Shaalan,et al. A Survey of Web Information Extraction Systems , 2006, IEEE Transactions on Knowledge and Data Engineering.
[61] Frederick E. Petry,et al. Regular language induction with genetic programming , 1994, Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence.
[62] Clement T. Yu,et al. Automatic integration of Web search interfaces with WISE-Integrator , 2004, The VLDB Journal.
[63] Sotiris Ioannidis,et al. Regular Expression Matching on Graphics Hardware for Intrusion Detection , 2009, RAID.