Feature Extraction and Duplicate Detection for Text Mining: A Survey
[1] Yiming Yang,et al. A Comparative Study on Feature Selection in Text Categorization , 1997, ICML.
[2] Mika Klemettinen,et al. Applying data mining techniques for descriptive phrase extraction in digital document collections , 1998, Proceedings IEEE International Forum on Research and Technology Advances in Digital Libraries -ADL'98-.
[3] Kyuseok Shim,et al. SPIRIT: Sequential Pattern Mining with Regular Expression Constraints , 1999, VLDB.
[4] Jian Pei,et al. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.
[5] Umeshwar Dayal,et al. FreeSpan: frequent pattern-projected sequential pattern mining , 2000, KDD '00.
[6] Fabrizio Sebastiani,et al. Machine learning in automated text categorization , 2001, CSUR.
[7] Xifeng Yan,et al. CloSpan: Mining Closed Sequential Patterns in Large Datasets , 2003, SDM.
[8] Dmitriy Fradkin,et al. Experiments with random projections for machine learning , 2003, KDD '03.
[9] Mohammed J. Zaki,et al. SPADE: An Efficient Algorithm for Mining Frequent Sequences , 2004, Machine Learning.
[10] K.R. Venugopal,et al. Generic Feature Extraction for Classification using Fuzzy C-Means Clustering , 2005, 2005 3rd International Conference on Intelligent Sensing and Information Processing.
[11] Nikhil R. Pal,et al. Genetic programming for simultaneous feature selection and classifier design , 2006, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[12] Felix Naumann,et al. Data Fusion in Three Steps: Resolving Schema, Tuple, and Value Inconsistencies , 2006, IEEE Data Eng. Bull..
[13] Eduardo Gasca,et al. Eliminating redundancy and irrelevance using a new MLP-based feature selection method , 2006, Pattern Recognit..
[14] Sreeram Ramakrishnan,et al. A hybrid approach for feature subset selection using neural networks and ant colony optimization , 2007, Expert Syst. Appl..
[15] Ahmed K. Elmagarmid,et al. Duplicate Record Detection: A Survey , 2007, IEEE Transactions on Knowledge and Data Engineering.
[16] Christopher C. Yang,et al. Mining related queries from Web search engine query logs using an improved association rule mining model , 2007, J. Assoc. Inf. Sci. Technol..
[17] Stephen Lin,et al. Graph Embedding and Extensions: A General Framework for Dimensionality Reduction , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[18] Ram Akella,et al. Active relevance feedback for difficult queries , 2008, CIKM '08.
[19] Daisy Zhe Wang,et al. WebTables: exploring the power of tables on the web , 2008, Proc. VLDB Endow..
[20] Daniel Sánchez,et al. Text Knowledge Mining: An Alternative to Text Data Mining , 2008, 2008 IEEE International Conference on Data Mining Workshops.
[21] Stephen E. Robertson,et al. Selecting good expansion terms for pseudo-relevance feedback , 2008, SIGIR '08.
[22] Renée J. Miller,et al. Framework for Evaluating Clustering Algorithms in Duplicate Detection , 2009, Proc. VLDB Endow..
[23] K. G. Srinivasa,et al. Soft Computing for Data Mining Applications , 2009, Studies in Computational Intelligence.
[24] Nasser Ghasem-Aghaee,et al. Text feature selection using ant colony optimization , 2009, Expert Syst. Appl..
[25] Renée J. Miller,et al. Creating probabilistic databases from duplicated data , 2009, The VLDB Journal.
[26] Felix Naumann,et al. Data fusion , 2009, CSUR.
[27] Hiroshi Ogura,et al. Feature selection with a measure of deviations from Poisson in text categorization , 2009, Expert Syst. Appl..
[28] S. Handschuh,et al. Visual abstraction and ordering in faceted browsing of text collections , 2010 .
[29] Antoon Bronselaer,et al. Aspects of object merging , 2010, 2010 Annual Meeting of the North American Fuzzy Information Processing Society.
[30] Gurpreet Singh Lehal,et al. A Survey of Text Summarization Extractive Techniques , 2010 .
[31] Deng Cai,et al. Unsupervised feature selection for multi-cluster data , 2010, KDD.
[32] Xindong Wu,et al. Keyphrase extraction based on semantic relatedness , 2010, 9th IEEE International Conference on Cognitive Informatics (ICCI'10).
[33] Sandhya Joshi,et al. Classification of Alzheimer's Disease and Parkinson's Disease by Using Machine Learning and Neural Network Methods , 2010, 2010 Second International Conference on Machine Learning and Computing.
[34] Yue Xu,et al. Selected new training documents to update user profile , 2010, CIKM.
[35] Sougata Mukherjea,et al. Faceted search and browsing of audio content on spoken web , 2010, CIKM.
[36] H. S. Dhami,et al. Text Summarization for Information Retrieval using Pattern Recognition Techniques , 2011 .
[37] Zi Huang,et al. ℓ2,1-Norm Regularized Discriminative Feature Selection for Unsupervised Learning , 2011, IJCAI.
[38] Sanjay Chawla,et al. Robust Record Linkage Blocking Using Suffix Arrays and Bloom Filters , 2011, TKDD.
[39] Xiaojun Wu,et al. Graph Regularized Nonnegative Matrix Factorization for Data Representation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[40] L. M. Patnaik,et al. Classification of email using BeaKS: Behavior and keyword stemming , 2011, TENCON 2011 - 2011 IEEE Region 10 Conference.
[41] Marianne Winslett,et al. Using structural information in XML keyword search effectively , 2011, TODS.
[42] J. Jebamalar Tamilselvi,et al. Handling Duplicate Data in Data Warehouse for Data Mining , 2011 .
[43] Ryen W. White,et al. Modeling and analysis of cross-session search tasks , 2011, SIGIR.
[44] Panayiotis Tsaparas,et al. Facet discovery for structured web search: a query-log mining approach , 2011, SIGMOD '11.
[45] Yuefeng Li,et al. Effective Pattern Discovery for Text Mining , 2012, IEEE Transactions on Knowledge and Data Engineering.
[46] Nouman Azam,et al. Comparison of term frequency and document frequency based feature selection metrics in text categorization , 2012, Expert Syst. Appl..
[47] Felix Naumann,et al. Adaptive Windows for Duplicate Detection , 2012, 2012 IEEE 28th International Conference on Data Engineering.
[48] Özgür Ulusoy,et al. Static index pruning in web search engines: Combining term and document popularities with query views , 2012, TOIS.
[49] Yi Chen,et al. Differentiating search results on structured data , 2012, TODS.
[50] Huan Liu,et al. Unsupervised feature selection for linked social media data , 2012, KDD.
[51] Monika Henzinger,et al. On Multiple Keyword Sponsored Search Auctions with Budgets , 2012, ICALP.
[52] Axel Schulz,et al. I See a Car Crash: Real-Time Detection of Small Scale Incidents in Microblogs , 2013, ESWC.
[53] Davide Martinenghi,et al. Top-k diversity queries over bounded regions , 2013, TODS.
[54] A. Salinger,et al. Efficient Fuzzy Search in Large Text Collections , 2013 .
[55] Fabrizio Silvestri,et al. Discovering tasks from search engine query logs , 2013, TOIS.
[56] Lei Wang,et al. On Similarity Preserving Feature Selection , 2013, IEEE Transactions on Knowledge and Data Engineering.
[57] Gonzalo Navarro,et al. Spaces, Trees, and Colors , 2013, ACM Comput. Surv..
[58] Wei Chu,et al. Learning to extract cross-session search tasks , 2013, WWW.
[59] Sherif Sakr,et al. The family of mapreduce and large-scale data processing systems , 2013, CSUR.
[60] Xiao Qin,et al. Interrelation analysis of celestial spectra data using constrained frequent pattern trees , 2013, Knowl. Based Syst..
[61] Roque Marín,et al. ClaSP: An Efficient Algorithm for Mining Frequent Closed Sequences , 2013, PAKDD.
[62] Hector Garcia-Molina,et al. Pay-As-You-Go Entity Resolution , 2013, IEEE Transactions on Knowledge and Data Engineering.
[63] Beng Chin Ooi,et al. Distributed data management using MapReduce , 2014, CSUR.
[64] James Allan,et al. Extending Faceted Search to the General Web , 2014, CIKM.
[65] Carlo Zaniolo,et al. Harvesting Domain Specific Ontologies from Text , 2014, 2014 IEEE International Conference on Semantic Computing.
[66] Swapan K. Parui,et al. Incremental blind feedback , 2014, ACM Trans. Asian Lang. Inf. Process..
[67] Jing Zhang,et al. An Efficient Algorithm of Frequent Itemsets Mining Based on MapReduce , 2014 .
[68] Yogesh R. Shepal. A Fast Clustering-Based Feature Subset Selection Algorithm for High Dimensional Data , 2014 .
[69] Luis Miguel Bergasa,et al. Text Detection and Recognition on Traffic Panels From Street-Level Imagery Using Visual Appearance , 2014, IEEE Transactions on Intelligent Transportation Systems.
[70] Dong Xu,et al. Semi-Supervised Heterogeneous Fusion for Multimedia Data Co-Clustering , 2014, IEEE Transactions on Knowledge and Data Engineering.
[71] Dan Suciu,et al. Query-Based Data Pricing , 2015, J. ACM.
[72] Soo-Hyung Kim,et al. Automatic extraction of text regions from document images by multilevel thresholding and k-means clustering , 2015, 2015 IEEE/ACIS 14th International Conference on Computer and Information Science (ICIS).
[73] Nidhi Tiwari,et al. Classification Framework of MapReduce Scheduling Algorithms , 2015, ACM Comput. Surv..
[74] Mudhakar Srivatsa,et al. Fine-Grained Knowledge Sharing in Collaborative Environments , 2015, IEEE Transactions on Knowledge and Data Engineering.
[75] Felix Naumann,et al. Progressive Duplicate Detection , 2015, IEEE Transactions on Knowledge and Data Engineering.
[76] R. B. V. Subramanyam,et al. Mining Interesting Infrequent Itemsets from Very Large Data based on MapReduce Framework , 2015 .
[77] Farooque Azam,et al. Innovative Windows for Duplicate Detection , 2015 .
[78] Vijay Kumar Verma,et al. Text mining and information professionals: Role, issues and challenges , 2015, 2015 4th International Symposium on Emerging Trends and Technologies in Libraries and Information Services.
[79] Xiaojun Wan,et al. PPSGen: Learning-Based Presentation Slides Generation for Academic Papers , 2015, IEEE Transactions on Knowledge and Data Engineering.
[80] Panayiotis Tsaparas,et al. Review Selection Using Micro-Reviews , 2015, IEEE Transactions on Knowledge and Data Engineering.
[81] S. Sitharama Iyengar,et al. Query Click and Text Similarity Graph for Query Suggestions , 2015, MLDM.
[82] Kun Zhou,et al. Exploring Topical Lead-Lag across Corpora , 2015, IEEE Transactions on Knowledge and Data Engineering.
[83] Feiping Nie,et al. Feature Selection via Global Redundancy Minimization , 2015, IEEE Transactions on Knowledge and Data Engineering.
[84] Suh-Yin Lee,et al. Mining Temporal Patterns in Time Interval-Based Data , 2015, IEEE Transactions on Knowledge and Data Engineering.
[85] Beng Chin Ooi,et al. Efficient Processing of Spatial Group Keyword Queries , 2015, TODS.
[86] Antoon Bronselaer,et al. Propagation of Data Fusion , 2015, IEEE Transactions on Knowledge and Data Engineering.
[87] A. Akilan,et al. Text mining: Challenges and future directions , 2015, 2015 2nd International Conference on Electronics and Communication Systems (ICECS).
[88] Eleonora D'Andrea,et al. Real-Time Detection of Traffic From Twitter Stream Analysis , 2015, IEEE Transactions on Intelligent Transportation Systems.
[89] Purushothama Raju,et al. Mining Closed Sequential Patterns in Large Sequence Databases , 2015 .
[90] Di Jiang,et al. Cross-Lingual Topic Discovery From Multilingual Search Engine Query Log , 2016, ACM Trans. Inf. Syst..
[91] S. S. Iyengar,et al. User Feedback Session with Clicked and Unclicked Documents for Related Search Recommendation , 2016 .
[92] Chunxiao Jiang,et al. Microblog Dimensionality Reduction—A Deep Learning Approach , 2016, IEEE Transactions on Knowledge and Data Engineering.
[93] E. Medvet,et al. Inference of Regular Expressions for Text Extraction from Examples , 2016, IEEE Transactions on Knowledge and Data Engineering.
[94] Donald E. Brown,et al. Text Mining the Contributors to Rail Accidents , 2016, IEEE Transactions on Intelligent Transportation Systems.
[95] Yueting Zhuang,et al. Graph Regularized Feature Selection with Data Reconstruction , 2016, IEEE Transactions on Knowledge and Data Engineering.
[96] Laliteshwari,et al. Relevance Feature Discovery for Text Mining , 2016 .
[97] Dieter Pfoser,et al. Efficient Processing of Relevant Nearest-Neighbor Queries , 2016, TSAS.
[98] Robert G. Capra,et al. The Effects of Aggregated Search Coherence on Search Behavior , 2016, ACM Trans. Inf. Syst..
[99] Xuemin Lin,et al. Inverted Linear Quadtree: Efficient Top K Spatial Keyword Search , 2016, IEEE Transactions on Knowledge and Data Engineering.
[100] Jifu Zhang,et al. FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce , 2016, IEEE Transactions on Systems, Man, and Cybernetics: Systems.