Classifying Short Unstructured Data Using the Apache Spark Platform
Edward A. Fox | Saurabh Chakravarty | Eric R. Williamson | Denilson Alves Pereira | Eduardo P. S. Castro
[1] Lekha R. Nair,et al. Streaming Twitter Data Analysis Using Spark for Effective Job Search , 2015 .
[2] Tom White,et al. Hadoop: The Definitive Guide , 2009 .
[3] Patrick Wendell,et al. Learning Spark: Lightning-Fast Big Data Analytics , 2015 .
[4] Jennifer Widom,et al. Swoosh: a generic approach to entity resolution , 2008, The VLDB Journal.
[5] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[6] Henryk Maciejewski,et al. Distributed Classification of Text Documents on Apache Spark Platform , 2016, ICAISC.
[7] Nan Sun,et al. Exploiting internal and external semantics for the clustering of short texts using world knowledge , 2009, CIKM.
[8] W. Greene,et al. Econometric Analysis , 2009 .
[9] Shengrui Wang,et al. Automated feature weighting in naive bayes for high-dimensional data classification , 2012, CIKM.
[10] Yunming Ye,et al. Classifying Very High-Dimensional Data with Random Forests Built from Small Subspaces , 2012, Int. J. Data Warehous. Min..
[11] Lars George,et al. HBase: The Definitive Guide , 2011 .
[12] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[13] Geoffrey I. Webb,et al. Not So Naive Bayes: Aggregating One-Dependence Estimators , 2005, Machine Learning.
[14] B. Carpenter. Lazy Sparse Stochastic Gradient Descent for Regularized Multinomial Logistic Regression , 2008 .
[15] Ahmed Ali Abdalla Esmin,et al. Disambiguating publication venue titles using association rules , 2014, IEEE/ACM Joint Conference on Digital Libraries.
[16] Ameet Talwalkar,et al. MLlib: Machine Learning in Apache Spark , 2015, J. Mach. Learn. Res..
[17] Hakan Ferhatosmanoglu,et al. Short text classification in twitter to improve information filtering , 2010, SIGIR.
[18] Ramakrishnan Srikant,et al. Fast algorithms for mining association rules , 1998, VLDB 1998.
[19] Fabrizio Sebastiani,et al. Machine learning in automated text categorization , 2001, CSUR.
[20] Heng Ji,et al. Harnessing web page directories for large-scale classification of tweets , 2013, WWW '13 Companion.
[21] Ricardo Baeza-Yates,et al. Modern Information Retrieval - the concepts and technology behind search, Second edition , 2011 .
[22] Andreas Thor,et al. Evaluation of entity resolution approaches on real-world match problems , 2010, Proc. VLDB Endow..
[23] Michael McGill,et al. Introduction to Modern Information Retrieval , 1983 .
[24] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[25] Ramakrishnan Srikant,et al. Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.
[26] Ashwin Machanavajjhala,et al. Entity Resolution: Theory, Practice & Open Challenges , 2012, Proc. VLDB Endow..
[27] Denilson Alves Pereira,et al. An association rules based method for classifying product offers from e-shopping , 2017, Intell. Data Anal..
[28] Ahmed K. Elmagarmid,et al. Duplicate Record Detection: A Survey , 2007, IEEE Transactions on Knowledge and Data Engineering.
[29] David A. Freedman,et al. Statistical Models: Theory and Practice: References , 2005 .
[30] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.