kNN-IS: An Iterative Spark-based design of the k-Nearest Neighbors classifier for big data
Francisco Herrera | Sergio Ramírez-Gallego | Isaac Triguero | Jesús Maillo
[1] Helmut Krcmar,et al. Big Data , 2014, Wirtschaftsinf..
[2] อนิรุธ สืบสิงห์,et al. Data Mining Practical Machine Learning Tools and Techniques , 2014 .
[3] C. L. Philip Chen,et al. Data-intensive applications, challenges, techniques and technologies: A survey on Big Data , 2014, Inf. Sci..
[4] Francisco Herrera,et al. ROSEFW-RF: The winner algorithm for the ECBDL'14 big data competition: An extremely imbalanced big data bioinformatics problem , 2015, Knowl. Based Syst..
[5] Tom White,et al. Hadoop: The Definitive Guide , 2009 .
[6] Joseph M. Hellerstein,et al. GraphLab: A New Framework For Parallel Machine Learning , 2010, UAI.
[7] Justine Rochas,et al. Solutions for Processing K Nearest Neighbor Joins for Massive Data on MapReduce , 2015, 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[8] Beng Chin Ooi,et al. Efficient Processing of k Nearest Neighbor Joins using MapReduce , 2012, Proc. VLDB Endow..
[9] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[10] Ali Rahimi,et al. Similarity-based Classification: Concepts and Algorithms , 2009 .
[11] Patrick Wendell,et al. Learning Spark: Lightning-Fast Big Data Analytics , 2015 .
[12] Juha Heinanen,et al. OF DATA INTENSIVE APPLICATIONS , 1986 .
[13] M. Kubát. An Introduction to Machine Learning , 2017, Springer International Publishing.
[14] Mustapha Lebbah,et al. Micro-Batching Growing Neural Gas for Clustering Data Streams Using Spark Streaming , 2015, INNS Conference on Big Data.
[16] Xindong Wu,et al. The Top Ten Algorithms in Data Mining , 2009 .
[17] Maya R. Gupta,et al. Similarity-based Classification: Concepts and Algorithms , 2009, J. Mach. Learn. Res..
[18] Jimmy J. Lin. MapReduce is Good Enough? If All You Have is a Hammer, Throw Away Everything That's Not a Nail! , 2012, Big Data.
[19] Xiao Qin,et al. A parallel algorithm for mining constrained frequent patterns using MapReduce , 2017, Soft Comput..
[20] Miriam A. M. Capretz,et al. Challenges for MapReduce in Big Data , 2014, 2014 IEEE World Congress on Services.
[21] Sanjay Ghemawat,et al. The Google file system , 2003 .
[22] Francisco Herrera,et al. On the choice of the best imputation methods for missing values considering three groups of classification methods , 2012, Knowledge and Information Systems.
[23] Francisco Herrera,et al. A MapReduce-Based k-Nearest Neighbor Approach for Big Data Classification , 2015, TrustCom 2015.
[24] María José del Jesús,et al. Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks , 2014, WIREs Data Mining Knowl. Discov..
[25] G. Amdahl,et al. Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).
[26] Ian H. Witten,et al. Data mining - practical machine learning tools and techniques, Second Edition , 2005, The Morgan Kaufmann series in data management systems.
[27] Ian Witten,et al. Data Mining , 2000 .
[28] Kunle Olukotun,et al. Map-Reduce for Machine Learning on Multicore , 2006, NIPS.
[29] G. Priya,et al. EFFICIENT KNN CLASSIFICATION ALGORITHM FOR BIG DATA , 2017 .
[30] Wang-Chien Lee,et al. Distributed In-Memory Processing of All k Nearest Neighbor Queries , 2016 .
[31] Peter E. Hart,et al. Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.
[32] C. Lynch. Big data: How do your data grow? , 2008, Nature.
[33] Ashwin Srinivasan,et al. Data and task parallelism in ILP using MapReduce , 2011, Machine Learning.
[34] Joseph E. Gonzalez,et al. GraphLab: A New Parallel Framework for Machine Learning , 2010 .
[35] Christina Freytag,et al. Using Mpi Portable Parallel Programming With The Message Passing Interface , 2016 .
[36] Ho-Hyun Park,et al. Tagging and classifying facial images in cloud environments based on KNN using MapReduce , 2015 .
[37] Geoffrey C. Fox,et al. Investigation of Data Locality in MapReduce , 2012, 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012).
[38] Feifei Li,et al. Efficient parallel kNN joins for large data in MapReduce , 2012, EDBT '12.
[39] Francisco Herrera,et al. Self-labeled techniques for semi-supervised learning: taxonomy, software and empirical study , 2015, Knowledge and Information Systems.
[40] Francisco Herrera,et al. MRPR: A MapReduce solution for prototype reduction in big data classification , 2015, Neurocomputing.
[41] Evaggelia Pitoura,et al. Distributed In-Memory Processing of All k Nearest Neighbor Queries , 2016, IEEE Transactions on Knowledge and Data Engineering.
[42] Michael D. Ernst,et al. HaLoop , 2010, Proc. VLDB Endow..
[43] Michael J. Franklin,et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.
[44] Michael Minelli,et al. Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses , 2012 .
[45] Kilian Q. Weinberger,et al. Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.
[46] Shichao Zhang,et al. Efficient kNN classification algorithm for big data , 2016 .
[47] Juan José Rodríguez Diez,et al. Instance selection of linear complexity for big data , 2016, Knowl. Based Syst..
[48] William Gropp,et al. Using MPI: Portable Parallel Programming with the Message-Passing Interface , 1994 .
[49] Jesús Alcalá-Fdez,et al. KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework , 2011, J. Multiple Valued Log. Soft Comput..
[50] Hongjie Jia,et al. Study on density peaks clustering based on k-nearest neighbors and principal component analysis , 2016, Knowl. Based Syst..
[51] Sunil Arya,et al. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions , 1998, JACM.