Big Data: Tutorial and guidelines on information and process fusion for analytics algorithms with MapReduce

[1]  Andrew Musselman.  Apache Mahout , 2019, Encyclopedia of Big Data Technologies.

[2]  Fabian Hueske,et al.  Apache Flink , 2019, Encyclopedia of Big Data Technologies.

[3]  Verónica Bolón-Canedo,et al.  An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark , 2018, IEEE Transactions on Systems, Man, and Cybernetics: Systems.

[4]  Deepak Choudhary,et al.  Internet of things: A survey on enabling technologies, application and standardization , 2018 .

[5]  João Gama,et al.  Ensemble learning for data stream analysis: A survey , 2017, Inf. Fusion.

[6]  Shiliang Sun,et al.  A review of natural language processing techniques for opinion mining systems , 2017, Inf. Fusion.

[7]  Min Chen,et al.  Disease Prediction by Machine Learning Over Big Data From Healthcare Communities , 2017, IEEE Access.

[8]  Francisco Herrera,et al.  A comparison on scalability for batch big data processing on Apache Spark and Apache Flink , 2017 .

[9]  Francisco Herrera,et al.  An insight into imbalanced Big Data classification: outcomes and challenges , 2017 .

[10]  Verónica Bolón-Canedo,et al.  Fast‐mRMR: Fast Minimum Redundancy Maximum Relevance Algorithm for High‐Dimensional Big Data , 2017, Int. J. Intell. Syst..

[11]  Francisco Herrera,et al.  kNN-IS: An Iterative Spark-based design of the k-Nearest Neighbors classifier for big data , 2017, Knowl. Based Syst..

[12]  Min Chen.  Welcome to the New Interdisciplinary Journal Combining Big Data and Cognitive Computing , 2017, Big Data Cogn. Comput..

[13]  Min Chen,et al.  Big-Data Analytics for Cloud, IoT and Cognitive Computing , 2017 .

[14]  S. R,et al.  Data Mining with Big Data , 2017, 2017 11th International Conference on Intelligent Systems and Control (ISCO).

[15]  Hing Kai Chan,et al.  Recent Development in Big Data Analytics for Business Operations and Risk Management , 2017, IEEE Transactions on Cybernetics.

[16]  Verónica Bolón-Canedo,et al.  An Information Theoretic Feature Selection Framework for Big Data under Apache Spark , 2016, ArXiv.

[17]  Victor Chang,et al.  A review and future direction of agile, business intelligence, analytics and data science , 2016, Int. J. Inf. Manag..

[18]  María José del Jesús,et al.  A View on Fuzzy Systems for Big Data: Progress and Opportunities , 2016, Int. J. Comput. Intell. Syst..

[19]  Roger H. L. Chiang,et al.  Big Data Research in Information Systems: Toward an Inclusive Research Agenda , 2016, J. Assoc. Inf. Syst..

[20]  Jason J. Jung,et al.  Social big data: Recent achievements and new challenges , 2015, Information Fusion.

[21]  Ameet Talwalkar,et al.  MLlib: Machine Learning in Apache Spark , 2015, J. Mach. Learn. Res..

[22]  Tamanna Siddiqui,et al.  Big Data Analytics on the Cloud , 2016 .

[23]  Lior Rokach,et al.  Decision forest: Twenty years of research , 2016, Inf. Fusion.

[24]  Cheng Soon Ong,et al.  Multivariate spearman's ρ for aggregating ranks using copulas , 2016 .

[25]  Jorge A. Balazs,et al.  Opinion Mining and Information Fusion: A survey , 2016, Inf. Fusion.

[26]  Jing Zhang,et al.  Parallel information fusion method for microarray data analysis , 2015, 2015 IEEE International Conference on Big Data (Big Data).

[27]  Francisco Herrera,et al.  An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species , 2015, BioMed research international.

[28]  Mohsen Guizani,et al.  Internet of Things: A Survey on Enabling Technologies, Protocols, and Applications , 2015, IEEE Communications Surveys & Tutorials.

[29]  Murtaza Haider,et al.  Beyond the hype: Big data concepts, methods, and analytics , 2015, Int. J. Inf. Manag..

[30]  Patrick Wendell,et al.  Learning Spark: Lightning-Fast Big Data Analytics , 2015 .

[31]  Francisco Herrera,et al.  MRPR: A MapReduce solution for prototype reduction in big data classification , 2015, Neurocomputing.

[32]  Francisco Herrera,et al.  A MapReduce Approach to Address Big Data Classification Problems Based on the Fusion of Linguistic Fuzzy Rules , 2015, Int. J. Comput. Intell. Syst..

[33]  Gang Li,et al.  Big data related technologies, challenges and future prospects , 2015, J. Inf. Technol. Tour..

[34]  Francisco Herrera,et al.  Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data , 2015, Fuzzy Sets Syst..

[35]  Francisco Herrera,et al.  On the use of MapReduce for imbalanced big data using Random Forest , 2014, Inf. Sci..

[36]  Bin Wu,et al.  Parallelization of ontology construction and fusion based on MapReduce , 2014, 2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems.

[37]  Helmut Krcmar,et al.  Big Data , 2014, Wirtschaftsinf..

[38]  Thomas Hofmann,et al.  Communication-Efficient Distributed Dual Coordinate Ascent , 2014, NIPS.

[39]  María José del Jesús,et al.  Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks , 2014, WIREs Data Mining Knowl. Discov..

[40]  C. L. Philip Chen,et al.  Data-intensive applications, challenges, techniques and technologies: A survey on Big Data , 2014, Inf. Sci..

[41]  Yonggang Wen,et al.  Toward Scalable Systems for Big Data Analytics: A Technology Tutorial , 2014, IEEE Access.

[42]  Felix Naumann,et al.  The Stratosphere platform for big data analytics , 2014, The VLDB Journal.

[43]  P. Baldi,et al.  Searching for exotic particles in high-energy physics with deep learning , 2014, Nature Communications.

[44]  Babita Gupta,et al.  The Current State of Business Intelligence in Academia: The Arrival of Big Data , 2014, CAIS.

[45]  Laurence T. Yang,et al.  Data Mining for Internet of Things: A Survey , 2014, IEEE Communications Surveys & Tutorials.

[46]  Limsoon Wong,et al.  Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes , 2013, BMC Bioinformatics.

[47]  Gilles Louppe,et al.  Independent consultant , 2013 .

[48]  Joaquim Assunção,et al.  Distributed Stochastic Aware Random Forests -- Efficient Data Mining for Big Data , 2013, 2013 IEEE International Congress on Big Data.

[49]  R. Khan,et al.  Use of DAG in Distributed Parallel Computing , 2013 .

[50]  Michael Minelli,et al.  Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses , 2012 .

[51]  Veda C. Storey,et al.  Business Intelligence and Analytics: From Big Data to Big Impact , 2012, MIS Q..

[52]  Indranil Palit,et al.  Scalable and Parallel Boosting with MapReduce , 2012, IEEE Transactions on Knowledge and Data Engineering.

[53]  Astrid Rheinländer,et al.  Opening the Black Boxes in Data Flow Optimization , 2012, Proc. VLDB Endow..

[54]  Volker Markl,et al.  Spinning Fast Iterative Data Flows , 2012, Proc. VLDB Endow..

[55]  Michael J. Franklin,et al.  Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.

[56]  Yon Dohn Chung,et al.  Parallel data processing with MapReduce: a survey , 2012, SIGMOD Rec..

[57]  Francisco Herrera,et al.  A Taxonomy and Experimental Study on Prototype Generation for Nearest Neighbor Classification , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[58]  Sean Owen,et al.  Mahout in Action , 2011 .

[59]  Chuck Lam,et al.  Hadoop in Action , 2010 .

[60]  Scott Shenker,et al.  Spark: Cluster Computing with Working Sets , 2010, HotCloud.

[61]  Hairong Kuang,et al.  The Hadoop Distributed File System , 2010, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST).

[62]  Sanjay Ghemawat,et al.  MapReduce: a flexible data processing tool , 2010, CACM.

[63]  Klaus Nordhausen,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition by Trevor Hastie, Robert Tibshirani, Jerome Friedman , 2009 .

[64]  Qing He,et al.  Parallel K-Means Clustering Based on MapReduce , 2009, CloudCom.

[65]  Roberto J. Bayardo,et al.  PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce , 2009, Proc. VLDB Endow..

[66]  Tom White,et al.  Hadoop: The Definitive Guide , 2009 .

[67]  Sanjay Ghemawat,et al.  MapReduce: simplified data processing on large clusters , 2008, CACM.

[68]  R. Polikar,et al.  Ensemble based systems in decision making , 2006, IEEE Circuits and Systems Magazine.

[69]  Ludmila I. Kuncheva.  Diversity in multiple classifier systems , 2005, Inf. Fusion.

[70]  D. Ruppert.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .

[71]  Robert Tibshirani,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.

[72]  Robert E. Schapire,et al.  A Brief Introduction to Boosting , 1999, IJCAI.

[73]  Yoram Singer,et al.  Improved Boosting Algorithms Using Confidence-rated Predictions , 1998, COLT '98.

[74]  Leslie G. Valiant,et al.  A bridging model for parallel computation , 1990, CACM.

[75]  Jorge Nocedal,et al.  On the limited memory BFGS method for large scale optimization , 1989, Math. Program..

[76]  Juha Heinanen,et al.  OF DATA INTENSIVE APPLICATIONS , 1986 .