Empirical Evaluation of Map Reduce Based Hybrid Approach for Problem of Imbalanced Classification in Big Data

[1]  María José del Jesús,et al.  A View on Fuzzy Systems for Big Data: Progress and Opportunities , 2016, Int. J. Comput. Intell. Syst..

[2]  Young-Im Cho,et al.  Integrating of Data Using the Hadoop and R , 2015, FNC/MobiSPC.

[3]  Ying Ju,et al.  Finding the Best Classification Threshold in Imbalanced Classification , 2016, Big Data Res..

[4]  Bartosz Krawczyk,et al.  Learning from imbalanced data: open challenges and future directions , 2016, Progress in Artificial Intelligence.

[5]  Nathalie Japkowicz,et al.  Boosting support vector machines for imbalanced data sets , 2008, Knowledge and Information Systems.

[6]  Yaoliang Yu,et al.  Petuum: A New Platform for Distributed Machine Learning on Big Data , 2015, IEEE Trans. Big Data.

[7]  Zhang Chunkai,et al.  A new sampling approach for classification of imbalanced data sets with high density , 2014, 2014 International Conference on Big Data and Smart Computing (BIGCOMP).

[8]  George K. Karagiannidis,et al.  Efficient Machine Learning for Big Data: A Review , 2015, Big Data Res..

[9]  Ching-Hsien Hsu,et al.  Locality and loading aware virtual machine mapping techniques for optimizing communications in MapReduce applications , 2015, Future Gener. Comput. Syst..

[10]  Nilanjan Dey,et al.  A MapReduce approach to diminish imbalance parameters for big deoxyribonucleic acid dataset , 2016, Comput. Methods Programs Biomed..

[11]  Francisco Herrera,et al.  Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data , 2015, Fuzzy Sets Syst..

[12]  Jian Pei,et al.  Classification: Basic Concepts , 2012 .

[13]  Athanasios V. Vasilakos,et al.  Big data analytics: a survey , 2015, Journal of Big Data.

[14]  Yang Wang,et al.  An Effective Integrated Method for Learning Big Imbalanced Data , 2014, 2014 IEEE International Congress on Big Data.

[15]  Francisco Herrera,et al.  Ordering-based pruning for improving the performance of ensembles of classifiers in the framework of imbalanced datasets , 2016, Inf. Sci..

[16]  Yonggang Wen,et al.  Toward Scalable Systems for Big Data Analytics: A Technology Tutorial , 2014, IEEE Access.

[17]  Francisco Herrera,et al.  Fuzzy rough classifiers for class imbalanced multi-instance data , 2016, Pattern Recognit..

[18]  Dorit S. Hochbaum,et al.  Sparse computation for large-scale data mining , 2014, 2014 IEEE International Conference on Big Data (Big Data).

[19]  Durgaprasad Gangodkar,et al.  Hadoop, MapReduce and HDFS: A Developers Perspective , 2015 .

[20]  Francesco Marcelloni,et al.  A MapReduce solution for associative classification of big data , 2016, Inf. Sci..

[21]  Stephen Kwek,et al.  Applying Support Vector Machines to Imbalanced Datasets , 2004, ECML.

[22]  Yanqing Zhang,et al.  SVMs Modeling for Highly Imbalanced Classification , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[23]  Vasile Palade,et al.  Class Imbalance Learning Methods for Support Vector Machines , 2013 .

[24]  Francisco Herrera,et al.  On the use of MapReduce for imbalanced big data using Random Forest , 2014, Inf. Sci..

[25]  Rajiv Pandey,et al.  Quantitative Evaluation of Big Data Categorical Variables through R , 2015 .

[26]  Taghi M. Khoshgoftaar,et al.  A survey of open source tools for machine learning with big data in the Hadoop ecosystem , 2015, Journal of Big Data.

[27]  James A. Rodger,et al.  Discovery of medical Big Data analytics: Improving the prediction of traumatic brain injury survival rates by data mining Patient Informatics Processing Software Hybrid Hadoop Hive , 2015 .

[28]  Ching-Hsien Hsu,et al.  An Adaptive and Memory Efficient Sampling Mechanism for Partitioning in MapReduce , 2015, International Journal of Parallel Programming.

[29]  Francisco Herrera,et al.  A Compact Evolutionary Interval-Valued Fuzzy Rule-Based Classification System for the Modeling and Prediction of Real-World Financial Applications With Imbalanced Data , 2015, IEEE Transactions on Fuzzy Systems.

[30]  Francisco Herrera,et al.  MRPR: A MapReduce solution for prototype reduction in big data classification , 2015, Neurocomputing.

[31]  Khaled Belkadi,et al.  Parallel Distributed Patterns Mining Using Hadoop MapReduce Framework , 2017, Int. J. Grid High Perform. Comput..

[32]  Francisco Herrera,et al.  kNN-IS: An Iterative Spark-based design of the k-Nearest Neighbors classifier for big data , 2017, Knowl. Based Syst..

[33]  S. D. Madhu Kumar,et al.  Improving execution speed of incremental runs of MapReduce using provenance , 2017, Int. J. Big Data Intell..

[34]  Francisco Herrera,et al.  An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics , 2013, Inf. Sci..

[35]  Yu-Lin He,et al.  Learning ELM-Tree from big data based on uncertainty reduction , 2015, Fuzzy Sets Syst..

[36]  Francisco Herrera,et al.  Evolutionary undersampling for extremely imbalanced big data classification under apache spark , 2016, 2016 IEEE Congress on Evolutionary Computation (CEC).

[37]  Yang Liu,et al.  Short-Term Load Forecasting Based on Big Data Technologies , 2014, CIT 2014.

[38]  Simon Fong,et al.  Incrementally optimized decision tree for noisy big data , 2012, BigMine '12.

[39]  Francisco Herrera,et al.  A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).

[40]  Xindong Wu,et al.  Data mining with big data , 2014, IEEE Transactions on Knowledge and Data Engineering.

[41]  Francisco Herrera,et al.  ROSEFW-RF: The winner algorithm for the ECBDL'14 big data competition: An extremely imbalanced big data bioinformatics problem , 2015, Knowl. Based Syst..

[42]  Fuzhen Zhuang,et al.  Parallel sampling from big data with uncertainty distribution , 2015, Fuzzy Sets Syst..