On the use of MapReduce for imbalanced big data using Random Forest
Francisco Herrera | José Manuel Benítez | Victoria López | Sara del Río
[1] Sung-Kwun Oh,et al. The design of polynomial function-based neural network predictors for detection of software defects , 2013, Inf. Sci..
[2] Tom White,et al. Hadoop: The Definitive Guide , 2009 .
[3] María José del Jesús,et al. A hierarchical genetic fuzzy system based on genetic programming for addressing classification with highly imbalanced and borderline data-sets , 2013, Knowl. Based Syst..
[4] Francisco Herrera,et al. A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[5] Anil K. Jain,et al. Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitioners , 1991, IEEE Trans. Pattern Anal. Mach. Intell..
[6] Alekh Jindal,et al. Hadoop++ , 2010 .
[7] John Langford,et al. Cost-sensitive learning by cost-proportionate example weighting , 2003, Third IEEE International Conference on Data Mining.
[8] Mikel Galar,et al. Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches , 2013, Knowl. Based Syst..
[9] R. Barandela,et al. Strategies for learning in class imbalance problems , 2003, Pattern Recognit..
[10] Helmut Krcmar,et al. Big Data , 2014, Wirtschaftsinf..
[11] Zheng Shao,et al. Data warehousing and analytics infrastructure at facebook , 2010, SIGMOD Conference.
[12] Taghi M. Khoshgoftaar,et al. An empirical study of the classification performance of learners on imbalanced and noisy software quality data , 2014, Inf. Sci..
[13] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[14] Francisco Herrera,et al. A unifying view on dataset shift in classification , 2012, Pattern Recognit..
[15] Xue-wen Chen,et al. Combating the Small Sample Class Imbalance Problem Using Feature Selection , 2010, IEEE Transactions on Knowledge and Data Engineering.
[16] Leo Breiman,et al. Classification and Regression Trees , 1984 .
[17] Jorma Laurikkala,et al. Improving Identification of Difficult Small Classes by Balancing Class Distribution , 2001, AIME.
[18] Antanas Verikas,et al. Mining data with random forests: A survey and results of new tests , 2011, Pattern Recognit..
[19] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[20] Vasile Palade,et al. Adjusted Geometric-mean: a Novel Performance Measure for Imbalanced Bioinformatics Datasets Learning , 2012, J. Bioinform. Comput. Biol..
[21] Hui Han,et al. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning , 2005, ICIC.
[22] I. Tomek,et al. Two Modifications of CNN , 1976 .
[23] Mukesh K. Mohania,et al. Cloud Computing and Big Data Analytics: What Is New from Databases Perspective? , 2012, BDA.
[24] David A. Cieslak,et al. Automatically countering imbalance and its empirical relationship to cost , 2008, Data Mining and Knowledge Discovery.
[25] Ester Bernadó-Mansilla,et al. Evolutionary rule-based systems for imbalanced data sets , 2008, Soft Comput..
[26] Jing Zhao,et al. ACOSampling: An ant colony optimization-based undersampling method for classifying imbalanced DNA microarray data , 2013, Neurocomputing.
[27] Ronen Feldman,et al. The Data Mining and Knowledge Discovery Handbook , 2005 .
[28] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..
[29] Michael Minelli,et al. Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses , 2012 .
[30] Szymon Wilk,et al. Learning from Imbalanced Data in Presence of Noisy and Borderline Examples , 2010, RSCTC.
[31] Gustavo E. A. P. A. Batista,et al. A study of the behavior of several methods for balancing machine learning training data , 2004, SKDD.
[32] Gary M. Weiss. The Impact of Small Disjuncts on Classifier Learning , 2010, Data Mining.
[33] Charles Elkan,et al. The Foundations of Cost-Sensitive Learning , 2001, IJCAI.
[34] Yang Wang,et al. Cost-sensitive boosting for classification of imbalanced data , 2007, Pattern Recognit..
[35] José Salvador Sánchez,et al. On the k-NN performance in a challenging scenario of imbalance and overlapping , 2008, Pattern Analysis and Applications.
[36] Andrew K. C. Wong,et al. Classification of Imbalanced Data: a Review , 2009, Int. J. Pattern Recognit. Artif. Intell..
[37] Misha Denil,et al. Overlap versus Imbalance , 2010, Canadian Conference on AI.
[38] Francisco Herrera,et al. Analysis of preprocessing vs. cost-sensitive learning for imbalanced classification. Open problems on intrinsic data characteristics , 2012, Expert Syst. Appl..
[39] Bianca Zadrozny,et al. Learning and making decisions when costs and probabilities are both unknown , 2001, KDD '01.
[40] Francisco Herrera,et al. Evolutionary-based selection of generalized instances for imbalanced classification , 2012, Knowl. Based Syst..
[42] Alex Alves Freitas,et al. Coping with Unbalanced Class Data Sets in Oral Absorption Models , 2013, J. Chem. Inf. Model..
[43] Gary M. Weiss. Mining with Rare Cases , 2010, Data Mining and Knowledge Discovery Handbook.
[44] Chumphol Bunkhumpornpat,et al. Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem , 2009, PAKDD.
[45] Haibo He,et al. Learning from Imbalanced Data , 2009, IEEE Transactions on Knowledge and Data Engineering.
[46] Pedro M. Domingos. MetaCost: a general method for making classifiers cost-sensitive , 1999, KDD '99.
[47] Yi-Ping Phoebe Chen,et al. Computational intelligence for heart disease diagnosis: A medical knowledge driven approach , 2013, Expert Syst. Appl..
[48] Donald Miner,et al. MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems , 2012 .
[49] Nathalie Japkowicz,et al. The class imbalance problem: A systematic study , 2002, Intell. Data Anal..
[50] Ligang Zhou,et al. Performance of corporate bankruptcy prediction models on imbalanced dataset: The effect of sampling methods , 2013, Knowl. Based Syst..
[51] Sean Owen,et al. Mahout in Action , 2011 .
[52] Francisco Herrera,et al. An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics , 2013, Inf. Sci..
[53] B. C. Brookes,et al. Information Sciences , 2020, Cognitive Skills You Need for the 21st Century.
[54] Francisco Herrera,et al. Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data , 2015, Fuzzy Sets Syst..
[55] Stan Matwin,et al. Addressing the Curse of Imbalanced Training Sets: One-Sided Selection , 1997, ICML.
[56] Chao Chen,et al. Using Random Forest to Learn Imbalanced Data , 2004 .