Severely imbalanced Big Data challenges: investigating data sampling approaches
Taghi M. Khoshgoftaar | Richard A. Bauder | Tawfiq Hasanin | Joffrey L. Leevy
[1] Francisco Herrera, et al. On the use of MapReduce for imbalanced big data using Random Forest, 2014, Inf. Sci.
[2] Taghi M. Khoshgoftaar, et al. Detecting Slow HTTP POST DoS Attacks Using Netflow Features, 2019, FLAIRS.
[3] Angappa Gunasekaran, et al. Big Data in Healthcare Management: A Review of Literature, 2018.
[4] Alois Knoll, et al. Gradient boosting machines, a tutorial, 2013, Front. Neurorobot.
[5] Siti Mariyam Shamsuddin, et al. Classification with class imbalance problem: A review, 2015, SOCO 2015.
[6] Hairong Kuang, et al. The Hadoop Distributed File System, 2010, 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST).
[7] Fernando Nogueira, et al. Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning, 2016, J. Mach. Learn. Res.
[8] Julian D Olden, et al. Machine Learning Methods Without Tears: A Primer for Ecologists, 2008, The Quarterly Review of Biology.
[9] Taghi M. Khoshgoftaar, et al. Comparison of Data Sampling Approaches for Imbalanced Bioinformatics Data, 2014, FLAIRS.
[10] Seong-hun Park, et al. Highway traffic accident prediction using VDS big data analysis, 2016, The Journal of Supercomputing.
[11] Tom White, et al. Hadoop: The Definitive Guide, 2009.
[12] J. Alberto Espinosa, et al. Big Data: Issues and Challenges Moving Forward, 2013, 2013 46th Hawaii International Conference on System Sciences.
[13] Carlo Curino, et al. Apache Hadoop YARN: yet another resource negotiator, 2013, SoCC.
[14] Valeria Vitelli, et al. Probabilistic preference learning with the Mallows rank model, 2014, J. Mach. Learn. Res.
[15] Yu-hua Liu, et al. A DoS attack situation assessment method based on QoS, 2011, Proceedings of 2011 International Conference on Computer Science and Network Technology.
[16] J. Galindo, et al. Credit Risk Assessment Using Statistical and Machine Learning: Basic Methodology and Risk Modeling Applications, 2000.
[17] Chad Calvert, et al. Detection of Slowloris Attacks Using Netflow Traffic, 2018.
[18] S. Cessie, et al. Ridge Estimators in Logistic Regression, 1992.
[19] Nitesh V. Chawla, et al. SMOTE: Synthetic Minority Over-sampling Technique, 2002, J. Artif. Intell. Res.
[20] Hui Han, et al. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning, 2005, ICIC.
[21] Taghi M. Khoshgoftaar, et al. A survey on addressing high-class imbalance in big data, 2018, Journal of Big Data.
[22] Seong-hun Park, et al. Large Imbalance Data Classification Based on MapReduce for Traffic Accident Prediction, 2014, 2014 Eighth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing.
[23] J. Hess, et al. Analysis of variance, 2018, Transfusion.
[24] อนิรุธ สืบสิงห์, et al. Data Mining Practical Machine Learning Tools and Techniques, 2014.
[25] Charles X. Ling, et al. Using AUC and accuracy in evaluating learning algorithms, 2005, IEEE Transactions on Knowledge and Data Engineering.
[26] J. Hanley, et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve, 1982, Radiology.
[27] Francisco Herrera, et al. ROSEFW-RF: The winner algorithm for the ECBDL'14 big data competition: An extremely imbalanced big data bioinformatics problem, 2015, Knowl. Based Syst.
[28] Taghi M. Khoshgoftaar, et al. A Study on the Relationships of Classifier Performance Metrics, 2009, 2009 21st IEEE International Conference on Tools with Artificial Intelligence.
[29] Leo Breiman, et al. Random Forests, 2001, Machine Learning.
[30] Chih-Fong Tsai, et al. Big data mining with parallel computing: A comparison of distributed and MapReduce methodologies, 2016, J. Syst. Softw.
[31] Francisco Herrera, et al. An insight into imbalanced Big Data classification: outcomes and challenges, 2017.
[32] Gustavo E. A. P. A. Batista, et al. A study of the behavior of several methods for balancing machine learning training data, 2004, SKDD.
[33] P. Mahadevan, et al. An overview, 2007, Journal of Biosciences.
[34] Taghi M. Khoshgoftaar, et al. Data Sampling Approaches with Severely Imbalanced Big Data for Medicare Fraud Detection, 2018, 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI).
[35] Farah Magrabi, et al. Using statistical text classification to identify health information technology incidents, 2013, J. Am. Medical Informatics Assoc.
[36] Francisco Herrera, et al. Evolutionary undersampling for extremely imbalanced big data classification under apache spark, 2016, 2016 IEEE Congress on Evolutionary Computation (CEC).
[37] Haibo He, et al. ADASYN: Adaptive synthetic sampling approach for imbalanced learning, 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).
[38] Oksana Yevsieieva, et al. Analysis of the impact of the slow HTTP DOS and DDOS attacks on the cloud environment, 2017, 2017 4th International Scientific-Practical Conference Problems of Infocommunications. Science and Technology (PIC S&T).
[39] Toyoo Takata, et al. A Defense Method against Distributed Slow HTTP DoS Attack, 2016, 2016 19th International Conference on Network-Based Information Systems (NBiS).
[40] J. Tukey. Comparing individual means in the analysis of variance, 1949, Biometrics.
[41] Nitesh V. Chawla, et al. Data Mining for Imbalanced Datasets: An Overview, 2005, The Data Mining and Knowledge Discovery Handbook.
[42] Francisco Herrera, et al. Analysis of Data Preprocessing Increasing the Oversampling Ratio for Extremely Imbalanced Big Data Classification, 2015, 2015 IEEE Trustcom/BigDataSE/ISPA.
[43] Ameet Talwalkar, et al. MLlib: Machine Learning in Apache Spark, 2015, J. Mach. Learn. Res.
[44] Taghi M. Khoshgoftaar, et al. An Empirical Study on Class Rarity in Big Data, 2018, 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA).
[45] Jason Venner, et al. Pro Hadoop, 2009.