Instance Selection for Imbalanced Data
Chris Cornelis | Sarah Vluymans | Yvan Saeys | Nele Verbiest
[1] E. B. Wilson. Probable Inference, the Law of Succession, and Statistical Inference , 1927 .
[2] Robert E. Schapire,et al. The strength of weak learnability , 1990, Mach. Learn..
[3] Lior Rokach,et al. Ensemble-based classifiers , 2010, Artificial Intelligence Review.
[4] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[5] Jesus A. Gonzalez,et al. Machine Learning for Imbalanced Datasets: Application in Medical Diagnostic , 2006, FLAIRS.
[6] Tony R. Martinez,et al. Reduction Techniques for Instance-Based Learning Algorithms , 2000, Machine Learning.
[7] Haibo He,et al. Learning from Imbalanced Data , 2009, IEEE Transactions on Knowledge and Data Engineering.
[8] Francisco Herrera,et al. OWA-FRPS: A Prototype Selection Method Based on Ordered Weighted Average Fuzzy Rough Set Theory , 2013, RSFDGrC.
[9] Peter E. Hart,et al. Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.
[10] Richard Nock,et al. Instance Pruning as an Information Preserving Problem , 2000, ICML.
[11] S. Holm. A Simple Sequentially Rejective Multiple Test Procedure , 1979 .
[12] Dennis L. Wilson,et al. Asymptotic Properties of Nearest Neighbor Rules Using Edited Data , 1972, IEEE Trans. Syst. Man Cybern..
[13] Shi Bing,et al. Inductive learning algorithms and representations for text categorization , 2006 .
[14] Nitesh V. Chawla,et al. SMOTEBoost: Improving Prediction of the Minority Class in Boosting , 2003, PKDD.
[15] Francisco Herrera,et al. An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics , 2013, Inf. Sci..
[16] Fabrizio Angiulli,et al. Fast Nearest Neighbor Condensation for Large Data Sets Classification , 2007, IEEE Transactions on Knowledge and Data Engineering.
[17] Francisco Herrera,et al. A memetic algorithm for evolutionary prototype selection: A scaling up approach , 2008, Pattern Recognit..
[18] Chia-Cheng Liu,et al. Design of an optimal nearest neighbor classifier using an intelligent genetic algorithm , 2002, Proceedings of the 2002 Congress on Evolutionary Computation (CEC'02).
[19] Nathalie Japkowicz,et al. The Class Imbalance Problem: Significance and Strategies , 2000 .
[20] H. B. Mann,et al. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other , 1947 .
[21] José Salvador Sánchez,et al. Decision boundary preserving prototype selection for nearest neighbor classification , 2005, Int. J. Pattern Recognit. Artif. Intell..
[22] Filiberto Pla,et al. Prototype selection for the nearest neighbour rule through proximity graphs , 1997, Pattern Recognit. Lett..
[23] Francisco Herrera,et al. Using evolutionary algorithms as instance selection for data reduction in KDD: an experimental study , 2003, IEEE Trans. Evol. Comput..
[24] Mahendra Sahare,et al. A Review of Multi-Class Classification for Imbalanced Data , 2012 .
[25] Taeho Jo,et al. Class imbalances versus small disjuncts , 2004, SKDD.
[26] Elena Marchiori,et al. Hit Miss Networks with Applications to Instance Selection , 2008, J. Mach. Learn. Res..
[27] Thomas M. Cover,et al. Estimation by the nearest neighbor rule , 1968, IEEE Trans. Inf. Theory.
[28] Lih-Yuan Deng,et al. Orthogonal Arrays: Theory and Applications , 1999, Technometrics.
[29] Kazuo Hattori,et al. A new edited k-nearest neighbor rule in the pattern classification problem , 2000, Pattern Recognit..
[30] Francisco Herrera,et al. EUSBoost: Enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling , 2013, Pattern Recognit..
[31] David W. Aha,et al. Instance-Based Learning Algorithms , 1991, Machine Learning.
[32] Yoav Freund,et al. Boosting a weak learning algorithm by majority , 1990, COLT '90.
[33] Mikel Galar,et al. Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches , 2013, Knowl. Based Syst..
[34] Chi-Jen Lu,et al. Adaptive Prototype Learning Algorithms: Theoretical and Experimental Studies , 2006, J. Mach. Learn. Res..
[35] Yue-Shi Lee,et al. Under-Sampling Approaches for Improving Prediction of the Minority Class in an Imbalanced Dataset , 2006 .
[36] Francisco Herrera,et al. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms , 2011, Swarm Evol. Comput..
[37] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .
[38] Gary M. Weiss. The Impact of Small Disjuncts on Classifier Learning , 2010, Data Mining.
[39] Szymon Wilk,et al. Learning from Imbalanced Data in Presence of Noisy and Borderline Examples , 2010, RSCTC.
[40] Taghi M. Khoshgoftaar,et al. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance , 2010, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans.
[41] D. Bamber. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph , 1975 .
[42] David W. Aha,et al. Simplifying decision trees: A survey , 1997, The Knowledge Engineering Review.
[43] Jerzy W. Grzymala-Busse,et al. An Approach to Imbalanced Data Sets Based on Changing Rule Strength , 2004, Rough-Neural Computing: Techniques for Computing with Words.
[44] M. Friedman. The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance , 1937 .
[45] Szymon Wilk,et al. Selective Pre-processing of Imbalanced Data for Improving Classification Performance , 2008, DaWaK.
[46] Gustavo E. A. P. A. Batista,et al. Class Imbalances versus Class Overlapping: An Analysis of a Learning System Behavior , 2004, MICAI.
[47] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[48] José Salvador Sánchez,et al. On the k-NN performance in a challenging scenario of imbalance and overlapping , 2008, Pattern Analysis and Applications.
[49] Andrew K. C. Wong,et al. Classification of Imbalanced Data: a Review , 2009, Int. J. Pattern Recognit. Artif. Intell..
[50] Misha Denil,et al. Overlap versus Imbalance , 2010, Canadian Conference on AI.
[51] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.
[52] G. Gates,et al. The reduced nearest neighbor rule (Corresp.) , 1972, IEEE Trans. Inf. Theory.
[53] Peter A. Flach,et al. Machine Learning - The Art and Science of Algorithms that Make Sense of Data , 2012 .
[54] Johannes Fürnkranz,et al. Pruning Algorithms for Rule Learning , 1997, Machine Learning.
[55] Andreas Holzinger,et al. Data Mining with Decision Trees: Theory and Applications , 2015, Online Inf. Rev..
[56] Miguel Toro,et al. Finding representative patterns with ordered projections , 2003, Pattern Recognit..
[57] David J. Hand,et al. Choosing k for two-class nearest neighbour classifiers with unbalanced classes , 2003, Pattern Recognit. Lett..
[58] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .
[59] Stan Matwin,et al. Addressing the Curse of Imbalanced Training Sets: One-Sided Selection , 1997, ICML.
[60] Peter E. Hart,et al. The condensed nearest neighbor rule (Corresp.) , 1968, IEEE Trans. Inf. Theory.
[61] José Hernández-Orallo,et al. Volume under the ROC Surface for Multi-class Problems , 2003, ECML.
[62] Francisco Herrera,et al. SMOTE-RSB*: a hybrid preprocessing approach based on oversampling and undersampling for high imbalanced data-sets using SMOTE and rough sets theory , 2012, Knowledge and Information Systems.
[63] Gary M. Weiss. Mining with Rare Cases , 2010, Data Mining and Knowledge Discovery Handbook.
[64] Yue-Shi Lee,et al. Cluster-based under-sampling approaches for imbalanced data distributions , 2009, Expert Syst. Appl..
[65] Gustavo E. A. P. A. Batista,et al. A study of the behavior of several methods for balancing machine learning training data , 2004, SKDD.
[66] Christopher J. C. Burges,et al. A Tutorial on Support Vector Machines for Pattern Recognition , 1998, Data Mining and Knowledge Discovery.
[67] Francisco Herrera,et al. Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[68] J. Hanley,et al. The meaning and use of the area under a receiver operating characteristic (ROC) curve. , 1982, Radiology.
[69] Nello Cristianini,et al. Controlling the Sensitivity of Support Vector Machines , 1999 .
[70] Carla E. Brodley,et al. Addressing the Selective Superiority Problem: Automatic Algorithm/Model Class Selection , 1993 .
[71] D. Dubois,et al. ROUGH FUZZY SETS AND FUZZY ROUGH SETS , 1990 .
[72] Hui Han,et al. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning , 2005, ICIC.
[73] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.
[74] Albert Y. Zomaya,et al. A particle swarm based hybrid system for imbalanced medical data sampling , 2009, BMC Genomics.
[75] Salvatore J. Stolfo,et al. Toward Scalable Learning with Non-Uniform Class and Cost Distributions: A Case Study in Credit Card Fraud Detection , 1998, KDD.
[76] I. Tomek,et al. Two Modifications of CNN , 1976 .
[77] Kihoon Yoon,et al. An unsupervised learning approach to resolving the data imbalanced issue in supervised learning problems in functional genomics , 2005, Fifth International Conference on Hybrid Intelligent Systems (HIS'05).
[78] Dragos D. Margineantu,et al. Class Probability Estimation and Cost-Sensitive Classification Decisions , 2002, ECML.
[79] J. Platt. Sequential Minimal Optimization : A Fast Algorithm for Training Support Vector Machines , 1998 .
[80] Nitesh V. Chawla,et al. Classification and knowledge discovery in protein databases , 2004, J. Biomed. Informatics.
[81] M. Narasimha Murty,et al. An incremental prototype set building technique , 2002, Pattern Recognit..
[82] Francisco Herrera,et al. Analysis of preprocessing vs. cost-sensitive learning for imbalanced classification. Open problems on intrinsic data characteristics , 2012, Expert Syst. Appl..
[83] Roberto Alejo,et al. Analysis of new techniques to obtain quality training sets , 2003, Pattern Recognit. Lett..
[84] Chumphol Bunkhumpornpat,et al. Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem , 2009, PAKDD.
[85] David J. Hand,et al. A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems , 2001, Machine Learning.
[86] Tom Fawcett,et al. An introduction to ROC analysis , 2006, Pattern Recognit. Lett..
[87] Lotfi A. Zadeh,et al. Fuzzy Sets , 1996, Inf. Control..
[88] Jing Zhao,et al. ACOSampling: An ant colony optimization-based undersampling method for classifying imbalanced DNA microarray data , 2013, Neurocomputing.
[89] Chih-Jen Lin,et al. A comparison of methods for multiclass support vector machines , 2002, IEEE Trans. Neural Networks.
[90] Edward Y. Chang,et al. Class-Boundary Alignment for Imbalanced Dataset Learning , 2003 .
[91] Larry Bull,et al. Mining breast cancer data with XCS , 2007, GECCO '07.
[92] L B Lusted,et al. Radiographic applications of receiver operating characteristic (ROC) curves. , 1974, Radiology.
[93] José Francisco Martínez Trinidad,et al. A new fast prototype selection method based on clustering , 2010, Pattern Analysis and Applications.
[94] Yu-Lin He,et al. NRMCS : Noise removing based on the MCS , 2008, 2008 International Conference on Machine Learning and Cybernetics.
[95] M. Stone. Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .
[96] Shuigeng Zhou,et al. C-pruner: an improved instance pruning algorithm , 2003, Proceedings of the 2003 International Conference on Machine Learning and Cybernetics.
[97] Stephen Kwek,et al. Applying Support Vector Machines to Imbalanced Datasets , 2004, ECML.
[98] G. Yule. On the Association of Attributes in Statistics: With Illustrations from the Material of the Childhood Society, &c , 1900 .
[99] Jacob Cohen. A Coefficient of Agreement for Nominal Scales , 1960 .
[100] Kai Ming Ting,et al. An Instance-weighting Method to Induce Cost-sensitive Trees , 2001 .
[101] Paul Jen-Hwa Hu,et al. A preclustering-based ensemble learning technique for acute appendicitis diagnoses , 2013, Artif. Intell. Medicine.
[102] Jorma Laurikkala,et al. Improving Identification of Difficult Small Classes by Balancing Class Distribution , 2001, AIME.
[103] David B. Skalak,et al. Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms , 1994, ICML.
[104] Chien-Hsing Chou,et al. The Generalized Condensed Nearest Neighbor Rule as A Data Reduction Method , 2006, 18th International Conference on Pattern Recognition (ICPR'06).
[105] Ekrem Duman,et al. Comparing alternative classifiers for database marketing: The case of imbalanced datasets , 2012, Expert Syst. Appl..
[106] Marek Grochowski,et al. Comparison of Instances Selection Algorithms I. Algorithms Survey , 2004, ICAISC.
[107] Ronald R. Yager,et al. On ordered weighted averaging aggregation operators in multicriteria decisionmaking , 1988, IEEE Trans. Syst. Man Cybern..
[108] N. Graham,et al. Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation , 2002 .
[109] Xin Yao,et al. MWMOTE--Majority Weighted Minority Oversampling Technique for Imbalanced Data Set Learning , 2014 .
[110] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..
[111] Nittaya Kerdprasop,et al. On the Generation of Accurate Predictive Model from Highly Imbalanced Data with Heuristics and Replication Techniques , 2012 .
[112] Chris Mellish,et al. Advances in Instance Selection for Instance-Based Learning Algorithms , 2002, Data Mining and Knowledge Discovery.
[113] Filiberto Pla,et al. Using the Geometrical Distribution of Prototypes for Training Set Condensing , 2003, CAEPIA.
[114] Tan Yee Fan,et al. A Tutorial on Support Vector Machine , 2009 .
[115] Francisco Herrera,et al. Preprocessing noisy imbalanced datasets using SMOTE enhanced with fuzzy rough prototype selection , 2014, Appl. Soft Comput..
[116] Francisco Herrera,et al. FRPS: A Fuzzy Rough Prototype Selection method , 2013, Pattern Recognit..
[117] Xin Yao,et al. Diversity analysis on imbalanced data sets by using ensemble models , 2009, 2009 IEEE Symposium on Computational Intelligence and Data Mining.
[118] Nicolás García-Pedrajas,et al. A cooperative coevolutionary algorithm for instance selection for instance-based learning , 2010, Machine Learning.
[119] Filiberto Pla,et al. A Stochastic Approach to Wilson's Editing Algorithm , 2005, IbPRIA.
[120] Francisco Herrera,et al. Improving SMOTE with Fuzzy Rough Prototype Selection to Detect Noise in Imbalanced Classification Data , 2012, IBERAMIA.
[121] Robert Sabourin,et al. Iterative Boolean combination of classifiers in the ROC space: An application to anomaly detection with HMMs , 2010, Pattern Recognit..
[122] Foster Provost,et al. Well-Trained PETs : Improving Probability Estimation , 2000 .
[123] Nathalie Japkowicz,et al. The class imbalance problem: A systematic study , 2002, Intell. Data Anal..
[124] Jerzy W. Grzymala-Busse,et al. Rough Sets , 1995, Commun. ACM.
[125] Rm Cameron-Jones,et al. Instance Selection by Encoding Length Heuristic with Random Mutation Hill Climbing , 1995 .
[126] F. Wilcoxon. Individual Comparisons by Ranking Methods , 1945 .