A MapReduce approach to diminish imbalance parameters for big deoxyribonucleic acid dataset
Nilanjan Dey | Shamim Ripon | Amira S. Ashour | V. Santhi | Md. Sarwar Kamal
[1] Haibo He, et al. Learning from Imbalanced Data, 2009, IEEE Transactions on Knowledge and Data Engineering.
[2] Francisco Herrera, et al. Stratification for scaling up evolutionary prototype selection, 2005, Pattern Recognit. Lett.
[3] Y. Zhang, et al. IntAct—open source resource for molecular interaction data, 2006, Nucleic Acids Res.
[4] Joseph K. Bradley, et al. Spark SQL: Relational Data Processing in Spark, 2015, SIGMOD Conference.
[5] Dorian Pyle, et al. Data Preparation for Data Mining, 1999.
[6] Francisco Herrera, et al. SMOTE-IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering, 2015, Inf. Sci.
[7] Francisco Herrera, et al. Ordering-based pruning for improving the performance of ensembles of classifiers in the framework of imbalanced datasets, 2016, Inf. Sci.
[8] Michael J. Franklin, et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, 2012, NSDI.
[9] Vasile Palade, et al. Adjusted Geometric-mean: a Novel Performance Measure for Imbalanced Bioinformatics Datasets Learning, 2012, J. Bioinform. Comput. Biol.
[10] Xavier Llorà, et al. Large-scale data mining using genetics-based machine learning, 2013, GECCO.
[11] V. Ducrocq, et al. The Survival Kit: Software to analyze survival data including possibly correlated random effects, 2013, Comput. Methods Programs Biomed.
[12] Juan José Rodríguez Diez, et al. A weighted voting framework for classifiers ensembles, 2012, Knowledge and Information Systems.
[13] Nathalie Japkowicz, et al. The class imbalance problem: A systematic study, 2002, Intell. Data Anal.
[14] Nitesh V. Chawla, et al. SMOTE: Synthetic Minority Over-sampling Technique, 2002, J. Artif. Intell. Res.
[15] Teuvo Kohonen, et al. The self-organizing map, 1990.
[16] Sanjay Ghemawat, et al. MapReduce: a flexible data processing tool, 2010, CACM.
[17] Michael Minelli, et al. Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses, 2012.
[18] Francisco Herrera, et al. On the use of MapReduce for imbalanced big data using Random Forest, 2014, Inf. Sci.
[19] Wenqi Liu, et al. Imbalanced Multi-Modal Multi-Label Learning for Subcellular Localization Prediction of Human Proteins with Both Single and Multiple Sites, 2012, PloS one.
[20] Fabrizio Angiulli, et al. Fast Nearest Neighbor Condensation for Large Data Sets Classification, 2007, IEEE Transactions on Knowledge and Data Engineering.
[21] Francisco Herrera, et al. Fuzzy rough classifiers for class imbalanced multi-instance data, 2016, Pattern Recognit.
[22] Francisco Herrera, et al. IPADE: Iterative Prototype Adjustment for Nearest Neighbor Classification, 2010, IEEE Transactions on Neural Networks.
[23] Hanna Johannesson, et al. Intron evolution in Neurospora: the role of mutational bias and selection, 2015, Genome research.
[24] Emilio Corchado, et al. A survey of multiple classifier systems as hybrid systems, 2014, Inf. Fusion.
[25] Sébastien Destercke, et al. An extension of the FURIA classification algorithm to low quality data through fuzzy rankings and its application to the early diagnosis of dyslexia, 2016, Neurocomputing.
[26] Wai Lam, et al. Discovering Useful Concept Prototypes for Classification Based on Filtering and Abstraction, 2002, IEEE Trans. Pattern Anal. Mach. Intell.
[27] Zhe Liang, et al. Genome-Wide Identification and Evolutionary and Expression Analyses of MYB-Related Genes in Land Plants, 2013, DNA research : an international journal for rapid publication of reports on genes and genomes.
[28] Donald Miner, et al. MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems, 2012.
[29] Ethem Alpaydin, et al. Introduction to machine learning, 2004, Adaptive computation and machine learning.
[30] Siu-Ming Yiu, et al. MetaCluster: unsupervised binning of environmental genomic fragments and taxonomic annotation, 2010, BCB '10.
[31] Mikel Galar, et al. Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches, 2013, Knowl. Based Syst.
[32] Meena Kishore Sakharkar, et al. Distributions of exons and introns in the human genome, 2004, In Silico Biol.
[33] Jeffrey P. Mower, et al. Complete plastid genomes from Ophioglossum californicum, Psilotum nudum, and Equisetum hyemale reveal an ancestral land plant genome structure and resolve the position of Equisetales among monilophytes, 2013, BMC Evolutionary Biology.
[34] Stephen E. Rees, et al. Can computed tomography classifications of chronic obstructive pulmonary disease be identified using Bayesian networks and clinical data?, 2013, Comput. Methods Programs Biomed.
[35] Vandana, et al. Survey of Nearest Neighbor Techniques, 2010, ArXiv.
[36] W. Brown, et al. Translation: DNA to mRNA to Protein, 2008.
[37] Francisco Herrera, et al. A Taxonomy and Experimental Study on Prototype Generation for Nearest Neighbor Classification, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[38] Evelyn Camon, et al. The EMBL Nucleotide Sequence Database, 2000, Nucleic Acids Res.
[39] Javier Pérez-Rodríguez, et al. A scalable approach to simultaneous evolutionary instance and feature selection, 2013, Inf. Sci.
[40] Francisco Herrera, et al. An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics, 2013, Inf. Sci.
[41] Francisco Herrera, et al. Evaluating the classifier behavior with noisy data considering performance and robustness: The Equalized Loss of Accuracy measure, 2016, Neurocomputing.
[42] Toshihisa Takagi, et al. DNA data bank of Japan (DDBJ) progress report, 2015, Nucleic Acids Res.
[43] Srinivas Aluru, et al. Parallel Metagenomic Sequence Clustering Via Sketching and Maximal Quasi-clique Enumeration on Map-Reduce Clouds, 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[44] Ville Tirronen, et al. Scale factor local search in differential evolution, 2009, Memetic Comput.
[45] Jo McEntyre, et al. The NCBI Handbook, 2002.
[46] M. Verleysen, et al. Classification in the Presence of Label Noise: A Survey, 2014, IEEE Transactions on Neural Networks and Learning Systems.
[47] Jingren Zhou, et al. SCOPE: easy and efficient parallel processing of massive data sets, 2008, Proc. VLDB Endow.
[48] Szymon Wilk, et al. Learning from Imbalanced Data in Presence of Noisy and Borderline Examples, 2010, RSCTC.
[49] Francisco Herrera, et al. A memetic algorithm for evolutionary prototype selection: A scaling up approach, 2008, Pattern Recognit.
[50] Josef Kittler, et al. Inverse random under sampling for class imbalance problem and its application to multi-label classification, 2012, Pattern Recognit.
[51] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.
[52] Martin Vingron, et al. q-gram based database searching using a suffix array (QUASAR), 1999, RECOMB.
[53] Zheng Shao, et al. Hive - a petabyte scale data warehouse using Hadoop, 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).
[54] Mukesh K. Mohania, et al. Cloud Computing and Big Data Analytics: What Is New from Databases Perspective?, 2012, BDA.
[55] Francisco Herrera, et al. Data Preprocessing in Data Mining, 2014, Intelligent Systems Reference Library.
[56] Teresa K. Attwood, et al. Concepts, historical milestones and the central place of bioinformatics in modern biology: a European perspective, 2011.
[57] Lorena Pantano, et al. InvFEST, a database integrating information of polymorphic inversions in the human genome, 2013, Nucleic Acids Res.
[58] Tony R. Martinez, et al. Reduction Techniques for Instance-Based Learning Algorithms, 2000, Machine Learning.
[59] Josef Kittler, et al. Multilabel classification using heterogeneous ensemble of multi-label classifiers, 2012, Pattern Recognit. Lett.
[60] Fabrizio Angiulli, et al. Fast condensed nearest neighbor rule, 2005, ICML.
[61] Francisco Herrera, et al. Multivariate Discretization Based on Evolutionary Cut Points Selection for Classification, 2016, IEEE Transactions on Cybernetics.
[62] Francisco Herrera, et al. Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data, 2015, Fuzzy Sets Syst.
[63] An-Yuan Guo, et al. Evolution, functional divergence and conserved exon–intron structure of bHLH/PAS gene family, 2013, Molecular Genetics and Genomics.
[64] Ravi Kumar, et al. Pig latin: a not-so-foreign language for data processing, 2008, SIGMOD Conference.
[65] Dennis L. Wilson, et al. Asymptotic Properties of Nearest Neighbor Rules Using Edited Data, 1972, IEEE Trans. Syst. Man Cybern.
[66] Athanasios V. Vasilakos, et al. Big data: From beginning to future, 2016, Int. J. Inf. Manag.
[67] Antonios Danelakis, et al. A new user-friendly visual environment for breast MRI data analysis, 2013, Comput. Methods Programs Biomed.