DPASF: a Flink library for streaming data preprocessing
Francisco Herrera | Salvador García | Diego García-Gil | Alejandro Alcalde-Barros
[1] Taghi M. Khoshgoftaar, et al. A survey of open source tools for machine learning with big data in the Hadoop ecosystem, 2015, Journal of Big Data.
[2] Fahad Saeed, et al. Towards quantifying psychiatric diagnosis using machine learning algorithms and big fMRI data, 2018, Big Data Analytics.
[3] Verónica Bolón-Canedo, et al. Data discretization: taxonomy and big data challenge, 2016, WIREs Data Mining Knowl. Discov.
[4] João Gama, et al. A survey on concept drift adaptation, 2014, ACM Comput. Surv.
[5] William H. Press, et al. Numerical recipes in C, 2002.
[6] Francisco Herrera, et al. Tutorial on practical tips of the most influential data preprocessing algorithms in data mining, 2016, Knowl. Based Syst.
[7] Jan van Leeuwen, et al. Interval Heaps, 1993, Comput. J.
[8] Huan Liu, et al. Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution, 2003, ICML.
[9] Rong Jin, et al. Online Feature Selection and Its Applications, 2014, IEEE Transactions on Knowledge and Data Engineering.
[10] João Gama, et al. Discretization from data streams: applications to histograms and data mining, 2006, SAC.
[11] Francisco Herrera, et al. Enabling Smart Data: Noise filtering in Big Data classification, 2017, Inf. Sci.
[12] Francisco Herrera, et al. Data Preprocessing in Data Mining, 2014, Intelligent Systems Reference Library.
[13] Sabine Loudcher, et al. FUSINTER: A Method for Discretization of Continuous Attributes, 1998, Int. J. Uncertain. Fuzziness Knowl. Based Syst.
[14] Usama M. Fayyad, et al. On the Handling of Continuous-Valued Attributes in Decision Tree Generation, 1992, Machine Learning.
[15] Grigorios Tsoumakas, et al. On the Utility of Incremental Feature Selection for the Classification of Textual Data Streams, 2005, Panhellenic Conference on Informatics.
[16] Usama M. Fayyad, et al. Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning, 1993, IJCAI.
[17] Francisco Herrera, et al. Principal Components Analysis Random Discretization Ensemble for Big Data, 2018, Knowl. Based Syst.
[18] Jeffrey Scott Vitter, et al. Random sampling with a reservoir, 1985, TOMS.
[19] Francisco Herrera, et al. Big data preprocessing: methods and prospects, 2016.
[20] Kostas Tzoumas, et al. Introduction to Apache Flink: Stream Processing for Real Time and Beyond, 2016.
[21] Verónica Bolón-Canedo, et al. An Information Theory-Based Feature Selection Framework for Big Data Under Apache Spark, 2018, IEEE Transactions on Systems, Man, and Cybernetics: Systems.
[22] Mohamed Medhat Gaber, et al. Learning from Data Streams: Processing Techniques in Sensor Networks, 2007.
[23] Francisco Herrera, et al. Big Data: Tutorial and guidelines on information and process fusion for analytics algorithms with MapReduce, 2018, Inf. Fusion.
[24] Geoffrey I. Webb. Contrary to Popular Belief Incremental Discretization can be Sound, Computationally Efficient and Extremely Useful for Streaming Data, 2014, 2014 IEEE International Conference on Data Mining.
[25] Francisco Herrera, et al. A survey on data preprocessing for data stream mining: Current status and future directions, 2017, Neurocomputing.
[26] Seif Haridi, et al. Apache Flink™: Stream and Batch Processing in a Single Engine, 2015, IEEE Data Eng. Bull.
[27] Francisco Herrera, et al. A comparison on scalability for batch big data processing on Apache Spark and Apache Flink, 2017.
[28] S. García, et al. Online entropy-based discretization for data streaming classification, 2018, Future Generation Computer Systems.
[29] Pabitra Mitra, et al. The big data system, components, tools, and technologies: a survey, 2018, Knowledge and Information Systems.
[30] อนิรุธ สืบสิงห์, et al. Data Mining: Practical Machine Learning Tools and Techniques, 2014.