Extraction and Energy Efficient Processing of Streaming Data

Interest in machine learning algorithms is growing, in parallel with advances in the hardware and software required to mine large-scale datasets. Machine learning algorithms account for ...
