Machine Learning and Knowledge Discovery in Databases

With the rapid growth of e-commerce, a large number of online transactions are processed every day. In this paper, we present a systematic study of the challenging problem of predicting sales bursts. We propose a novel model that detects bursts, extracts their features (the start time of the burst, the peak value of the burst, and the off-burst value), and predicts the entire burst shape. The model analyzes the features of similar sales bursts in the same category and applies them to generate the prediction. We argue that the framework captures the seasonal and categorical features of sales bursts. Using real data from JD.com, we conduct extensive experiments and find that the proposed model achieves relative MSE improvements of 71% and 30% over LSTM and ARMA, respectively.
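As a rough illustration of how a relative MSE improvement figure like those reported above can be computed, the sketch below assumes the common definition (baseline MSE minus model MSE, divided by baseline MSE). The abstract does not state the exact formula used, and the function names and toy numbers are hypothetical, not taken from the paper.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_mse_improvement(mse_baseline, mse_model):
    """Fraction by which the model reduces MSE relative to a baseline.

    Assumed definition: (baseline - model) / baseline.
    """
    return (mse_baseline - mse_model) / mse_baseline


# Toy daily sales with a burst on day 2 (made-up numbers).
actual = [10, 50, 20]
baseline_pred = [10, 30, 20]  # hypothetical baseline forecast
model_pred = [10, 40, 20]     # hypothetical burst-aware forecast

improvement = relative_mse_improvement(
    mse(actual, baseline_pred), mse(actual, model_pred)
)
print(f"relative MSE improvement: {improvement:.0%}")
```

Under this definition, a value of 0.71 would mean the model's MSE is 29% of the baseline's.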
