Applying computational intelligence methods for predicting the sales of newly published books in a real editorial business management environment

Abstract When a new book is launched, the publisher must decide how many copies to print for delivery to bookstores. Printing too many is the main concern, since excess inventory means lost investment, but printing too few also has a negative economic impact. In this paper, we tackle the problem of predicting total sales so that the right number of copies can be printed, and we do so before the book has reached the stores. We use a real dataset containing the complete sales data for books published in Spain over several years. The analysis has three stages: an initial exploratory analysis using data visualisation techniques; a feature selection process, in which different techniques are used to identify the variables with the greatest impact on sales; and a regression (prediction) stage, in which a set of machine learning methods is applied to build forecasting models for book sales. The resulting models predict sales from pre-publication data with remarkable accuracy and can be visualised as simple decision trees. They can therefore serve as decision-aid tools that give publishers reliable guidance in the decision process of publishing a book. We illustrate this with four example cases of publishers that are representative in terms of their number of sales and the number of different books they sell.
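
As a rough illustration of the feature selection and regression stages described in the abstract, the following minimal Python sketch ranks candidate variables by absolute Pearson correlation with sales and then fits a one-split regression tree (a "stump") on the top-ranked variable. It is not the authors' pipeline: the field names (`preorders`, `price`, `sales`), the toy rows, and the helper functions are all hypothetical and were chosen only to keep the example self-contained.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def rank_features(rows, target):
    """Order feature names by |correlation| with the target, strongest first."""
    ys = [r[target] for r in rows]
    names = [k for k in rows[0] if k != target]
    scores = {n: abs(pearson([r[n] for r in rows], ys)) for n in names}
    return sorted(scores, key=scores.get, reverse=True)

def fit_stump(rows, feature, target):
    """Find the single threshold on `feature` that minimises squared error."""
    values = sorted({r[feature] for r in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [r[target] for r in rows if r[feature] <= thr]
        right = [r[target] for r in rows if r[feature] > thr]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1], best[2], best[3]  # threshold, left mean, right mean

def predict(stump, x):
    """Predict the target as the mean of the leaf that x falls into."""
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean

# Hypothetical toy data: NOT taken from the paper's dataset.
rows = [
    {"preorders": 100, "price": 20.0, "sales": 900},
    {"preorders": 150, "price": 15.0, "sales": 1100},
    {"preorders": 800, "price": 18.0, "sales": 5200},
    {"preorders": 900, "price": 22.0, "sales": 6100},
    {"preorders": 120, "price": 25.0, "sales": 1000},
    {"preorders": 850, "price": 12.0, "sales": 5800},
]

ranking = rank_features(rows, "sales")
stump = fit_stump(rows, ranking[0], "sales")
print(ranking, stump, predict(stump, 700))
```

A single split is the smallest possible decision tree, which is why it is used here: the fitted threshold and the two leaf means can be read directly as a rule of the kind a publisher could inspect. A realistic model would use deeper trees and held-out evaluation.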
