Large-Scale Price Interval Prediction at OTA Sites

With the rapid proliferation of online travel agent (OTA) services, personalized recommendations are highly valuable because they improve the customer experience by mitigating information overload. Accurately predicting users’ price expectations plays an important role in personalized hotel recommendation on OTA platforms. Because customers’ preferences for hotel prices are acceptable ranges rather than single values, and traditional point estimates may neglect informative aspects of the prediction, interval estimation is better suited to the problem investigated in this paper. However, existing related methods are not directly applicable because of several problem-specific issues. To provide better personalized hotel recommendations on OTA platforms, this paper proposes a novel interval forecasting solution that improves the accuracy of predicting users’ price preferences. The solution first introduces a customized objective function that directly measures the quality of the constructed intervals while allowing an adjustable tradeoff between interval tightness and prediction reliability. It then combines alternating direction optimization with the gradient boosting framework to efficiently aggregate weak individual predictors and optimize this learning objective. Empirical comparisons on several standard benchmark datasets and on a large-scale dataset shared by a major Chinese OTA site demonstrate the effectiveness of the proposed approach.
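To make the tradeoff between interval tightness and prediction reliability concrete, the following is a minimal sketch of one way to score a set of predicted price intervals. It is not the objective function proposed in the paper, whose exact form the abstract does not give. The function name `interval_quality`, the `tradeoff` parameter, and the specific loss (mean width plus a weighted mean miss distance) are illustrative assumptions.

```python
import numpy as np

def interval_quality(y, lower, upper, tradeoff=1.0):
    """Score predicted intervals [lower, upper] against observed prices y.

    Hypothetical helper for illustration only; not the paper's objective.
    Returns (coverage, mean_width, loss), where a lower loss is better.
    """
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))

    # Reliability: fraction of observed prices that fall inside their interval.
    coverage = ((y >= lower) & (y <= upper)).mean()

    # Tightness: average interval width.
    mean_width = (upper - lower).mean()

    # Distance by which each observation misses its interval (0 if covered).
    miss = np.maximum(lower - y, 0.0) + np.maximum(y - upper, 0.0)

    # Assumed combined loss: a larger `tradeoff` penalizes misses more
    # heavily and so favors wider, more reliable intervals.
    loss = mean_width + tradeoff * miss.mean()
    return coverage, mean_width, loss

# Example: three hotel prices, one of which falls below its interval.
coverage, mean_width, loss = interval_quality(
    y=[100, 200, 300],
    lower=[90, 210, 250],
    upper=[120, 260, 350],
    tradeoff=3.0,
)
```

Raising `tradeoff` in this sketch shifts the balance toward reliability, while lowering it rewards narrower intervals, which mirrors the adjustable tradeoff described above.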
