Adaptive Pricing in Insurance: Generalized Linear Models and Gaussian Process Regression Approaches

We study the application of dynamic pricing to insurance. We view this as an online revenue management problem in which the insurance company sets prices to optimize the long-run revenue from selling a new insurance product. We develop two pricing models: an adaptive Generalized Linear Model (GLM) and an adaptive Gaussian Process (GP) regression model. Both balance exploration, where prices are chosen in order to learn the distribution of demand and claims for the insurance product, against exploitation, where we myopically choose the best price given the information gathered so far. The performance of the pricing policies is measured in terms of regret: the expected revenue loss caused by not using the optimal price. As is commonplace in insurance, we model demand and claims by GLMs. In our adaptive GLM design, we use maximum quasi-likelihood estimation (MQLE) to estimate the unknown parameters. We show that, if prices are chosen with suitably decreasing variability, the MQLE parameters eventually exist and converge to the correct values, which in turn implies that the sequence of chosen prices also converges to the optimal price. In the adaptive GP regression model, we sample demand and claims from Gaussian Processes and then choose selling prices by the upper confidence bound rule. We also analyze these GLM and GP pricing algorithms with delayed claims. Although similar results exist in other domains, this is among the first works to consider dynamic pricing problems in the field of insurance. We also believe this is the first work to consider Gaussian Process regression in the context of insurance pricing. These initial findings suggest that online machine learning algorithms could be a fruitful area of future investigation and application in insurance.
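To make the upper confidence bound idea in the abstract concrete, the following is a minimal Python sketch of GP-based price selection on a grid. It is an illustration under stated assumptions, not the authors' algorithm: it fits a single GP directly to observed per-customer revenue (the abstract describes modeling demand and claims separately), and the RBF kernel, its length scale, the noise variance, the exploration weight `beta`, and the toy environment in `simulate_round` are all hypothetical choices made for this example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of prices."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise_var=0.05):
    """Posterior mean and standard deviation of a zero-mean GP on x_grid."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_go = rbf_kernel(x_grid, x_obs)
    mean = k_go @ np.linalg.solve(k_oo, y_obs)
    # Diagonal of K_gg - K_go K_oo^{-1} K_og; the prior variance is 1.0.
    var = 1.0 - np.sum(k_go * np.linalg.solve(k_oo, k_go.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def ucb_price(x_obs, y_obs, price_grid, beta=2.0):
    """Pick the grid price with the largest upper confidence bound."""
    mean, std = gp_posterior(x_obs, y_obs, price_grid)
    return price_grid[np.argmax(mean + beta * std)]

def simulate_round(price, rng):
    """Toy environment (assumed): logistic demand, exponential claim cost."""
    p_buy = 1.0 / (1.0 + np.exp(-(4.0 - 8.0 * price)))
    sold = rng.random() < p_buy
    claim = rng.exponential(0.2)
    # Revenue is zero when the customer does not buy.
    return float(sold) * (price - claim)

rng = np.random.default_rng(0)
price_grid = np.linspace(0.0, 1.0, 51)

# Start from one arbitrary price, then repeatedly price by the UCB rule.
prices = [0.5]
revenues = [simulate_round(0.5, rng)]
for _ in range(200):
    p = ucb_price(np.array(prices), np.array(revenues), price_grid)
    prices.append(p)
    revenues.append(simulate_round(p, rng))
```

Early rounds tend to spread over the grid because the posterior standard deviation is large away from observed prices; as data accumulates, the `mean` term dominates and choices concentrate. Inspecting `prices[-20:]` shows where the sketch has settled for a given seed.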

Neil Walton | Yuqing Zhang
