Recovering Markov models from closed-loop data

Recommender systems are increasingly used to augment decision making in many application domains. Almost always, these prediction tools are deployed with a view to effecting behavioural change. A successful application therefore alters the very behaviour that the original model was built on, introducing an inconsistency between the model and the data it now generates. This feedback loop is rarely accounted for by standard machine-learning and statistical-learning techniques. The objective of this paper is to develop tools that recover unbiased user models in the presence of recommenders. More specifically, we assume that we observe a time series that is a trajectory of a Markov chain R modulated by another Markov chain S: the transition matrix of R is unknown and depends on the current state of S, and the transition matrix of S is itself unknown. In other words, at each time instant S selects a transition matrix for R from a given set consisting of known and unknown matrices. The state of S, in turn, depends on the current state of R, thus closing the feedback loop. We propose an Expectation–Maximisation (EM) type algorithm that estimates the transition matrices of both S and R. Experimental results demonstrate the efficacy of the approach.
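The coupled structure described above can be made concrete with a short simulation. The sketch below, which uses hypothetical two-state chains and made-up transition matrices (none of these numbers come from the paper), shows how S selects R's transition matrix at each step while S's own transition depends on the current state of R, closing the loop:

```python
import numpy as np

# Hypothetical example: both S and R have two states.
# P_R[s] is the (row-stochastic) transition matrix of R when S is in state s.
P_R = np.array([
    [[0.9, 0.1],   # R's dynamics when S = 0
     [0.2, 0.8]],
    [[0.5, 0.5],   # R's dynamics when S = 1
     [0.7, 0.3]],
])

# P_S[r] is the transition matrix of S when R is in state r,
# which is the feedback from R back to S.
P_S = np.array([
    [[0.8, 0.2],   # S's dynamics when R = 0
     [0.3, 0.7]],
    [[0.4, 0.6],   # S's dynamics when R = 1
     [0.6, 0.4]],
])

def simulate(T, seed=0):
    """Generate a length-T trajectory of the coupled pair (S, R)."""
    rng = np.random.default_rng(seed)
    s, r = 0, 0
    traj = []
    for _ in range(T):
        # S selects the transition matrix that drives R ...
        r = rng.choice(2, p=P_R[s][r])
        # ... and S's own transition depends on the new state of R.
        s = rng.choice(2, p=P_S[r][s])
        traj.append((s, r))
    return traj

trajectory = simulate(1000)
```

In the estimation problem considered in the paper, only the trajectory of R is observed; S is latent, which is what makes an EM-type algorithm the natural tool for recovering both sets of transition matrices.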
