Learning to model sequences generated by switching distributions

We study efficient algorithms for solving the following problem, which we call the switching distributions learning problem. A sequence $S = \sigma_1 \sigma_2 \cdots \sigma_n$ over a finite alphabet $\Sigma$ is generated in the following way. The sequence is a concatenation of $K$ runs, each of which is a consecutive subsequence. Each run is generated by independent random draws from a distribution $\vec{p}_i$ over $\Sigma$, where $\vec{p}_i$ is an element of a set of distributions $\{\vec{p}_1, \ldots, \vec{p}_N\}$. The learning algorithm is given this sequence, and its goal is to find approximations of the distributions $\vec{p}_1, \ldots, \vec{p}_N$ and to give an approximate segmentation of the sequence into its constituent runs. We give an efficient algorithm for solving this problem and show conditions under which the algorithm is guaranteed to work with high probability.
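
To make the generative model concrete, the following is a minimal Python sketch of how such a sequence could be produced. It illustrates only the data-generating process described above, not the learning algorithm. The function name, the parameter names, and the example alphabet, distributions, and run lengths are all hypothetical choices made for illustration and do not come from the paper.

```python
import random


def generate_switching_sequence(alphabet, distributions, run_sources, run_lengths, seed=0):
    """Concatenate runs of i.i.d. draws (hypothetical helper, for illustration).

    Run k consists of run_lengths[k] independent draws from
    distributions[run_sources[k]], a probability vector over `alphabet`.
    Returns the full sequence and the end position of every run.
    """
    rng = random.Random(seed)
    sequence = []
    boundaries = []
    for source, length in zip(run_sources, run_lengths):
        # Independent draws from the distribution assigned to this run.
        sequence.extend(rng.choices(alphabet, weights=distributions[source], k=length))
        boundaries.append(len(sequence))
    return sequence, boundaries


# Assumed example values: |alphabet| = 3, N = 2 distributions, K = 3 runs.
alphabet = ["a", "b", "c"]
distributions = [
    [0.8, 0.1, 0.1],  # p_1 favors "a"
    [0.1, 0.1, 0.8],  # p_2 favors "c"
]
run_sources = [0, 1, 0]   # which distribution generates each run
run_lengths = [5, 4, 6]   # n = 15 symbols in total

sequence, boundaries = generate_switching_sequence(
    alphabet, distributions, run_sources, run_lengths
)
print("".join(sequence))
print(boundaries)
```

In this sketch the learner would observe only `sequence`. The list `boundaries` and the vectors in `distributions` play the role of the hidden segmentation and the hidden distributions that the learner is asked to approximate.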