Transfer of Samples in Policy Search via Multiple Importance Sampling

Algorithm 1 requires a measure of the ESS to evaluate the quality of a gradient estimate and, consequently, to adapt the batch size. Although several ESS measures for IS have been studied (see, e.g., Martino et al., 2017), to the best of our knowledge no measure specifically designed for MIS estimators has been proposed. Recent work by Elvira et al. (2018) analyzed the classical ESS measure introduced in Section 2 and empirically demonstrated its effectiveness in MIS. We therefore apply it in our context as well. However, since a lower bound on the ESS suffices for our application, we take the variance of the importance weights w.r.t. the mixture of the given proposals rather than under the proposals themselves. This is motivated by the following proposition, which follows directly from the fact that the variance under the individual proposals is never larger than the variance under their mixture (see Owen & Zhou, 2000, or Lemma C.1 in Appendix C.5).
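
As a numerical illustration of why the mixture variance yields a lower bound, the following minimal sketch compares the two ESS estimates on a toy problem. It assumes that the classical measure takes the form ESS = N / (1 + Var[w]) and that the importance weights use the balance heuristic, w(x) = p(x) / sum_j alpha_j q_j(x) with alpha_j = n_j / N. The one-dimensional Gaussian target and proposals, the sample counts, and all function names are hypothetical choices made for illustration only; they are not taken from the paper's implementation.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2.0 * np.pi))


def mis_ess_estimates(samples, target_pdf, proposal_pdfs):
    """Return (ess_proposals, ess_mixture) for balance-heuristic weights.

    samples[j] holds the draws obtained from proposal j.
    Assumes ESS = N / (1 + Var[w]); this form is an assumption of the sketch.
    """
    counts = np.array([len(s) for s in samples])
    n_total = counts.sum()
    alphas = counts / n_total  # mixture coefficients alpha_j = n_j / N

    def weights(x):
        mixture = sum(a * q(x) for a, q in zip(alphas, proposal_pdfs))
        return target_pdf(x) / mixture

    per_proposal = [weights(np.asarray(s)) for s in samples]
    pooled = np.concatenate(per_proposal)

    # Variance under the given proposals: alpha-weighted within-proposal variance.
    var_proposals = sum(a * w.var() for a, w in zip(alphas, per_proposal))
    # Variance w.r.t. the mixture: all samples treated as draws from the mixture.
    var_mixture = pooled.var()

    ess_proposals = n_total / (1.0 + var_proposals)
    ess_mixture = n_total / (1.0 + var_mixture)
    return ess_proposals, ess_mixture


# Toy setup (hypothetical): standard normal target, three shifted proposals.
rng = np.random.default_rng(0)
proposal_means = [-1.0, 0.5, 2.0]
sample_counts = [50, 30, 20]

samples = [rng.normal(m, 1.0, size=n) for m, n in zip(proposal_means, sample_counts)]
proposal_pdfs = [lambda x, m=m: gaussian_pdf(x, m, 1.0) for m in proposal_means]
target_pdf = lambda x: gaussian_pdf(x, 0.0, 1.0)

ess_proposals, ess_mixture = mis_ess_estimates(samples, target_pdf, proposal_pdfs)
print(f"ESS (variance under proposals): {ess_proposals:.2f}")
print(f"ESS (variance under mixture):   {ess_mixture:.2f}")
```

By the law of total variance, the pooled empirical variance equals the weighted within-proposal variance plus the weighted variance of the per-proposal means, so the mixture-based estimate can never exceed the proposal-based one in this sketch.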