Optimal Sequential Maximization: One Interview is Enough!

Maximum selection under probabilistic queries (probabilistic maximization) is a fundamental algorithmic problem that arises in numerous theoretical and practical contexts. We derive the first query-optimal sequential algorithm for probabilistic maximization. Departing from previous assumptions, the algorithm and its performance guarantees apply even to infinitely many items, and in particular do not require a priori knowledge of the number of items. The algorithm has linear query complexity and is optimal in the streaming setting as well. To derive these results, we consider a probabilistic setting in which several candidates for a position are asked multiple questions, with the goal of finding the candidate with the highest probability of answering interview questions correctly. Previous work minimized the total number of questions asked by alternating back and forth between the best-performing candidates, in effect inviting them to multiple interviews. We show that the same order-wise selection accuracy can be achieved by querying the candidates sequentially, never returning to a previously queried candidate. Hence one interview is enough!
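
To make the single-pass idea concrete, here is a minimal Python sketch of a sequential, never-look-back selection rule. It is illustrative rather than the paper's algorithm: candidates are modeled as hypothetical Bernoulli success probabilities, each newcomer gets a fixed Hoeffding-style query budget under an assumed union-bound confidence schedule delta/(t(t+1)), and the champion's score is simply frozen once measured.

```python
import math
import random


def query(p: float) -> int:
    """One probabilistic query: the candidate answers correctly with probability p."""
    return 1 if random.random() < p else 0


def sequential_max(candidates, epsilon: float = 0.05, delta: float = 0.05):
    """Single-pass selection sketch: interview each candidate once, never return.

    The current champion's empirical score is frozen; each newcomer is queried
    enough times (by Hoeffding's inequality) to estimate its success probability
    to within epsilon/2 with confidence 1 - delta_t, where the delta_t sum to
    delta over the whole stream (an illustrative schedule, not the paper's).
    """
    champion, champ_score = None, -1.0
    for t, p in enumerate(candidates, start=1):
        delta_t = delta / (t * (t + 1))              # union bound: sum_t delta_t <= delta
        n = math.ceil(2.0 / epsilon**2 * math.log(2.0 / delta_t))
        score = sum(query(p) for _ in range(n)) / n  # the candidate's single interview
        if score > champ_score:                      # newcomer replaces the champion
            champion, champ_score = t, score
    return champion, champ_score


if __name__ == "__main__":
    random.seed(0)
    best, score = sequential_max([0.4, 0.7, 0.55, 0.9, 0.3])
    print(f"selected candidate #{best} with empirical score {score:.2f}")
```

Because every candidate is interviewed exactly once, in stream order, the sketch applies to arbitrarily long (even unbounded) streams, mirroring the single-pass property the abstract highlights; the fixed per-candidate budget is a simplification and does not achieve the paper's linear query complexity.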
