Paper Information - The Geometry of Generalized Binary Search

The Geometry of Generalized Binary Search

This paper investigates the problem of determining a binary-valued function through a sequence of strategically selected queries. The focus is Generalized Binary Search (GBS), a well-known greedy algorithm for this problem. At each step, GBS selects the query that most evenly splits the hypotheses under consideration into two disjoint subsets, a natural generalization of the idea underlying classic binary search. The paper develops novel incoherence and geometric conditions under which GBS achieves the information-theoretically optimal query complexity; that is, given a collection of N hypotheses, GBS terminates with the correct function after no more than a constant times log N queries. Furthermore, a noise-tolerant version of GBS is developed that also achieves the optimal query complexity. These results are applied to learning halfspaces, a problem arising routinely in image processing and machine learning.
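The greedy splitting step described in the abstract can be illustrated with a minimal sketch for the noiseless case. It assumes a finite hypothesis class stored as rows of ±1 labels, one column per available query. The names `generalized_binary_search`, `threshold_hypotheses`, and `oracle` are hypothetical and chosen for illustration only; this is not the paper's noise-tolerant variant.

```python
from typing import Callable, List, Sequence, Tuple


def threshold_hypotheses(n: int) -> List[List[int]]:
    """Hypothetical toy class: h_k(x) = +1 if x >= k else -1, for k, x in 0..n-1."""
    return [[1 if x >= k else -1 for x in range(n)] for k in range(n)]


def generalized_binary_search(
    hypotheses: Sequence[Sequence[int]],
    oracle: Callable[[int], int],
) -> Tuple[List[int], int]:
    """Noiseless greedy search; returns (indices of surviving hypotheses, queries used).

    Assumption: hypotheses[i][x] is the +1/-1 label that hypothesis i assigns
    to query x, and oracle(x) returns the true label for query x.
    """
    if not hypotheses:
        return [], 0
    num_queries = len(hypotheses[0])
    viable = list(range(len(hypotheses)))
    queries_used = 0

    def imbalance(x: int) -> int:
        # |sum of labels| is 0 for a perfectly even split of the viable set.
        return abs(sum(hypotheses[i][x] for i in viable))

    while len(viable) > 1:
        # Greedy step: pick the query that most evenly splits the viable set.
        best_x = min(range(num_queries), key=imbalance)
        if imbalance(best_x) == len(viable):
            # No query separates the remaining hypotheses; stop.
            break
        label = oracle(best_x)
        # Keep only the hypotheses that agree with the observed label.
        viable = [i for i in viable if hypotheses[i][best_x] == label]
        queries_used += 1
    return viable, queries_used


# Usage: eight threshold functions, with the target at index 5.
H = threshold_hypotheses(8)
found, used = generalized_binary_search(H, lambda x: H[5][x])
print(found, used)
```

For this toy class every selected query halves the viable set, so the target is isolated in three queries, which matches log2(8). General hypothesis classes need not admit such even splits, which is where the incoherence and geometric conditions of the paper come in.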

Robert D. Nowak

[1] R. Buck. Partition of Space, 1943.

[2] Michael Horstein, et al. Sequential transmission using noiseless feedback, 1963, IEEE Trans. Inf. Theory.

[3] A. Rényi. On the Foundations of Information Theory, 1965.

[4] R. Rhine. Some problems in dissonance theory research on information selectivity, 1967, Psychological bulletin.

[5] A. Rényi, et al. Selected papers of Alfréd Rényi, 1976.

[6] Ronald L. Rivest, et al. Constructing Optimal Binary Decision Trees is NP-Complete, 1976, Inf. Process. Lett.

[7] Joel H. Spencer, et al. Coping with Errors in Binary Search Procedures, 1980, J. Comput. Syst. Sci.

[8] Henryk Wozniakowski, et al. Information-based complexity, 1987, Nature.

[9] Padhraic Smyth, et al. Decision tree design from a communication theory standpoint, 1988, IEEE Trans. Inf. Theory.

[10] H. Wozniakowski. Information-Based Complexity, 1988.

[11] Dana Angluin, et al. Queries and concept learning, 1988, Machine Learning.

[12] Javed A. Aslam, et al. Searching in the presence of linearly bounded errors, 1991, STOC '91.

[13] Peter Winkler, et al. On playing “Twenty Questions” with a liar, 1992, SODA '92.

[14] Joel H. Spencer, et al. Ulam's Searching Game with a Fixed Number of Lies, 1992, Theor. Comput. Sci.

[15] Steven Skiena, et al. Decision trees for geometric models, 1993, SCG '93.

[16] Eli Upfal, et al. Computing with Noisy Information, 1994, SIAM J. Comput.

[17] Tibor Hegedüs, et al. Generalized Teaching Dimensions and the Query Complexity of Learning, 1995, COLT.

[18] Lisa Hellerstein, et al. How many queries are needed to learn?, 1995, JACM.

[19] Donald Geman, et al. An Active Testing Model for Tracking Roads in Satellite Images, 1996, IEEE Trans. Pattern Anal. Mach. Intell.

[20] Teresa M. Przytycka, et al. On an Optimal Split Tree Problem, 1999, WADS.

[21] A. Korostelev. On minimax rates of convergence in image models under sequential design, 1999.

[22] Vladimir N. Vapnik, et al. The Nature of Statistical Learning Theory, 2000, Statistics for Engineering and Information Science.

[23] John Odentrantz, et al. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues, 2000, Technometrics.

[24] A. Korostelev, et al. Rates of convergence for the sup-norm risk in image models under sequential designs, 2000.

[25] Andrzej Pelc, et al. Searching games with errors - fifty years of coping with liars, 2002, Theor. Comput. Sci.

[26] H. Sebastian Seung, et al. Selective Sampling Using the Query by Committee Algorithm, 1997, Machine Learning.

[27] John N. Tsitsiklis, et al. Active Learning Using Arbitrary Binary Valued Queries, 1993, Machine Learning.

[28] Ronald L. Graham, et al. Performance bounds on the splitting algorithm for binary testing, 1974, Acta Informatica.

[29] Dana Angluin. Queries revisited, 2004, Theor. Comput. Sci.

[30] Sanjoy Dasgupta, et al. Analysis of a greedy active learning strategy, 2004, NIPS.

[31] Donald W. Loveland. Performance bounds for binary testing with arbitrary weights, 2004, Acta Informatica.

[32] Michael J. Swain, et al. Promising directions in active vision, 1993, International Journal of Computer Vision.

[33] Sanjoy Dasgupta, et al. Coarse sample complexity bounds for active learning, 2005, NIPS.

[34] Emmanuel J. Candès, et al. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, 2004, IEEE Transactions on Information Theory.

[35] Matti Kääriäinen, et al. Active Learning in the Non-realizable Case, 2006, ALT.

[36] Richard M. Karp, et al. Noisy binary search and its applications, 2007, SODA '07.

[37] Maria-Florina Balcan, et al. Margin Based Active Learning, 2007, COLT.

[38] Mark Burgin, et al. Foundations of Information Theory, 2008, ArXiv.

[39] R. Nowak, et al. Generalized binary search, 2008, 2008 46th Annual Allerton Conference on Communication, Control, and Computing.

[40] Robert D. Nowak, et al. Minimax Bounds for Active Learning, 2007, IEEE Transactions on Information Theory.

[41] Ronald L. Rivest, et al. Introduction to Algorithms, third edition, 2009.