Bandits Games and Clustering Foundations
[1] Rémi Munos,et al. Pure exploration in finitely-armed and continuous-armed bandits , 2011, Theor. Comput. Sci..
[2] Robert D. Kleinberg,et al. Regret bounds for sleeping experts and bandits , 2010, Machine Learning.
[3] R. Munos,et al. Best Arm Identification in Multi-Armed Bandits , 2010, COLT.
[4] John N. Tsitsiklis,et al. Linearly Parameterized Bandits , 2008, Math. Oper. Res..
[5] J. Rissanen. Minimum Description Length Principle , 2010, Encyclopedia of Machine Learning.
[6] Nicolò Cesa-Bianchi,et al. Combinatorial Bandits , 2012, COLT.
[7] Ulrike von Luxburg,et al. Nearest Neighbor Clustering: A Baseline Method for Consistent Clustering with Arbitrary Objective Functions , 2009, J. Mach. Learn. Res..
[8] Rémi Munos,et al. Pure Exploration in Multi-armed Bandits Problems , 2009, ALT.
[9] Aurélien Garivier,et al. Regret Bounds for Opportunistic Channel Access , 2009, ArXiv.
[10] Jean-Yves Audibert,et al. Minimax Policies for Adversarial and Stochastic Bandits. , 2009, COLT 2009.
[11] Csaba Szepesvári,et al. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits , 2009, Theor. Comput. Sci..
[12] Jean-Yves Audibert,et al. Minimax Policies for Bandits Games , 2009, COLT 2009.
[13] Alessandro Lazaric,et al. Hybrid Stochastic-Adversarial On-line Learning , 2009, COLT.
[14] Ohad Shamir,et al. On the Reliability of Clustering Stability in the Large Sample Regime , 2008, NIPS.
[15] Filip Radlinski,et al. Mortal Multi-Armed Bandits , 2008, NIPS.
[16] Peter Auer,et al. Near-optimal Regret Bounds for Reinforcement Learning , 2008, J. Mach. Learn. Res..
[17] Rémi Munos,et al. Algorithms for Infinitely Many-Armed Bandits , 2008, NIPS.
[18] Csaba Szepesvári,et al. Online Optimization in X-Armed Bandits , 2008, NIPS.
[19] Rémi Munos,et al. Optimistic Planning of Deterministic Systems , 2008, EWRL.
[20] H. Jaap van den Herik,et al. Progressive Strategies for Monte-Carlo Tree Search , 2008 .
[21] Varun Grover,et al. Active Learning in Multi-armed Bandits , 2008, ALT.
[22] Yngvi Björnsson,et al. Simulation-Based Approach to General Game Playing , 2008, AAAI.
[23] David Silver,et al. Achieving Master Level Play in 9 × 9 Computer Go , 2008, AAAI.
[24] Csaba Szepesvári,et al. Empirical Bernstein stopping , 2008, ICML '08.
[25] Anthony K. H. Tung,et al. Estimating local optimums in EM algorithm over Gaussian mixture model , 2008, ICML '08.
[26] Shai Ben-David,et al. Relating Clustering Stability to Properties of Cluster Boundaries , 2008, COLT.
[27] Eli Upfal,et al. Adapting to a Changing Environment: the Brownian Restless Bandits , 2008, COLT.
[28] Qing Zhao,et al. A Restless Bandit Formulation of Opportunistic Access: Indexablity and Index Policy , 2008, 2008 5th IEEE Annual Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks Workshops.
[29] Aurélien Garivier,et al. On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems , 2008, 0805.3415.
[30] Eli Upfal,et al. Multi-Armed Bandits in Metric Spaces , 2008 .
[31] Mikhail Belkin,et al. Consistency of spectral clustering , 2008, 0804.0678.
[32] Naftali Tishby,et al. Model Selection and Stability in k-means Clustering , 2008, Annual Conference Computational Learning Theory.
[33] Elad Hazan,et al. Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization , 2008, COLT.
[34] Thomas P. Hayes,et al. Stochastic Linear Optimization under Bandit Feedback , 2008, COLT.
[35] Ulrike von Luxburg,et al. Consistent Minimization of Clustering Objective Functions , 2007, NIPS.
[36] Ohad Shamir,et al. Cluster Stability for Finite Samples , 2007, NIPS.
[37] Nan Rong,et al. What makes some POMDP problems easy to approximate? , 2007, NIPS.
[38] Ulrike von Luxburg,et al. A tutorial on spectral clustering , 2007, Stat. Comput..
[39] Stefanie Jegelka. Statistical Learning Theory Approaches to Clustering , 2007 .
[40] Deepayan Chakrabarti,et al. Multi-armed bandit problems with dependent arms , 2007, ICML '07.
[41] David Silver,et al. Combining online and offline knowledge in UCT , 2007, ICML '07.
[42] Shai Ben-David,et al. Stability of k -Means Clustering , 2007, COLT.
[43] Peter Auer,et al. Improved Rates for the Stochastic Continuum-Armed Bandit Problem , 2007, COLT.
[44] Sanjoy Dasgupta,et al. A Probabilistic Analysis of EM for Mixtures of Separated, Spherical Gaussians , 2007, J. Mach. Learn. Res..
[45] Tamás Linder,et al. The On-Line Shortest Path Problem Under Partial Monitoring , 2007, J. Mach. Learn. Res..
[46] Rémi Munos,et al. Bandit Algorithms for Tree Search , 2007, UAI.
[47] Sergei Vassilvitskii,et al. k-means++: the advantages of careful seeding , 2007, SODA '07.
[48] Artur Czumaj,et al. Sublinear‐time approximation algorithms for clustering via random sampling , 2007, Random Struct. Algorithms.
[49] Shai Ben-David,et al. A framework for statistical clustering with constant time approximation algorithms for K-median and K-means clustering , 2007, Machine Learning.
[50] R. Ostrovsky,et al. The Effectiveness of Lloyd-Type Methods for the k-Means Problem , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).
[51] Peter Auer,et al. Hannan Consistency in On-Line Learning in Case of Unbounded Losses Under Partial Monitoring , 2006, ALT.
[52] Stephen F. Smith,et al. A Simple Distribution-Free Approach to the Max k-Armed Bandit Problem , 2006, CP.
[53] Csaba Szepesvári,et al. Bandit Based Monte-Carlo Planning , 2006, ECML.
[54] Stephen F. Smith,et al. An Asymptotically Optimal Algorithm for the Max k-Armed Bandit Problem , 2006, AAAI.
[55] Gregory Shakhnarovich,et al. An investigation of computational and informational limits in Gaussian mixture clustering , 2006, ICML '06.
[56] Shai Ben-David,et al. A Sober Look at Clustering Stability , 2006, COLT.
[57] M. Newman,et al. Finding community structure in networks using the eigenvectors of matrices. , 2006, Physical review. E, Statistical, nonlinear, and soft matter physics.
[58] K. Schlag. ELEVEN - Tests needed for a Recommendation , 2006 .
[59] Olivier Teytaud,et al. Modification of UCT with Patterns in Monte-Carlo Go , 2006 .
[60] Q. Zhao,et al. Decentralized cognitive MAC for dynamic spectrum access , 2005, First IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks, 2005. DySPAN 2005..
[61] Gilles Stoltz. Incomplete information and internal regret in prediction of individual sequences , 2005 .
[62] H. Vincent Poor,et al. Bandit problems with side observations , 2005, IEEE Transactions on Automatic Control.
[63] Gábor Lugosi,et al. Minimizing regret with label efficient prediction , 2004, IEEE Transactions on Information Theory.
[64] U. V. Luxburg,et al. Towards a Statistical Theory of Clustering , 2005 .
[65] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[66] Robert D. Kleinberg. Nearly Tight Bounds for the Continuum-Armed Bandit Problem , 2004, NIPS.
[67] Russell Greiner,et al. The Budgeted Multi-armed Bandit Problem , 2004, COLT.
[68] Avrim Blum,et al. Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary , 2004, COLT.
[69] Baruch Awerbuch,et al. Adaptive routing with end-to-end feedback: distributed learning and geometric approaches , 2004, STOC '04.
[70] Joachim M. Buhmann,et al. Stability-Based Validation of Clustering Solutions , 2004, Neural Computation.
[71] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[72] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[73] John N. Tsitsiklis,et al. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem , 2004, J. Mach. Learn. Res..
[74] Sham M. Kakade,et al. On the sample complexity of reinforcement learning. , 2003 .
[75] Peter Auer,et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..
[76] Shie Mannor,et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes , 2002, COLT.
[77] Anil K. Jain,et al. Unsupervised Learning of Finite Mixture Models , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[78] Irini Angelidaki,et al. Anaerobic digestion model No. 1 (ADM1) , 2002 .
[79] Leonard Pitt,et al. Sublinear time approximate clustering , 2001, SODA '01.
[80] Luc Devroye,et al. Combinatorial methods in density estimation , 2001, Springer series in statistics.
[81] Y. Freund,et al. The non-stochastic multi-armed bandit problem , 2001 .
[82] Santosh S. Vempala,et al. On clusterings-good, bad and spectral , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.
[83] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[84] Piotr Indyk,et al. Sublinear time algorithms for metric space problems , 1999, STOC '99.
[85] Stephen Guattery,et al. On the Quality of Spectral Separators , 1998, SIAM J. Matrix Anal. Appl..
[86] Joachim M. Buhmann,et al. Empirical Risk Approximation: An Induction Principle for Unsupervised Learning , 1998 .
[87] Adrian E. Raftery,et al. How Many Clusters? Which Clustering Method? Answers Via Model-Based Cluster Analysis , 1998, Comput. J..
[88] Nicolò Cesa-Bianchi,et al. Analysis of two gradient-based algorithms for on-line regression , 1997, COLT '97.
[89] Robert W. Chen,et al. Bandit problems with infinitely many arms , 1997 .
[90] Dean P. Foster,et al. Calibrated Learning and Correlated Equilibrium , 1997 .
[91] Shang-Hua Teng,et al. Spectral partitioning works: planar graphs and finite element meshes , 1996, Proceedings of 37th Conference on Foundations of Computer Science.
[92] Jon A. Wellner,et al. Weak Convergence and Empirical Processes: With Applications to Statistics , 1996 .
[93] László Györfi,et al. A Probabilistic Theory of Pattern Recognition , 1996, Stochastic Modelling and Applied Probability.
[94] R. Agrawal. Sample mean based index policies by O(log n) regret for the multi-armed bandit problem , 1995, Advances in Applied Probability.
[95] R. Agrawal. The Continuum-Armed Bandit Problem , 1995 .
[96] Nicolò Cesa-Bianchi,et al. Gambling in a rigged casino: The adversarial multi-armed bandit problem , 1995, Proceedings of IEEE 36th Annual Foundations of Computer Science.
[97] M. Inaba. Application of weighted Voronoi diagrams and randomization to variance-based k-clustering , 1994, SoCG.
[98] Yoshua Bengio,et al. Convergence Properties of the K-Means Algorithms , 1994, NIPS.
[99] Andrew W. Moore,et al. Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation , 1993, NIPS.
[100] Dorothea Wagner,et al. Between Min Cut and Graph Bisection , 1993, MFCS.
[101] David Haussler,et al. How to use expert advice , 1993, STOC.
[102] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[103] Manfred K. Warmuth,et al. The weighted majority algorithm , 1989, 30th Annual Symposium on Foundations of Computer Science.
[104] Colin McDiarmid,et al. Surveys in Combinatorics, 1989: On the method of bounded differences , 1989 .
[105] P. Whittle. Restless bandits: activity allocation in a changing world , 1988, Journal of Applied Probability.
[106] J. Hartigan. Statistical theory in clustering , 1985 .
[107] David B. Shmoys,et al. A Best Possible Heuristic for the k-Center Problem , 1985, Math. Oper. Res..
[108] Michael Randolph Garey,et al. The complexity of the generalized Lloyd-Max problem , 1982, IEEE Trans. Inf. Theory.
[109] J. Hartigan. Consistency of Single Linkage for High-Density Clusters , 1981 .
[110] D. Pollard. Strong Consistency of $K$-Means Clustering , 1981 .
[111] József Fritz,et al. Distribution-free exponential error bound for nearest neighbor pattern classification , 1975, IEEE Trans. Inf. Theory.
[112] D. Freedman. On Tail Probabilities for Martingales , 1975 .
[113] W. Hoeffding. Probability inequalities for sums of bounded random variables , 1963 .
[114] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[115] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[116] T. L. Lai, H. Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985, Adv. Appl. Math..