Bubeck's slides “Continuous Stochastic Optimization”, using Hierarchical Optimistic Optimization (HOO)
X-Armed Bandits (Bubeck, Munos, Stoltz & Szepesvári), the paper introducing HOO; a minimal sketch of HOO follows this list
Sample-Based Planning for Continuous Action Markov Decision Processes (Mansley, Weinstein & Littman)
Prediction, Learning, and Games (book by Cesa-Bianchi & Lugosi)
The Nonstochastic Multiarmed Bandit Problem (Auer, Cesa-Bianchi, Freund & Schapire)
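
For orientation, here is a minimal Python sketch of the HOO strategy that the Bubeck et al. references above describe: descend a cover tree along the largest B-values, play a point in the selected region, then back up the U- and B-values. The names (`Node`, `hoo_round`), the toy objective, and the parameter choices (`nu1`, `rho`) are illustrative assumptions, not code from any of the listed references.

```python
# A minimal sketch of HOO (Hierarchical Optimistic Optimization) for a noisy
# 1-D objective on [0, 1]. Names and parameter values are illustrative only.
import math
import random


class Node:
    """A node of the HOO cover tree; covers the interval [lo, hi]."""

    def __init__(self, lo, hi, depth):
        self.lo, self.hi, self.depth = lo, hi, depth
        self.count = 0           # T_{h,i}: times this node was traversed
        self.mean = 0.0          # empirical mean reward of plays in this region
        self.b_value = math.inf  # B-value; +inf until the node has been visited
        self.children = None     # created lazily when the node is expanded

    def expand(self):
        mid = (self.lo + self.hi) / 2.0
        self.children = (Node(self.lo, mid, self.depth + 1),
                         Node(mid, self.hi, self.depth + 1))


def hoo_round(root, objective, t, nu1=1.0, rho=0.5):
    """Play one round of HOO: optimistic descent, sample, then back up."""
    # 1) Walk down the tree, always following the child with the larger B-value.
    path, node = [root], root
    while node.children is not None:
        node = max(node.children, key=lambda c: c.b_value)
        path.append(node)

    # 2) Expand the selected leaf and play a point drawn from its region.
    node.expand()
    x = random.uniform(node.lo, node.hi)
    reward = objective(x)

    # 3) Update counts and means along the path, then recompute U- and B-values
    #    bottom-up: B = min(U, max of the children's B-values).
    for n in path:
        n.count += 1
        n.mean += (reward - n.mean) / n.count
    for n in reversed(path):
        u = n.mean + math.sqrt(2.0 * math.log(t) / n.count) + nu1 * rho ** n.depth
        n.b_value = min(u, max(c.b_value for c in n.children))
    return x, reward


if __name__ == "__main__":
    # Toy objective: maximum at x = 0.7, observed with additive Gaussian noise.
    f = lambda x: 1.0 - abs(x - 0.7) + random.gauss(0.0, 0.1)
    root = Node(0.0, 1.0, depth=0)
    xs = [hoo_round(root, f, t)[0] for t in range(1, 2001)]
    print("mean of last 100 plays:", sum(xs[-100:]) / 100)  # drifts toward 0.7
```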