Sleeping Experts

September 10, 2012 in Multi-Armed Bandit Problem by hundalhh | Permalink

In “Regret Bounds for Sleeping Experts and Bandits” Kleinberg, Niculescu-Mizil, and Sharma (2008) propose an algorithm for selecting among a list of experts some of whom may be sleeping. At each time step $t$, the algorithm ranks the experts in order of preference with a permutation vector $\sigma(t)$. The first analyzed algorithm is Follow The Awake Leader (FTAL) for the case where all of the expert predictions and rewards are observed. Next they explore a natural extension of upper confidence bound algorithm for the multi-armed bandit setting. In the adversarial setting, they prove that the regret is at least $O(\sqrt{t\, K \log K})$ for $K$ bandits at time $t$.

Artificial Intelligence Blog

Sleeping Experts

Categories

Archives

Subscribe to ArtEnt via Email

Artificial Intelligence Blog

Sleeping Experts

Related Posts via Categories

Categories

Archives

Subscribe to ArtEnt via Email