Artificial Intelligence Blog

We're blogging machines!


“From Bandits to Experts: On the Value of Side-Observations”

September 30, 2012 in Multi-Armed Bandit Problem by hundalhh

I was reading Machine Learning’s article “Coactive Learning”, which referred to the paper “From Bandits to Experts: On the Value of Side-Observations” by Mannor and Shamir (2011). That paper develops algorithms for the situation where, after choosing which bandit arm to pull, the learner also gets information about neighboring bandits. Recall that in the standard bandit setting the learner sees only the result of the arm it pulled, while in the mixture of experts situation the learner gets to see the results of all the experts (bandits) after choosing which arm to pull. Side-observations sit between these two extremes.
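
To make the feedback model concrete, here is a minimal Python sketch. It is not the algorithm from the paper. It is a generic exponential-weights learner with importance-weighted reward estimates, and I am assuming that the side-observations are described by a graph in which pulling an arm reveals that arm and its neighbors. All of the names (`observed_arms`, `sampling_probs`, `observation_probs`, `play`), the Bernoulli rewards, and the values of `eta` and `gamma` are hypothetical choices for illustration.

```python
import math
import random


def observed_arms(graph, pulled):
    """Arms revealed after pulling `pulled`: the arm itself plus its neighbors."""
    return {pulled} | set(graph[pulled])


def sampling_probs(log_weights, gamma):
    """Exponential-weights distribution mixed with uniform exploration."""
    n = len(log_weights)
    top = max(log_weights)  # subtract the max for numerical stability
    exp_w = [math.exp(w - top) for w in log_weights]
    total = sum(exp_w)
    return [(1 - gamma) * w / total + gamma / n for w in exp_w]


def observation_probs(graph, probs):
    """q[j]: probability that arm j's reward is revealed this round."""
    q = [0.0] * len(probs)
    for i, p_i in enumerate(probs):
        for j in observed_arms(graph, i):
            q[j] += p_i
    return q


def play(graph, means, rounds, eta=0.01, gamma=0.05, seed=0):
    """Run the sketch on Bernoulli arms (an assumption) and return pull counts."""
    rng = random.Random(seed)
    n = len(graph)
    log_weights = [0.0] * n
    pulls = [0] * n
    for _ in range(rounds):
        probs = sampling_probs(log_weights, gamma)
        q = observation_probs(graph, probs)
        arm = rng.choices(range(n), weights=probs)[0]
        pulls[arm] += 1
        # Update every revealed arm, not only the pulled one. Dividing by
        # q[j] keeps each reward estimate unbiased.
        for j in observed_arms(graph, arm):
            reward = 1.0 if rng.random() < means[j] else 0.0
            log_weights[j] += eta * reward / q[j]
    return pulls


# No edges: only the pulled arm is seen (bandit feedback).
bandit_graph = {0: [], 1: [], 2: []}
# Complete graph: every arm is seen each round (experts feedback).
experts_graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

print(play(bandit_graph, [0.2, 0.5, 0.8], rounds=2000))
print(play(experts_graph, [0.2, 0.5, 0.8], rounds=2000))
```

With the empty graph, `q[j]` equals the probability of pulling arm `j`, which is the usual bandit importance weight. With the complete graph, `q[j]` is 1 for every arm and the update becomes a full-information experts update. Sparser graphs fall in between.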
