This paper by Kuleshov and Precup gives practical advice and empirical evidence for multi-armed bandit (MAB) algorithms. They conduct several experiments in the standard MAB setting with $\epsilon$-greedy, upper confidence bound (UCB), and Boltzmann (softmax) learning algorithms. Because their time horizon is only 1000 pulls, the $\epsilon$-greedy and Boltzmann algorithms generally did better than the logarithmic-regret algorithms such as UCB, which tend to over-explore on short horizons.
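
To make the comparison concrete, here is a minimal sketch (my own, not from the paper) of the three selection rules on a hypothetical 10-armed Bernoulli bandit over the same 1000-pull horizon. The arm count, parameter settings ($\epsilon = 0.1$, temperature $\tau = 0.1$), and reward distributions are assumptions for illustration, not the paper's experimental setup.

```python
# Sketch of epsilon-greedy, UCB1, and Boltzmann selection on a toy Bernoulli bandit.
import numpy as np

rng = np.random.default_rng(0)

def run_bandit(select_arm, n_arms=10, horizon=1000, n_runs=200):
    """Average total reward of a selection rule over random Bernoulli bandits."""
    totals = []
    for _ in range(n_runs):
        probs = rng.uniform(size=n_arms)     # hidden success probability of each arm
        counts = np.zeros(n_arms)            # pulls per arm
        values = np.zeros(n_arms)            # empirical mean reward per arm
        total = 0.0
        for t in range(1, horizon + 1):
            arm = select_arm(values, counts, t)
            reward = float(rng.random() < probs[arm])
            counts[arm] += 1
            values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
            total += reward
        totals.append(total)
    return np.mean(totals)

def epsilon_greedy(eps=0.1):
    def select(values, counts, t):
        if rng.random() < eps:
            return int(rng.integers(len(values)))   # explore uniformly at random
        return int(np.argmax(values))               # exploit current best estimate
    return select

def ucb1(values, counts, t):
    # Pull each arm once, then pick the arm maximizing mean + optimism bonus.
    if np.any(counts == 0):
        return int(np.argmin(counts))
    return int(np.argmax(values + np.sqrt(2.0 * np.log(t) / counts)))

def boltzmann(tau=0.1):
    def select(values, counts, t):
        prefs = np.exp((values - values.max()) / tau)   # softmax over estimated values
        return int(rng.choice(len(values), p=prefs / prefs.sum()))
    return select

if __name__ == "__main__":
    for name, rule in [("eps-greedy(0.1)", epsilon_greedy(0.1)),
                       ("UCB1", ucb1),
                       ("Boltzmann(0.1)", boltzmann(0.1))]:
        print(f"{name:>16}: avg reward over 1000 pulls = {run_bandit(rule):.1f}")
```

With a horizon this short, the tuned heuristics often edge out UCB1 because UCB1 spends a large fraction of the 1000 pulls satisfying its worst-case exploration bonus; the exact ordering depends on the parameter choices above.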