I love this diagram created by Peekaboo: Andy’s Computer Vision and Machine Learning Blog.

We're blogging machines!

You are currently browsing the monthly archive for **March 2013**.

In “An Empirical Evaluation of Thompson Sampling” (NIPS 2011), Olivier Chapelle and Lihong Li compare Upper Confidence Bound (UCB), Gittins index, and Thompson sampling methods for the multi-armed bandit problem. More theoretical analysis has been done for UCB and Gittins indices, but Thompson sampling is simple (unlike Gittins) and seems to work well, sometimes performing significantly better than the other common algorithms. Both synthetic and real-world (advertising) results are presented.
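Thompson sampling is simple enough to fit in a few lines. Here is a minimal sketch for Bernoulli bandits (the ad click-through setting from the paper), with made-up arm probabilities and a Beta(1, 1) prior on each arm; the specific rates and horizon are illustrative, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Bernoulli arms; the learner does not know these rates.
true_rates = [0.04, 0.05, 0.07]
n_arms = len(true_rates)

# Beta(1, 1) priors: track successes and failures per arm.
successes = np.ones(n_arms)
failures = np.ones(n_arms)

total_reward = 0
for t in range(10_000):
    # Draw one sample from each arm's posterior; play the best sample.
    theta = rng.beta(successes, failures)
    arm = int(np.argmax(theta))
    reward = rng.random() < true_rates[arm]
    total_reward += reward
    # Conjugate posterior update for the played arm.
    successes[arm] += reward
    failures[arm] += 1 - reward
```

The randomness of the posterior draws is what balances exploration and exploitation: arms with uncertain estimates occasionally produce the largest sample and get played, while play concentrates on the best arm as its posterior sharpens.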

Thanks to Nuit Blanche for pointing me towards the paper “Exact Sparse Recovery with L0 Projections” by Ping Li and Cun-Hui Zhang (2013). The paper describes an algorithm, Min+Gap(3), that exactly recovers $x\in R^N$ from $M_0$ random measurements $y_i\in R$. The vector $y$ is generated by

$y = S x$

where $S\in R^{M_0\times N},\ ||x||_0 = K,\ K\ll N,\ s_{ij} \sim S(\alpha, 1)$, and $S(\alpha, 1)$ denotes an $\alpha$-stable distribution with unit scale. Typically the number of measurements required for exact recovery is “about $5 K$”. More specifically,

“$M_0 = K \log ((N − K)/\delta)$ measurements are sufficient for detecting all zeros with at least probability $1 − \delta$.”
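The measurement model is easy to set up numerically. The sketch below uses hypothetical sizes ($N = 1000$, $K = 10$) and $\alpha = 1$, for which the symmetric $\alpha$-stable law with unit scale is just the standard Cauchy distribution, so we can sample it directly; it only builds the measurements $y = Sx$, not the Min+Gap(3) decoder.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, alpha = 1000, 10, 1.0   # illustrative sizes, not the paper's experiments
delta = 0.05
# Measurement count from the quoted bound.
M0 = int(K * np.log((N - K) / delta))

# K-sparse signal x with a random support.
x = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x[support] = rng.normal(size=K)

# For alpha = 1 the symmetric stable law S(1, 1) is the standard Cauchy,
# so the i.i.d. measurement matrix can be drawn directly.
S = rng.standard_cauchy((M0, N))
y = S @ x
```

Note how modest $M_0$ is here: roughly $K \log((N-K)/\delta) \approx 99$ measurements for a $1000$-dimensional signal, in line with the "about $5K$" rule of thumb.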

Li and Zhang illustrate the effectiveness of their algorithm by applying it to a simple video stream, exactly recovering the video from measurements constructed from the differences between frames. They also discuss an application to finding “heavy hitters” (“elephant detection”) in marketing data (see, e.g., “Compressive Sensing Medium Access Control for Wireless LANs”, Globecom 2012, and “A Data Streaming Algorithm for Estimating Entropies of OD Flows”, IMC, San Diego, CA, 2007).

The idea for the algorithm seems to have stemmed from Piotr Indyk’s paper “Stable distributions, pseudorandom generators, embeddings, and data stream computation.” Li and Zhang state that Indyk estimated “the $\alpha$-th frequency moment $\sum_{i=1}^N |x_i|^\alpha$” using random measurements generated from an $\alpha$-stable distribution, rather than tackling the harder problem of recovering $x$ exactly.

In numerical simulations, they compare their algorithm to orthogonal matching pursuit and linear programming. Additionally, detailed proofs of their algorithm’s effectiveness are provided.
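Min+Gap(3) itself is more involved, but orthogonal matching pursuit, one of the baselines in their comparison, is short enough to sketch. This is a generic noiseless OMP with hypothetical problem sizes, not the paper's experimental setup.

```python
import numpy as np

def omp(A, y, k):
    """Greedy OMP: recover a k-sparse x from noiseless y = A x."""
    residual = y.copy()
    chosen = []
    for _ in range(k):
        # Select the column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        chosen.append(j)
        # Least-squares fit restricted to the selected columns.
        coef, *_ = np.linalg.lstsq(A[:, chosen], y, rcond=None)
        residual = y - A[:, chosen] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[chosen] = coef
    return x_hat

rng = np.random.default_rng(0)
m, n, k = 60, 200, 5                       # illustrative sizes
A = rng.normal(size=(m, n)) / np.sqrt(m)   # Gaussian sensing matrix
x = np.zeros(n)
supp = rng.choice(n, size=k, replace=False)
x[supp] = rng.normal(size=k)
x_hat = omp(A, A @ x, k)
```

With a Gaussian sensing matrix and these sizes, OMP typically recovers the support exactly, after which the final least-squares step reproduces $x$ to machine precision.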

Thanks to Nuit Blanche for pointing me towards the presentation by Andrea Montanari, “Collaborative Filtering: Models and Algorithms”, and the associated Deshpande and Montanari paper “Linear Bandits in High Dimension and Recommendation Systems” (2012). In the presentation, Montanari reviews Spectral, Gradient Descent, Stochastic Gradient Descent, Convex Relaxation, and Linear Bandit methods for approximating the standard linear model for recommendation systems, along with some accuracy guarantees.

Assuming the $j$th movie has features $v_{j1}, v_{j2}, \ldots, v_{jr}$, the $i$th viewer gives the rating

$R_{ij} = \langle u_i, v_j \rangle +\epsilon_{ij}$

where $u_i$ is an $r$-dimensional vector representing the preferences of the $i$th viewer and $\epsilon_{ij}$ is Gaussian noise.
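The model is easy to simulate. A minimal sketch, with hypothetical sizes (50 users, 40 movies, rank $r = 5$) and noise level chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, r = 50, 40, 5   # illustrative sizes
sigma = 0.1                        # noise standard deviation

U = rng.normal(size=(n_users, r))    # rows are user preference vectors u_i
V = rng.normal(size=(n_movies, r))   # rows are movie feature vectors v_j

# R_ij = <u_i, v_j> + eps_ij with i.i.d. Gaussian noise.
R = U @ V.T + sigma * rng.normal(size=(n_users, n_movies))
```

The rating matrix $R$ is then (noisy) rank-$r$, which is what the spectral and convex-relaxation methods in the presentation exploit.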

The paper introduces a new Linear Bandit method, Smooth Explore, better suited to recommendation systems. The method is motivated by three objectives:

- Constant-optimal cumulative reward,
- Constant-optimal regret, and
- Approximate monotonicity (rewards approximately increase with time).

Smooth Explore estimates the user preferences vectors with a regularized least squares regression. Proofs of optimality and numerical results are provided.
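The regularized least squares (ridge) step can be sketched as follows for a single user, assuming the movie features are known; the sizes and regularization weight are hypothetical, and this shows only the estimation step, not the full Smooth Explore policy.

```python
import numpy as np

rng = np.random.default_rng(3)
n_movies, r = 200, 5      # illustrative sizes
lam, sigma = 1.0, 0.1     # ridge weight and noise level (assumed)

V = rng.normal(size=(n_movies, r))   # known movie feature vectors
u_true = rng.normal(size=r)          # unknown user preference vector
ratings = V @ u_true + sigma * rng.normal(size=n_movies)

# Ridge estimate: u_hat = (V^T V + lam I)^{-1} V^T ratings
u_hat = np.linalg.solve(V.T @ V + lam * np.eye(r), V.T @ ratings)
```

The regularizer keeps the estimate well-behaved early on, when a user has rated only a few movies and $V^T V$ is poorly conditioned; with many ratings, $\hat u$ converges to the ordinary least squares solution.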

This Russian rover traveled over 20 miles on the moon in 1973!