I’m rather fascinated by the HOO algorithm. It seems to be a combination of simulated annealing, branch and bound, the simplex method, and upper confidence trees. I’m going to code it up and try it on a few problems.
In “The Weighted Majority Algorithm” (1994) Littlestone and Warmuth describe an algorithm for combining several predictors into a better predictor. The algorithm is very similar to AdaBoost (Freund and Schapire 1995).
Assume that there is an observed sequence of zeros and ones and a series of Turing machines which also output zeros and ones. Suppose that we want to predict the $i$-th value of the observed sequence given the first $i-1$ values of the observed sequence and the first $i$ outputs of each Turing machine. The weighted majority algorithm works by assigning an initial weight of $1$ to each Turing machine and then multiplying the weight of a Turing machine by $\beta$, a number between $0$ and $1$, every time the machine's output does not match the observed value. Then to predict the $i$-th value of the sequence, we 1) add up the weights of the Turing machines that predict $1$ and add up the weights of the $0$ predictors, and 2) predict the value associated with the larger sum of weights.
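In code, the prediction and update steps might look something like this (a minimal Python sketch; the function names are mine, not from the paper):

```python
def weighted_majority_predict(weights, predictions):
    # Add up the weights of the machines voting 1 and those voting 0,
    # then predict the bit with the larger total weight (ties broken
    # toward 0 here; the paper allows ties to be broken arbitrarily).
    weight_for_one = sum(w for w, p in zip(weights, predictions) if p == 1)
    weight_for_zero = sum(w for w, p in zip(weights, predictions) if p == 0)
    return 1 if weight_for_one > weight_for_zero else 0


def update_weights(weights, predictions, observed, beta):
    # Multiply the weight of every machine whose output did not match
    # the observed value by beta (0 < beta < 1); correct machines keep
    # their weight unchanged.
    return [w * beta if p != observed else w
            for w, p in zip(weights, predictions)]
```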
For example, suppose that $\beta= 0.5$, the observed sequence is $01001000100001$, and there are three Turing machines which output $101010101$, $001001001$, and $101101110$ respectively.
For the first observation the weights are $1$, $1$, and $1$. The predictions are $1$, $0$, and $1$ respectively, so the weighted majority prediction is $1$, which is wrong!
The first and third Turing machines were wrong, so the new weights are $0.5$, $1.0$, and $0.5$. The next predicted values are $0$, $0$, and $0$, so the prediction is $0$, which is also wrong since the second observed value is $1$.
All of the machines were wrong, so the new weights are $0.25$, $0.5$, and $0.25$. The third predicted values are $1$, $1$, and $1$. Once again, they are all wrong.
All of the machines were wrong, so the new weights are $0.125$, $0.25$, and $0.125$. The fourth predictions are $0$, $0$, and $1$. The weighted majority prediction is $0$ which is correct!
The new weights are $0.125$, $0.25$, and $0.0625$. The predictions are $1$, $0$, and $0$. The majority prediction is $0$ which is wrong.
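A short driver loop over the sketch above reproduces this trace (again, my own illustrative code, using only the first five values of each sequence):

```python
beta = 0.5
observed = [0, 1, 0, 0, 1]      # first five values of 01001000100001
machines = [
    [1, 0, 1, 0, 1],            # first five outputs of 101010101
    [0, 0, 1, 0, 0],            # first five outputs of 001001001
    [1, 0, 1, 1, 0],            # first five outputs of 101101110
]

weights = [1.0, 1.0, 1.0]
for i, y in enumerate(observed):
    predictions = [m[i] for m in machines]
    guess = weighted_majority_predict(weights, predictions)
    print(f"step {i + 1}: weights={weights}, prediction={guess},"
          f" {'correct' if guess == y else 'wrong'}")
    weights = update_weights(weights, predictions, y, beta)
```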
According to Littlestone and Warmuth, the number of mistakes made by the weighted majority algorithm is at most
$$\frac{\log n + m\log(1/\beta)}{\log\big(2/(1+\beta)\big)}$$
where $m$ is the number of mistakes made by the best-performing Turing machine and $n$ is the number of Turing machines. The bound is always at least $2m$: the coefficient of $m$, namely $\log(1/\beta)/\log\big(2/(1+\beta)\big)$, is bounded below by $2$ and approaches $2$ as $\beta$ approaches $1$, though the coefficient of the $\log n$ term grows without bound in that limit.
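As a quick sanity check on that last claim, here is a small script (the values of $n$ and $m$ are arbitrary, chosen only for illustration) that evaluates the bound and the coefficient of $m$ for several values of $\beta$:

```python
import math

def mistake_bound(n, m, beta):
    # The Littlestone-Warmuth mistake bound stated above.
    return (math.log(n) + m * math.log(1 / beta)) / math.log(2 / (1 + beta))

for beta in [0.1, 0.5, 0.9, 0.99]:
    # The coefficient of m alone: always at least 2, approaching 2
    # as beta approaches 1.
    m_coeff = math.log(1 / beta) / math.log(2 / (1 + beta))
    print(f"beta={beta}: bound(n=10, m=20) = {mistake_bound(10, 20, beta):.1f},"
          f" m-coefficient = {m_coeff:.3f}")
```

Running it shows the trade-off: pushing $\beta$ toward $1$ drives the $m$-coefficient down toward $2$, but inflates the $\log n$ term.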