February 2013


Check out

“Once human cognition is replaced, what else have we got? For the ultimate extreme example, imagine a robot that costs $5 to manufacture and can do everything you do, only better. You would be as obsolete as a horse.”




In the seminal paper “Gene Selection for Cancer Classification using Support Vector Machines”, Guyon, Weston, Barnhill, and Vapnik (2002) use Recursive Feature Elimination to find the genes that are most predictive of cancer. Recursive Feature Elimination repeatedly ranks the features and eliminates the lowest-ranked feature until only a small subset of the original features remains. Although several feature ranking methods were explored, the main method was a soft-margin SVM classifier, with which the authors found 8 key colon cancer genes out of 7000.
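The elimination loop described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the crude subgradient-descent trainer `train_linear_svm`, its hyperparameters, and the toy data are all assumptions made for the example. Ranking features by squared weight follows the usual SVM-based RFE criterion.

```python
def train_linear_svm(X, y, features, lam=0.01, epochs=200, lr=0.1):
    """Crude soft-margin linear SVM: subgradient descent on the
    L2-regularized hinge loss (a stand-in for a real SVM solver).
    Returns one weight per entry of `features`."""
    w = [0.0] * len(features)
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xi[f] for wj, f in zip(w, features)) + b
            if yi * score < 1:            # point violates the margin
                for j, f in enumerate(features):
                    grad_w[j] -= yi * xi[f] / n
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w


def recursive_feature_elimination(X, y, n_keep):
    """Repeatedly retrain, rank features by squared weight,
    and drop the lowest-ranked one until n_keep remain."""
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        w = train_linear_svm(X, y, features)
        worst = min(range(len(features)), key=lambda j: w[j] ** 2)
        del features[worst]
    return features


# Hypothetical toy data: only column 2 carries the class signal.
X = [
    [0.1, 0.5,  2.0, 0.3],
    [0.4, 0.2,  1.5, 0.1],
    [0.3, 0.4,  1.8, 0.2],
    [0.2, 0.3, -1.7, 0.3],
    [0.4, 0.5, -2.1, 0.1],
    [0.1, 0.2, -1.6, 0.2],
]
y = [1, 1, 1, -1, -1, -1]

print(recursive_feature_elimination(X, y, n_keep=1))
```

Retraining after every removal is what makes the procedure recursive: a feature's weight can change once a correlated feature is gone, so a single up-front ranking would not be equivalent.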

In “Temporal Autoencoding Restricted Boltzmann Machine”, Hausler and Susemihl explain how to train a deep belief RBM to learn to recognize patterns in sequences of inputs (mostly video). The resulting networks could recognize the patterns in human motion capture or the non-linear dynamics of a bouncing ball.

In “A Survey of Monte Carlo Tree Search Methods”, Browne, Powley, Whitehouse, Lucas, Cowling, Rohlfshagen, Tavener, Perez, Samothrakis, and Colton (2012) wrote an extensive review of the variations of Monte Carlo Tree Search (MCTS), referencing 240 previous papers. MCTS, and specifically upper confidence trees (UCT), was popularized by its unusual effectiveness in the game of Go. UCT significantly improved computer Go to the point where it is now competitive with professional Go players on small boards, but not on the standard 19×19 board. The paper updates and significantly extends the 2010 survey of MCTS for Go, “Current Frontiers in Computer Go”, by Rimmel, Teytaud, Lee, Yen, Wang, and Tsai.



“Monte Carlo Tree Search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm’s derivation, impart some structure on the many variations and enhancements that have been proposed, and summarise the results from the key game and non-game domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.”


“In Section 2, we present central concepts of AI and games, introducing notation and terminology that set the stage for MCTS. In Section 3, the MCTS algorithm and its key components are described in detail. Section 4 summarises the main variations that have been proposed. Section 5 considers enhancements to the tree policy, used to navigate and construct the search tree.  Section 6 considers other enhancements, particularly to simulation and backpropagation steps. Section 7 surveys the key applications to which MCTS has been applied, both in games and in other domains. In Section 8, we summarise the paper to give a snapshot of the state of the art in MCTS research, the strengths and weaknesses of the approach, and open questions for future research.  The paper concludes with two tables that summarise the many variations and enhancements of MCTS and the domains to which they have been applied.”
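As a rough illustration of the UCT idea mentioned above, here is a minimal sketch of the child-selection step only, using the common UCB1-style score (average reward plus an exploration bonus). The function names, the `(total_reward, visits)` representation, and the default exploration constant `c` are assumptions for this example, not notation taken from the survey.

```python
import math


def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1-style score: mean reward plus a bonus that shrinks
    as the child is visited more often."""
    if visits == 0:
        return math.inf  # always try unvisited children first
    mean = total_reward / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus


def uct_select(children, c=math.sqrt(2)):
    """children: list of (total_reward, visits) pairs.
    Returns the index of the child to descend into."""
    parent_visits = sum(visits for _, visits in children)
    return max(
        range(len(children)),
        key=lambda i: uct_score(children[i][0], children[i][1], parent_visits, c),
    )


# A heavily visited good child versus a barely explored one.
children = [(60.0, 100), (1.0, 2)]
print(uct_select(children))         # exploration bonus included
print(uct_select(children, c=0.0))  # pure exploitation
```

With `c = 0` the rule is purely greedy; larger values of `c` push the search toward rarely visited moves. A full MCTS loop would wrap this selection step with expansion, random simulation, and backpropagation of the result.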

“They’re probably going to render us extinct one day, so we might as well enjoy their servitude, while it lasts.”




Bengio and LeCun created this wonderful video on Deep Neural Networks. Any logical function can be represented by a neural net with 3 layers (one hidden layer; see, e.g., CNF). However, simple 4-level logical functions with a small number of nodes may require a large number of nodes in a 3-layer representation. They point to theorems showing that a k-level logical function can require an exponential number of nodes when represented by a (k-1)-level network. They go on to explain denoising autoencoders for the training of deep neural nets.
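A small example of the depth trade-off for AND/OR formulas is the parity function. Parity is my choice of illustration, not an example taken from the video, and the sketch uses DNF (the dual of CNF). A one-hidden-layer AND/OR representation needs one AND term per input on which the function is 1, and for parity none of these terms can be merged, so the count doubles with every extra input. A deep chain of two-input XOR gates needs only n - 1 gates.

```python
from itertools import product


def parity(bits):
    """1 if an odd number of inputs are on, else 0."""
    return sum(bits) % 2


def dnf_terms(f, n):
    """Inputs on which f is 1. Each is one full AND term (minterm)
    of a DNF, i.e. one hidden node in a 3-layer AND/OR network."""
    return [bits for bits in product((0, 1), repeat=n) if f(bits)]


def can_merge(a, b):
    """Two minterms combine into a shorter term only if they
    differ in exactly one input."""
    return sum(x != y for x, y in zip(a, b)) == 1


for n in range(2, 7):
    terms = dnf_terms(parity, n)
    mergeable = any(
        can_merge(a, b) for i, a in enumerate(terms) for b in terms[i + 1:]
    )
    print(f"n={n}: DNF terms={len(terms)}, mergeable={mergeable}, "
          f"XOR-chain gates={n - 1}")
```

This only illustrates the AND/OR gate setting. It is not a proof of the theorems mentioned in the video, and networks built from threshold units can behave differently.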

Check out Terence Tao’s wonderful post “Matrix identities as derivatives of determinant identities”.

Check out “Recent Algorithms Development and Faster Belief Propagation algorithms” by Igor Carron at the Nuit Blanche blog.


Sean J. Taylor writes a short, critical, amusing article about R, Python, JVM languages, Julia, Stata, SPSS, Matlab, Mathematica, and SAS.
