December 2014


Fernandez-Delgado, Cernadas, Barro, and Amorim tested 179 classifiers on 121 data sets and reported their results in “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?”  The classifiers were drawn from the following 17 families:

“discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regression splines and other methods”

from the Weka, Matlab, and R machine learning libraries.  The 121 datasets were drawn mostly from the UCI classification repository.

The overall result was that the random forest classifiers were best on average, followed by support vector machines, neural networks, and boosting ensembles.
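To make the comparison concrete, here is a minimal scikit-learn sketch (not the authors' benchmark code; the dataset and hyperparameters are chosen purely for illustration) that cross-validates a random forest, a support vector machine, and a small neural network on a single UCI-style dataset:

```python
# Illustrative sketch of the kind of comparison the paper runs at scale:
# a few classifier families scored by cross-validation on one dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

classifiers = {
    "random forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    "neural network": make_pipeline(StandardScaler(),
                                    MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000, random_state=0)),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross-validation accuracy
    print(f"{name:18s} mean accuracy = {scores.mean():.3f}")
```

The paper, of course, repeats this kind of evaluation over all 121 data sets and 179 classifiers before averaging.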

For more details, read the paper!

Christopher Clark and Amos Storkey wrote an interesting nine-page article titled “Teaching Deep Convolutional Neural Networks to Play Go”.  Their deep neural network correctly predicted the moves of experts on a 19×19 Go board about 44% of the time.  The previous record was 41%, set by Wistuba and Schmidt-Thieme in 2012.  Furthermore, the Clark-Storkey network was able to “consistently defeat the well-known Go program GNU Go.”  This is the first time that a neural network has performed nearly as well as one of the better hand-coded programs.  It is still not as good as the better UCT programs, but it chooses its moves much more quickly than they do.  I imagine that if there were a blitz version of computer Go, the Clark-Storkey AI might win a computer competition.

The article reviews other recent attempts to train a neural network to play Go.  The Clark-Storkey network resembled the Wistuba and Schmidt-Thieme network, but it had more 19×19 convolutional layers, and the authors added one fully connected layer at the top before the final move decision.  Also, known symmetries of the solution were hard-coded into the network.  Interestingly, they found that convolution seemed to be required.

“We briefly experimented with non-convolutional networks but found them to be much harder to train, often requiring more epochs of training and the use of approximate second order gradient descent methods, while getting worse results.”

Later, they describe their training methods and network architecture as follows:

“Networks were trained with mini-batch gradient descent with a batch size of 128, using a learning rate of 0.01 for 7 epochs, and 0.05 for 2 epochs which took about a day on a Nvidia GTX 780 GPU.”

“The best network had one convolutional layer with 64 7×7 filters, two convolutional layers with 64 5×5 filters, two layers with 48 5×5 filters, two layers with 32 5×5 filters, and one fully connected layer.”
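The quoted architecture translates fairly directly into code.  Here is a minimal PyTorch sketch of a network with that layer structure; it is only an illustration, not the authors' implementation.  The number of input feature planes, the zero-padding that keeps every layer at 19×19, and the treatment of move prediction as a 361-way classification are assumptions on my part.

```python
# Sketch of the layer structure quoted above (not the authors' code).
# Assumptions: a few binary input planes encoding the 19x19 position,
# zero-padding so each convolution preserves the 19x19 grid, and a final
# fully connected layer scoring all 361 board points as candidate moves.
import torch
import torch.nn as nn

class GoMovePredictor(nn.Module):
    def __init__(self, in_planes: int = 4):   # number of input planes is an assumption
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_planes, 64, kernel_size=7, padding=3), nn.ReLU(),  # 64 7x7 filters
            nn.Conv2d(64, 64, kernel_size=5, padding=2), nn.ReLU(),         # two layers of 64 5x5 filters
            nn.Conv2d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(64, 48, kernel_size=5, padding=2), nn.ReLU(),         # two layers of 48 5x5 filters
            nn.Conv2d(48, 48, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(48, 32, kernel_size=5, padding=2), nn.ReLU(),         # two layers of 32 5x5 filters
            nn.Conv2d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.classifier = nn.Linear(32 * 19 * 19, 19 * 19)   # one fully connected layer -> 361 move scores

    def forward(self, board: torch.Tensor) -> torch.Tensor:
        x = self.features(board)
        return self.classifier(x.flatten(1))   # logits over the 361 board points

# Training roughly as quoted: mini-batch gradient descent with batch size 128,
# treating expert move prediction as a classification problem.
net = GoMovePredictor()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
```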

They estimate that their AI would probably have a ranking near 4-5 kyu.

 

Mnih, Kavukcuoglu, Silver, Graves, Antonoglou, Wierstra, and Riedmiller authored the paper “Playing Atari with Deep Reinforcement Learning”, which describes an Atari game-playing program created by the company DeepMind (recently acquired by Google). The AI did not just learn how to play one game. It learned to play seven Atari games without game-specific direction from the programmers. (The same learning parameters, neural network topology, and algorithm were used for every game.)

The Atari 2600 gaming system was quite popular in the late 1970s and early 1980s. The games ran on a console with only 128 bytes of RAM and a 210 x 160 pixel display with 128 colors. Various machine learning techniques have been applied to the old Atari games using the Arcade Learning Environment, which precisely reproduces the Atari 2600 gaming system. (See e.g. “An Object-Oriented Representation for Efficient Reinforcement Learning” by Diuk, Cohen, and Littman 2008; “HyperNEAT-GGP: A HyperNEAT-based Atari General Game Player” by Hausknecht, Khandelwal, Miikkulainen, and Stone 2012; “Application of TEXPLORE on Atari Games” by Shun Zhang; “A Neuroevolution Approach to General Atari Game Playing” by Hausknecht, Lehman, Miikkulainen, and Stone 2014; and “Replicating the Paper ‘Playing Atari with Deep Reinforcement Learning’” by Korjus, Kuzovkin, Tampuu, and Pungas 2014.)

Various papers have been written on how computers can learn to play the Atari games, but most of them used abstract representations of the objects on the screen obtained from within the emulator. The Mnih et al. AI learned to play the games using only the raw 210 x 160 video and the score. It seems to be the first successful attempt to learn arcade gaming from raw video.

To learn from raw video, they first converted the video to grayscale and then downsampled and cropped it to 84 x 84 images. The last four frames were used to determine each action. The 28224 input pixels were run through two hidden convolutional neural network layers and one fully connected (no convolution) 256-node hidden layer, with a single output for each possible action. Training was done with stochastic gradient descent using random samples drawn from a historical database of previous games played by the AI in order to improve convergence. (This technique, known as “experience replay”, is described in “Reinforcement Learning for Robots Using Neural Networks” by Long-Ji Lin, 1993.)
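Here is a rough PyTorch sketch of that pipeline, again only an illustration rather than DeepMind's code. The post pins down the 84 x 84 grayscale input, the stack of four frames, the two convolutional layers, the 256-node fully connected layer, and one output per action; the particular filter sizes, strides, and grayscale conversion below are my own assumptions.

```python
# Sketch of the input pipeline and network shape described above (not DeepMind's code).
# Filter sizes and strides are illustrative assumptions; only the overall shape
# (84x84x4 input, two conv layers, 256-unit hidden layer, one output per action)
# comes from the description above.
import numpy as np
import torch
import torch.nn as nn

def preprocess(frame_rgb: np.ndarray) -> torch.Tensor:
    """Convert a 210x160x3 Atari frame to an 84x84 grayscale tensor."""
    gray = torch.from_numpy(frame_rgb).float().mean(dim=2)   # crude grayscale, 210x160
    gray = gray.unsqueeze(0).unsqueeze(0)                    # add batch and channel dims
    return nn.functional.interpolate(gray, size=(84, 84), mode="bilinear",
                                     align_corners=False)[0, 0]

class AtariQNetwork(nn.Module):
    """Maps a stack of the last four 84x84 frames to one Q-value per action."""
    def __init__(self, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, kernel_size=8, stride=4), nn.ReLU(),   # assumed filter size/stride
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),  # assumed filter size/stride
            nn.Flatten(),
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),                  # 256-node fully connected layer
            nn.Linear(256, n_actions),                              # one output per possible action
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.net(frames)   # shape: (batch, n_actions)

# For experience replay, (state, action, reward, next_state) tuples from past play
# would be stored in a buffer and sampled at random to form training mini-batches.
```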

The objective function for supervised learning is usually a loss function representing the difference between the predicted label and the actual label. For these games the correct action is unknown, so reinforcement learning is used instead of supervised learning. The authors used a variant of Q-learning to train the weights in their neural network. They describe their algorithm in detail and compare it to several historical reinforcement algorithms, so this section of the paper can be used as a brief introduction to reinforcement learning.
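The heart of Q-learning is the temporal-difference update: the network's estimate Q(s, a) is pushed toward the target r + γ · max_a′ Q(s′, a′). Here is a sketch of one such update on a mini-batch drawn from the replay memory, continuing the hypothetical AtariQNetwork above; the discount factor, loss, and optimizer choices are illustrative rather than taken from the paper.

```python
# One Q-learning update on a mini-batch sampled from the replay memory
# (a sketch continuing the AtariQNetwork above; gamma, the squared-error loss,
# and the optimizer choice are illustrative assumptions).
import torch

def q_learning_step(q_net, optimizer, batch, gamma=0.99):
    states, actions, rewards, next_states, done = batch   # tensors sampled from the replay buffer

    # Q(s, a) for the actions actually taken
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # TD target: r + gamma * max_a' Q(s', a'), with no bootstrap on terminal states
    with torch.no_grad():
        next_q = q_net(next_states).max(dim=1).values
        targets = rewards + gamma * next_q * (1.0 - done)

    loss = torch.nn.functional.mse_loss(q_values, targets)  # squared TD error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                         # stochastic gradient descent step
    return loss.item()
```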

The AI was trained to play seven games: Beam Rider, Breakout, Enduro, Pong, Q*bert, Seaquest, and Space Invaders. In six of the seven games, this general game learning algorithm outperformed all previously known reinforcement learning algorithms tested on those games and “surpasses a human expert on three” of the seven games.