Tetris, Reinforcement Learning, The Cross-Entropy method

Tetris is hard, but it is good for your brain. So machine learners have written a number of papers about the application of reinforcement learning to Tetris. Donald Carr (2005) reviews a number of algorithms for Tetris including temporal difference learning, genetic algorithms (Llima, 2005), hand-tuned algorithms specific to Tetris (Dellacherie, Fahey).  Szita and Lörincz (2006) in “Learning Tetris Using the Noisy Cross-Entropy Method” use noise to prevent early suboptimal convergence of the fascinating cross entropy method (similar to particle swarm optimization).  Much like chess, the value of the current state of a Tetris game is estimated with a linear combination of 22 features (for more details, check out the seminal Tetris paper “Feature-Based Methods for Large Scale Dynamic Programming” by Bertsekas and Tsitsiklis 1996.) Their noisy CE method produced solutions almost as good at the best contemporaneous hand-tuned Tetris algorithms.