Games


In “Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent” (2012), Press and Dyson show that there are strategies, called “zero-determinant” (ZD) strategies, which “win” against so-called evolutionary strategies in the Prisoner’s Dilemma. An easy-to-read, less mathematical summary of the paper is given in “Extortion and cooperation in the Prisoner’s Dilemma” by Stewart and Plotkin. Fans of Tit-for-Tat need not fear, because Tit-for-Tat does not lose significantly against the ZD strategies. The idea behind ZD strategies is not very complicated. The ZD player cooperates enough that it is profitable to cooperate with the ZD player, but defects enough that 1) it is not profitable to defect against the ZD player, and 2) the ZD player wins more than its opponent when the opponent cooperates.
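Here is a minimal sketch of that idea, assuming the conventional payoff values (R, S, T, P) = (3, 0, 5, 1) and writing Press and Dyson's extortionate ZD strategy as a memory-one rule (the probability of cooperating after each joint outcome). With these particular numbers, a ZD extortioner averages a higher payoff than an unconditional cooperator, while against Tit-for-Tat both players drift to the mutual-defection payoff, which is why Tit-for-Tat has little to fear.

```python
import random

# Conventional Prisoner's Dilemma payoffs (R, S, T, P) = (3, 0, 5, 1);
# the specific numbers are an illustrative choice.
R, S, T, P = 3, 0, 5, 1

def extortion_probs(phi, chi):
    """Press-Dyson extortionate ZD strategy as a memory-one rule:
    probability of cooperating after each joint outcome (CC, CD, DC, DD),
    where the first letter is the ZD player's own previous move."""
    return (1 + phi * ((R - P) - chi * (R - P)),   # after CC
            1 + phi * ((S - P) - chi * (T - P)),   # after CD
            phi * ((T - P) - chi * (S - P)),       # after DC
            0.0)                                   # after DD

PAYOFF = {('C', 'C'): R, ('C', 'D'): S, ('D', 'C'): T, ('D', 'D'): P}
INDEX = {('C', 'C'): 0, ('C', 'D'): 1, ('D', 'C'): 2, ('D', 'D'): 3}

def play(zd_probs, opponent, rounds=100_000, seed=0):
    """Average per-round payoffs (ZD player, opponent). 'opponent' maps the
    ZD player's previous move to the opponent's next move."""
    rng = random.Random(seed)
    zd_move, opp_move = 'C', 'C'
    zd_total = opp_total = 0
    for _ in range(rounds):
        zd_total += PAYOFF[(zd_move, opp_move)]
        opp_total += PAYOFF[(opp_move, zd_move)]
        next_zd = 'C' if rng.random() < zd_probs[INDEX[(zd_move, opp_move)]] else 'D'
        next_opp = opponent(zd_move)
        zd_move, opp_move = next_zd, next_opp
    return zd_total / rounds, opp_total / rounds

zd = extortion_probs(phi=1/18, chi=2)   # the "Extort-2" strategy (8/9, 1/2, 1/3, 0)
print("vs Tit-for-Tat:      ", play(zd, lambda zd_last: zd_last))
print("vs always cooperate: ", play(zd, lambda zd_last: 'C'))
```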

I don’t play Magic, but if I did, this would be a cool thing to read.

In “Variance Reduction in Monte-Carlo Tree Search”, Veness, Lanctot, and Bowling (2011) “examine the application of some standard techniques for variance reduction in” Monte-Carlo Tree Search. The variance reduction techniques are control variates, common random numbers, and antithetic variates, applied to three games: Pig, Can’t Stop, and Dominion. After reviewing Markov decision processes and Upper Confidence Trees (UCT), they describe the variance reduction techniques and display charts showing the improvement in UCT for each technique and each game. The simplest game, Pig, gained the most from the variance reduction techniques, but all of the games gained from all of the techniques, except that no effective antithetic variate was found for Dominion.
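The paper applies these estimators inside UCT rollouts; the toy sketch below just illustrates two of the three techniques (antithetic variates and control variates) on a made-up one-dimensional “rollout return”, not on Pig, Can’t Stop, or Dominion. Common random numbers, the third technique, apply when comparing alternatives using the same stream of random draws.

```python
import random, statistics, math

def plain(f, n, rng):
    """Ordinary Monte Carlo estimate of E[f(U)], U ~ Uniform(0, 1)."""
    return statistics.fmean(f(rng.random()) for _ in range(n))

def antithetic(f, n, rng):
    """Antithetic variates: pair each draw u with 1 - u; when f is monotone
    the two f-values are negatively correlated, so their average has lower variance."""
    total = 0.0
    for _ in range(n // 2):
        u = rng.random()
        total += 0.5 * (f(u) + f(1.0 - u))
    return total / (n // 2)

def control_variate(f, g, g_mean, n, rng):
    """Control variates: subtract a correlated quantity g(U) whose mean is known,
    then add the known mean back (coefficient fixed at 1 for simplicity)."""
    total = 0.0
    for _ in range(n):
        u = rng.random()
        total += f(u) - g(u) + g_mean
    return total / n

f = math.exp          # toy stand-in for a rollout return
g = lambda u: u       # control variate with known mean 1/2

rng = random.Random(0)
reps = 200
for name, est in [("plain", lambda: plain(f, 1000, rng)),
                  ("antithetic", lambda: antithetic(f, 1000, rng)),
                  ("control variate", lambda: control_variate(f, g, 0.5, 1000, rng))]:
    samples = [est() for _ in range(reps)]
    print(f"{name:16s} mean={statistics.fmean(samples):.4f} "
          f"stdev={statistics.stdev(samples):.4f}")
```

On this toy problem, both estimators should report a noticeably smaller standard deviation than the plain Monte Carlo estimate.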

Tetris is hard, but it is good for your brain. So machine learners have written a number of papers about the application of reinforcement learning to Tetris. Donald Carr (2005) reviews a number of algorithms for Tetris, including temporal difference learning, genetic algorithms (Llima, 2005), and hand-tuned algorithms specific to Tetris (Dellacherie, Fahey). Szita and Lörincz (2006), in “Learning Tetris Using the Noisy Cross-Entropy Method”, use noise to prevent early suboptimal convergence of the fascinating cross-entropy method (similar to particle swarm optimization). Much like in chess, the value of the current state of a Tetris game is estimated with a linear combination of 22 features (for more details, check out the seminal Tetris paper “Feature-Based Methods for Large Scale Dynamic Programming” by Bertsekas and Tsitsiklis, 1996). Their noisy CE method produced solutions almost as good as the best contemporaneous hand-tuned Tetris algorithms.
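Here is a rough sketch of the noisy cross-entropy loop, assuming a toy fitness function in place of “average lines cleared by a Tetris controller with weight vector w”; the paper decays its noise term over time, while this sketch simply keeps it constant.

```python
import numpy as np

def noisy_cross_entropy(fitness, dim, iters=50, pop=100, elite_frac=0.1,
                        noise=0.5, seed=0):
    """Noisy cross-entropy method in the spirit of Szita and Lorincz (2006):
    sample weight vectors from a Gaussian, keep the elite fraction, refit the
    Gaussian to the elite, and add extra noise to the spread so it does not
    collapse prematurely."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim) * 10.0
    n_elite = max(1, int(pop * elite_frac))
    for _ in range(iters):
        samples = rng.normal(mean, std, size=(pop, dim))
        scores = np.array([fitness(w) for w in samples])
        elite = samples[np.argsort(scores)[-n_elite:]]   # highest-scoring weight vectors
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + noise                  # the "noisy" part
    return mean

# Toy stand-in for "lines cleared with feature weights w":
# a smooth function with a known optimum at w = (1, 2, 3, 4).
target = np.array([1.0, 2.0, 3.0, 4.0])
fitness = lambda w: -np.sum((w - target) ** 2)

print(noisy_cross_entropy(fitness, dim=4))
```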

Someone at stack overflow asked for advice about Machine Learning for checkers. Here is my response:

You might want to take a look at the following: Chinook, Upper Confidence Trees, Reinforcement Learning, and Alpha-Beta pruning. I personally like to combine Alpha-Beta Pruning and Upper Confidence Trees (UCT) for perfect information games where each player has fewer than 10 reasonable moves. You can use Temporal Difference Learning to create a position evaluation function. Game AI is probably the funnest way to learn machine learning.

PS: There is a great story about the former human world champion, Marion Tinsley, and the computer world champion here:

https://www.theatlantic.com/technology/archive/2017/07/marion-tinsley-checkers/534111/
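For what it’s worth, here is a minimal alpha-beta search sketch of the kind mentioned above. The game interface (legal_moves, apply, is_terminal, evaluate) is a hypothetical placeholder, and evaluate is where a TD-learned position evaluation function would plug in.

```python
def alphabeta(state, depth, alpha, beta, maximizing, game):
    """Minimal alpha-beta pruning sketch over a hypothetical 'game' interface
    with legal_moves(state), apply(state, move), is_terminal(state), and
    evaluate(state) (the position evaluation, e.g. learned by TD learning)."""
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state)
    if maximizing:
        value = float("-inf")
        for move in game.legal_moves(state):
            value = max(value, alphabeta(game.apply(state, move), depth - 1,
                                         alpha, beta, False, game))
            alpha = max(alpha, value)
            if alpha >= beta:       # beta cutoff: the opponent will avoid this branch
                break
        return value
    else:
        value = float("inf")
        for move in game.legal_moves(state):
            value = min(value, alphabeta(game.apply(state, move), depth - 1,
                                         alpha, beta, True, game))
            beta = min(beta, value)
            if beta <= alpha:       # alpha cutoff
                break
        return value
```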

Here is a cool set of PowerPoint slides by Brodeur and Child on upper confidence trees (UCT) applied to Go. They explore modifications to UCT to handle directed acyclic graphs and groupings of game states.
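For reference, the selection rule at the core of UCT is just UCB1 applied at each node; here is a tiny sketch, with a hypothetical node layout (visits, total_reward, children). In a directed acyclic graph the same position can be reached along more than one path, so these per-node statistics can be shared across transpositions, which is presumably the kind of modification the slides explore.

```python
import math

def ucb1_child(node, c=1.4):
    """UCB1 selection at the heart of UCT: pick the child maximizing average
    reward plus an exploration bonus. The fields 'visits', 'total_reward', and
    'children' are a hypothetical node layout, not a particular library's API."""
    def score(child):
        if child.visits == 0:
            return float("inf")                  # try unvisited children first
        exploit = child.total_reward / child.visits
        explore = c * math.sqrt(math.log(node.visits) / child.visits)
        return exploit + explore
    return max(node.children, key=score)
```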

Here is a fun paper solving 4×4 Minesweeper via the Bellman MDP algorithm.
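Presumably the “Bellman MDP algorithm” here is value iteration on the Bellman optimality equation. Below is a minimal generic sketch: the states, actions, transition model (a list of (probability, next_state) pairs), and rewards are placeholders, not the paper’s Minesweeper encoding.

```python
def value_iteration(states, actions, transition, reward, gamma=1.0,
                    tol=1e-9, max_iters=10_000):
    """Value-iteration sketch for a finite MDP: repeatedly apply the Bellman
    optimality backup V(s) <- max_a sum_{s'} P(s'|s,a) * (r(s,a,s') + gamma*V(s'))
    until the values stop changing."""
    V = {s: 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            acts = actions(s)
            if not acts:                          # terminal state: value stays 0
                continue
            best = max(sum(p * (reward(s, a, s2) + gamma * V[s2])
                           for p, s2 in transition(s, a))
                       for a in acts)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    return V
```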
