Games


I often have to make a decision when putting at the edge of the green during disc golf.  Should I try to put the disc in the basket (choice 1), or should I just try to land the disc near the basket (choice 2)?  If I go for the basket and miss, sometimes the disc will fly or roll so far away that I miss the next shot too.

In order to answer this question, I created a simplified model.  Assume:

  • If I don’t try to put the disc in the basket (choice 2), I will land near the basket and get it in the basket on the next throw.
  • If I do try to put it in the basket (choice 1), I will succeed with probability $p$, where $p$ depends only on distance to the basket.
  • If I do try to put it in the basket (choice 1) and fail to get it in, the probability that I will succeed on the second try is $q$, where $q$ is a constant that does not depend on distance.
  • If I do try to put it in the basket (choice 1), fail to get it in the basket, and then fail again on my second try, then I will always succeed on the third try.

Using these assumptions, I can compute the average number of throws for each choice.

For choice 2, I will always use two throws to get the disc in the basket.

For choice 1, there are three possible outcomes:

  • outcome 1.1: I get it in the basket on the first throw!
  • outcome 1.2: I miss the basket, but get the disc in the basket on the second throw.
  • outcome 1.3: I miss twice, and get it on the third throw.


The probabilities for each of those outcomes are:  $p$, $(1-p) q$, and $(1-p)(1-q)$ respectively.

Let $a$ be the average number of throws for choice 1. Then $$\begin{aligned}a &= p\cdot 1 +(1-p)q\cdot 2 + (1-p)(1-q)\cdot 3 \\&= p + 2q - 2pq + 3 - 3p - 3q + 3pq\\&= 3 - 2p - q + pq.\end{aligned}$$
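Here is a quick Monte Carlo check of that formula in Python (the particular values of $p$ and $q$ below are arbitrary):

    import random

    def throws_choice1(p, q):
        # Simulate choice 1 under the model's assumptions.
        if random.random() < p:   # made the first putt
            return 1
        if random.random() < q:   # made the second putt
            return 2
        return 3                  # the third putt always drops

    p, q = 0.3, 0.8
    n = 100_000
    avg = sum(throws_choice1(p, q) for _ in range(n)) / n
    print(avg, 3 - 2*p - q + p*q)  # the two numbers should nearly agree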


I should choose choice 1 if $ 2>a$.  This occurs when

$$\begin{aligned} 2 &> 3 - 2p - q + pq\\ -1 &> -2p - q + pq \\ -1 &> (q-2)p - q \\ q - 1 &> (q-2)p \\ \frac{q-1}{q-2} &< p \\ \frac{1-q}{2-q} &< p. \end{aligned}$$

(The inequality flips in the second-to-last step because we divide by $q-2$, which is negative.)

Now you can plug in various values for $q$ to find the lowest value for $p$ needed to go for it.

Probability of success     Required probability of success
  after missing ($q$)          on the first try ($p$)
        100%                            0%
         99%                            1%
         95%                            5%
         90%                            9%
         85%                           13%
         80%                           17%
         75%                           20%
         50%                           33%
          0%                           50%
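The table is easy to reproduce from the threshold formula $p > \frac{1-q}{2-q}$.  A minimal Python sketch:

    def min_p(q):
        # Smallest first-try make probability p that justifies going
        # for it, from the inequality p > (1-q)/(2-q).
        return (1 - q) / (2 - q)

    for q in [1.00, 0.99, 0.95, 0.90, 0.85, 0.80, 0.75, 0.50, 0.00]:
        print(f"q = {q:4.0%}  ->  go for it when p > {min_p(q):.0%}")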

So, if you are 100% certain that you will put it in the basket on the second try, then you should use choice 1 (going for it) if $p>0$ (i.e. always).

If you are 90% certain that you will put it in the basket on the second try, then you should use choice 1 (going for it) if $p>0.09=9\%$.


a rule of thumb

A nice approximate rule of thumb is to go for it if the sum of $p$ and $q$ is more than 100%.

When I am 6 yards from the basket, I will get it in about 75% of the time ($p=0.75$), and if I miss, I will get it in on the second try about 90% of the time ($q=0.90$).  The sum of 75% and 90% is 165%, so obviously, I should go for it.

When I am 9 yards from the basket, I will get it in about 20% of the time ($p=0.20$), and if I miss, I will get it in on the second try about 85% of the time ($q=0.85$).  The sum of 20% and 85% is 105%, so it seems like I should go for it.

If the basket is on a hill or if it is windy, then the disc will sometimes roll a fair distance if I miss.  In that case, $q$ might be only 75%. The rule of thumb is that $p+q$ should be at least 100% to go for it, so according to the rule of thumb, I would need $p$ to be at least 25% to go for it.  On a windy day, that’s like 6 yards for me.
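In fact, the rule of thumb always errs on the safe side.  For $q=75\%$ the exact threshold is $$\frac{1-q}{2-q} = \frac{0.25}{1.25} = 20\%,$$ while the rule of thumb asks for $p\geq 25\%$.  Since $\frac{1-q}{2-q} \leq 1-q$ for $0 \leq q \leq 1$, the rule of thumb may tell you to lay up when going for it is slightly better, but it never tells you to go for it when laying up is better.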

Professor Peter Griffin discovered a nice theorem about the best way to count cards in Blackjack.  (See the appendix of his book “The Theory of Blackjack”.)  In this note, we review the theorem and reproduce his proof with more detail.

Suppose that you have a wagering game like blackjack which is played with a deck of cards. If you remove some cards from the deck, then the expected value of the game changes. Griffin found an easy formula for estimating the change in expectation caused by the removal of the cards. His formula depends on $E_j$ which is the change in expectation of the game when removing only the $j$th card from the deck.

Assume that the game never requires more than $k$ cards and the deck has $n$ cards in it. Obviously $n\geq k>0$. There are $m = {n \choose k}$ subsets of the deck $I_1, I_2, \ldots, I_m$ that contain $k$ cards each. Let $y_i$ be the expected value of the game played with the card subset $I_i$. We would like to estimate $y_i$ based on the cards in $I_i$.

We can create a linear estimate of $y_i$ by creating a vector $b=(b_1, b_2, \ldots, b_n)$ where $b_j$ is the “value” of the $j$th card. More specifically,
$$
y_i \approx \sum_{j\in I_i} b_j.
$$
Griffin found that the correct formula for $b_j$ is simply $$b_j = (\mu - (n-1) E_j)/k$$ where $\mu$ is the expected value of the game with a fresh deck. Using this value vector $b$ minimizes the squared error of the estimator.  This formula is remarkable both for its simplicity and for the fact that $k$ plays only a small role in the calculation.

Griffin’s Theorem Let the error function $e:R^n \rightarrow R$ be defined as

$$
e(b) = \sum_{i=1}^m \left( y_i – \sum_{j\in I_i} b_j \right)^2 .
$$
Then $b^*$ is the unique global minimizer of $e$, that is, $e(b^*) = \min_{b\in R^n} e(b)$, where
$$
b^*_j = (\mu - (n-1) E_j)/k
$$
and $\mu$ is the expected value of the game with a fresh deck.

In the theorem above, $e$ is the sum of the squared errors in the linear estimate of the expected value of the game. In order to prove the theorem, we need two lemmas.


Lemma 1 If $\tilde{y}_j$ is the average expectation of the $k$ card subsets that do not contain card $j$, $\bar{y}_j$ is the average expectation of the $k$ card subsets that contain card $j$, and $\mu$ is the expectation of the game with a fresh deck, then
$$\mu = \frac{k}{n}\; \bar{y}_j + \left(1- \frac{k}{n}\right)\tilde{y}_j$$
and
$$\bar{y}_j = \frac{n \mu - (n-k) \tilde{y}_j}{k}.$$

The short proof of this lemma is left to the reader.  (Hint: $\mu$ is the average of the $y_i$ over all ${n \choose k}$ subsets, and the fraction of those subsets that contain card $j$ is $k/n$.)

Lemma 2
Suppose for $j = 1,\ldots, n$,
$$(2.1)\quad b_j + \alpha \sum_{i=1}^n b_i = \gamma_j.$$  Then

$$ b_j = \gamma_j - \frac{\alpha\; n\; \bar\gamma}{1 + n \alpha}$$
where $\bar\gamma = \frac1n \sum_{j=1}^n \gamma_j$.

Proof: Sum both sides of equation (2.1) to get
$$
n \bar{b} + \alpha n^2 \bar{b} = n \bar{\gamma}
$$
where $\bar{b} = \frac1n \sum_{j=1}^n b_j$. Then,
$$
\begin{aligned}
\bar{b} + \alpha n \bar{b} &= \bar{\gamma} \\
\bar{b} &= \frac{\bar{\gamma}}{1+ n \alpha}.
\end{aligned}
$$
Applying that to equation (2.1) yields
$$
\begin{aligned}
b_j + \alpha \sum_{i=1}^n b_i &= \gamma_j \\
b_j + \alpha n \bar{b} &= \gamma_j\\
b_j + \alpha n \frac{\bar{\gamma}}{1+n \alpha} &= \gamma_j\\
b_j &= \gamma_j - \frac{\alpha\; n\; \bar{\gamma}}{1+n \alpha} \\
\end{aligned}
$$
which completes the proof.

Now we can prove Griffin’s Theorem.

Proof:  Let the matrix $X$ be defined by $X_{ij} = 1$ if card $j$ is in set $I_i$ and $X_{ij}=0$ otherwise. We wish to minimize the sum of the squared errors. If we assume that the value of the $j$th card is $b_j$, then we can estimate the expectation of the game played using only card subset $I_i$ with
$$
\sum_{j\in I_i} b_j.
$$
The error of this estimate is $(y_i – \sum_{j\in I_i} b_j)$. The sum of the squared error is
$$
e(b) = \sum_{i=1}^m \left( y_i – \sum_{j\in I_i} b_j \right)^2.
$$
Notice that $\sum_{j\in I_i} b_j = x_i \cdot b$ where $x_i$ is the $i$th row of $X$. So,
$$
e(b) = \sum_{i=1}^m \left( y_i – \sum_{j=1}^n X_{ij} b_j \right)^2 = \| X b – y \|^2.
$$
In other words, $e(b)$ is the square of the distance between $y$ and $Xb$. Ordinary least squares (via the normal equations) provides a nice formula for the $b$ which minimizes this distance:
$$
b^* = \left(X^T X\right)^{-1} X^T y
$$
where $X^T$ indicates the transpose of $X$. Hence,
$$
(1)\quad X^T X b^* = X^T y.
$$
Let $C=X^T X$. It turns out that $C$ has a very simple structure.

$C_{ij} = x_i^T x_j$ where $x_i$ and $x_j$ now denote the $i$th and $j$th columns of $X$, so
$$
(2) \quad C_{ii} = x_i^T x_i = { n-1 \choose k-1},
$$
the number of $k$-card subsets that contain card $i$, and if $i \neq j$,
$$
(3)\quad C_{ij} = x_i^T x_j = { n-2 \choose k-2},
$$
the number of $k$-card subsets that contain both card $i$ and card $j$.

So, applying equations (2) and (3) to the $j$th row of matrix equation (1) gives

$${n-1 \choose k-1} b_j^* + {n-2 \choose k-2} \sum_{i\neq j} b_i^* = {n-1 \choose k-1} {\bar{y}_j}$$
for $j=1, 2, \ldots, n$, where $\bar{y}_j$ is the average expectation of the ${n-1\choose k-1}$ subsets with $k$ cards that contain card $j$.

Note that $$ {n-2 \choose k-2} / {n-1 \choose k-1} = \frac{k-1}{n-1} ,$$ so
$$
\begin{aligned}
b_j^* + \frac{k-1}{n-1} \sum_{i\neq j} b_i^* &= \bar{y}_j \\
b_j^* - \frac{k-1}{n-1} b_j^* + \frac{k-1}{n-1} \sum_{i=1}^n b_i^* &= \bar{y}_j \\
\frac{n-k}{n-1} b_j^* + \frac{k-1}{n-1} \sum_{i=1}^n b_i^* &= \bar{y}_j \\
(n-k) b_j^* + (k-1) \sum_{i=1}^n b_i^* &= (n-1) \bar{y}_j\\
b_j^* + \frac{k-1}{n-k} \sum_{i=1}^n b_i^* &= \frac{n-1}{n-k}\bar{y}_j.
\end{aligned}
$$
We apply Lemma 2 with $\alpha = \frac{k-1}{n-k}$ and $\gamma_j = \frac{n-1}{n-k} \bar{y}_j$ to get
$$
\begin{aligned}
b_j^* &= \gamma_j - \frac{\alpha\; n\; \bar\gamma}{1 + n \alpha} \\
&= \frac{n-1}{n-k} \bar{y}_j\; - \frac{\frac{k-1}{n-k}\; n\; \bar\gamma}{1 + n\; \frac{k-1}{n-k}} \\
&= \frac{n-1}{n-k} \bar{y}_j\; - \frac{(k-1) \; n\; \bar\gamma}{n-k + n(k-1)} \\
&= \frac{n-1}{n-k} \bar{y}_j\; - \frac{(k-1) \; n\; \bar\gamma}{n-k + n k-n} \\
&= \frac{n-1}{n-k} \bar{y}_j\; - \frac{(k-1) \; n\; \bar\gamma}{-k + n k} \\
&= \frac{n-1}{n-k} \bar{y}_j\; - \frac{(k-1) \; n\; \bar\gamma}{k (n-1)} \\
&= \frac{n-1}{n-k} \bar{y}_j\; - \frac{(k-1) \; n\; \frac{n-1}{n-k} \mu }{k (n-1)} \\
&= \frac{n-1}{n-k} \bar{y}_j\; - \frac{(k-1) \; n\; \mu }{(n-k)k}. \\
\end{aligned}
$$
(In the seventh line we used $\bar\gamma = \frac{n-1}{n-k}\,\mu$, which holds because the average of the $\bar{y}_j$ over all $n$ cards is $\mu$: each of the ${n \choose k}$ subsets contains exactly $k$ cards, so each $y_i$ is counted $k$ times among the $n {n-1 \choose k-1} = k {n \choose k}$ terms.)  By Lemma 1,
$$\bar{y}_j = \frac{n \mu - (n-k) \tilde{y}_j}{k},$$
so
$$
\begin{aligned}
b_j^*
&= \frac{n \mu - (n-k) \tilde{y}_j}{k} \cdot \frac{n-1}{n-k} - \frac{ n (k-1)\mu}{k (n-k)} \\
&= \frac{ n (n-1) \mu }{(n-k)k} - \frac{ (n-k)(n-1) \tilde{y}_j }{k(n-k)} - \frac{ n (k-1)\mu}{k (n-k)} \\
&= \frac{ n (n-1) \mu }{(n-k)k} - \frac{ n (k-1)\mu}{k (n-k)} - \frac{ (n-k)(n-1) \tilde{y}_j }{k(n-k)} \\
&= \frac{n \mu}{k} \left[ \frac{n-1}{n-k} - \frac{k-1}{n-k} \right] - \frac{ (n-1) \tilde{y}_j }{k} \\
&= \frac{n \mu}{k} - \frac{ (n-1) \tilde{y}_j }{k} \\
&= \frac{n \mu - (n-1) \tilde{y}_j}{k} \\
&= \frac{\mu - (n-1) (\tilde{y}_j - \mu)}{k} \\
&= \frac{\mu - (n-1) E_j}{ k }
\end{aligned}
$$
which completes the proof.  (For the last step, recall that removing card $j$ leaves an $(n-1)$-card deck whose expectation is exactly $\tilde{y}_j$, the average over the $k$-card subsets that avoid card $j$, so $E_j = \tilde{y}_j - \mu$.)
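Since the theorem is really just linear algebra (it holds for any vector $y$, whether or not the $y_i$ come from blackjack), it is easy to check numerically.  Below is a short Python sketch that compares a brute-force least squares solution to Griffin’s formula on a made-up example; the deck size, $k$, and the $y_i$ are all arbitrary.

    import itertools
    import numpy as np

    n, k = 8, 3  # a toy deck of 8 cards; the game uses at most 3 cards
    subsets = list(itertools.combinations(range(n), k))
    m = len(subsets)

    rng = np.random.default_rng(0)
    y = rng.normal(size=m)  # stand-in for the subset expectations y_i

    # Design matrix: X[i, j] = 1 exactly when card j is in subset I_i.
    X = np.zeros((m, n))
    for i, I in enumerate(subsets):
        X[i, list(I)] = 1.0

    # Brute-force minimizer of e(b) = ||X b - y||^2.
    b_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Griffin's formula: b_j = (mu - (n-1) E_j) / k, where E_j is the
    # average of y over the subsets missing card j, minus mu.
    mu = y.mean()
    y_tilde = np.array([y[X[:, j] == 0].mean() for j in range(n)])
    b_griffin = (mu - (n - 1) * (y_tilde - mu)) / k

    print(np.allclose(b_ls, b_griffin))  # prints True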

In the future, I hope to write some articles about how Griffin’s Theorem can be applied to other games.

I should mention that it is not too hard to extend Griffin’s Theorem to get a more accurate quadratic approximation of the expectation of the game with card removal (or higher-degree polynomial approximations).

Suppose that you are playing the game Minesweeper.  On your first move, you click on the lower left corner square and reveal a 1.  Then you click on the square above the corner and reveal a 2.  Then you click on the square above that and reveal a 3.  What is the “safest” next move?

[Screenshot: the lower-left corner of the board, showing the revealed 1, 2, and 3.]

In order to talk about the contents of the blue squares, we will label them A, B, C, D, and E.

[Screenshot: the same position with the five unknown neighboring squares labeled A through E.]

There are only three possible scenarios:

a) A, B, C, and E have mines,

b) A, C, and D have mines, or

c) B, C, and D have mines.

But, not all of these scenarios are equally likely.  Suppose there are a total of $m$ mines on the board and $s$ squares left excluding the eight that we are looking at. Then the total number of possible distributions of the mines for scenarios a, b, and c are:

  • $n_a = {s\choose m-4},$
  • $n_b= {s\choose m-3},$ and
  • $n_c ={s\choose m-3}.$

(Above we used the choose notation: ${n\choose m}= \frac{n!}{m!\, (n-m)!}$ where $n!=1\cdot2\cdot3\cdots n$.  For example, $4!=24$ and ${5\choose 2}=\frac{5!}{2!\ \cdot\ 3!} = \frac{120}{2 \cdot 6}= 10$.)  In fact,

$$\begin{aligned} r=\frac{n_b}{n_a}&=\frac{{s\choose m-3}}{{s\choose m-4}} \\&=\frac{\frac{s!}{(m-3)!\,(s-(m-3))!}}{\frac{s!}{(m-4)!\,(s-(m-4))!}}\\&=\frac{\frac{1}{(m-3)!\,(s-(m-3))!}}{\frac{1}{(m-4)!\,(s-(m-4))!}}\\&= \frac{(m-4)!\,(s-(m-4))!}{(m-3)!\,(s-(m-3))!}\\&= \frac{(s-m+4)!}{(m-3)\,(s-m+3)!}\\&= \frac{s-m+4}{m-3}.\end{aligned}$$

In the beginning of the game, $r\approx s/m - 1\approx 4$ (for example, on an expert board $s \approx 470$ and $m = 99$), so scenarios b and c are each about four times as likely as scenario a.  We can now estimate the probabilities of scenarios a, b, and c to be about

  • “probability of scenario a” = $p_a \approx 1/9,$
  • “probability of scenario b” = $p_b \approx 4/9,$ and
  • “probability of scenario c” = $p_c \approx 4/9.$

We can now conclude that the probability that square A has a mine is 5/9, that square B has a mine is 5/9, that square C has a mine is 9/9 = 100%, that square D has a mine is 8/9, and that square E has a mine is 1/9.  So square E is the “safest” move.

Generally speaking, scenarios with more mines are less likely if less than half of the unknown squares have mines.

Another interesting way to think about it is that the 3 and 2 squares pulled the mines toward them making square E less likely to contain a mine.

You can approximate the probability of each scenario by just assuming that the squares are independent random variables (a false, but almost true assumption) each of which has probability $m/s$ of containing a mine.  Using that method gives the same results as the approximation above.

If you prefer an exact calculation, then use the formula

$$ r=\frac{ s-m+4}{m-3 }$$

to get the exact ratio of $\frac{p_b}{p_a} = \frac{p_c}{p_a}=r$.
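For concreteness, here is the exact calculation as a short Python sketch (the board parameters are made up; substitute your own $s$ and $m$):

    from math import comb

    def scenario_probs(s, m):
        # s unknown squares (excluding the eight in question), m mines.
        n_a = comb(s, m - 4)   # scenario a: mines in A, B, C, and E
        n_b = comb(s, m - 3)   # scenario b: mines in A, C, and D
        n_c = comb(s, m - 3)   # scenario c: mines in B, C, and D
        total = n_a + n_b + n_c
        return n_a / total, n_b / total, n_c / total

    # A hypothetical expert board early in the game: s ~ 470, m = 99.
    p_a, p_b, p_c = scenario_probs(470, 99)
    print(f"p_a = {p_a:.3f}, p_b = {p_b:.3f}, p_c = {p_c:.3f}")
    print(f"P(square E has a mine) = p_a = {p_a:.3f}")  # close to 1/9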


(PS:  Jennifer told me via email that you can play Minesweeper online at https://www.solitaireparadise.com/games_list/minesweeper.htm)


So, I have played about 300 hours of Slay the Spire since I got it on July 26.  It’s a turn-based deck-building game.  Many of these deck-building games have interesting mathematics, so I have been devoting a fair amount of time to analyzing the game and writing about it.

The most interesting theorem about the game is

$$D = h\left(\frac{d - c\,b + w}{a} + \frac{b}{\alpha}\right)$$

where $D$ is the total damage that you take, $h$ is the total amount of damage that you deal to the enemy, $d$ is the average attack per turn from the enemy, $c$ is the average number of cards you play per turn, $b$ is the average block per blocking card played, $w$ is the average amount of wasted block per turn, $\alpha$ is the average attack of the attack cards you played, and $a$ is the average damage you deal per turn (so that $h/a$ is the number of turns the fight lasts).  (PDF slides here.)

The nice thing about the formula is that $h$, $d$, $c$, $b$, and $\alpha$ are often precisely known or easy to calculate.  Also, a large portion of the reasonable strategies have $w=0$.  If you know $h$, $d$, $c$, $b$, and $w$ and they are constant, then the correct strategy is simple: if $d - c\,b + w$ is positive, then don’t block; if it’s negative, then block a lot.
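Here is the formula as a small Python function, evaluated on a made-up fight just to show the mechanics (all of the numbers below are hypothetical):

    def total_damage_taken(h, d, c, b, w, a, alpha):
        # D = h*((d - c*b + w)/a + b/alpha) from the theorem above.
        return h * ((d - c * b + w) / a + b / alpha)

    # Deal 60 damage total, enemy attacks for 10 per turn, 3 cards
    # played per turn, 5 block per block card, no wasted block,
    # 12 damage dealt per turn, 6 damage per attack card.
    print(total_damage_taken(h=60, d=10, c=3, b=5, w=0, a=12, alpha=6))
    # prints 25.0; here d - c*b + w = -5 < 0, so blocking a lot is right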

There are other analysis techniques that were mentioned in the Sept 27, 2020 reddit post “Simplified Spire Puzzles”.  My favorite is looking at the ratio of damage received to damage given.

Also, I wrote a computer program that computes the best possible strategy for just about any Slay the Spire combat situation.  The drawback is that if you have over 20 cards and 10 different types of cards, the program needs about 10 terabytes of RAM and 3 years of CPU time to compute the answer.  It is much quicker if you have only 10 cards that are all strike or block cards, in which case it takes less than one CPU second and only a few kilobytes of RAM.

I have been wondering how to present this information to the world.  Today, I showed the formula and my program to my friends Cat and Chuck, who are both a) fans of Slay the Spire and b) programmers.  Additionally, I created about 10 PowerPoint slides and a 16-page document, mostly describing simplified Slay the Spire problems and their solutions.

I would also like to document all the card combinations that result in either an infinite sequence of actions (resulting from hyperbolic growth) or another kind of growth.  Growth in this game seems to be limited to quadratic (1, 4, 9, 16, 25, …), cubic (1, 8, 27, 64, 125, …), or exponential (1, 2, 4, 8, 16, 32, …).  I have never seen any other kind of growth in any game except Dominion, which can have polynomial growth where the exponent is a fraction.

I don’t mind writing, but I do mind rewriting and combining my short essays into a larger, more useful work, especially if no one is going to read it.

Cheers, Hein


In March of 2016, the computer program AlphaGo defeated Lee Sedol, one of the top 10 Go players in the world, in a five-game match.  Never before had a Go computer program beaten a professional Go player on the full-size board.  In January of 2017, AlphaGo won 60 consecutive online Go games against many of the best Go players in the world using the online pseudonym Master.  During these games, AlphaGo (Master) played many non-traditional moves, moves that most professional Go players would have considered bad before AlphaGo appeared.  These moves are changing the Go community as professional Go players adopt them into their play.

Michael Redmond, one of the highest-ranked Go players in the world outside of Asia, reviews most of these games on YouTube.  I have played Go maybe 10 times in my life, but for some reason, I enjoy watching these videos and seeing how AlphaGo is changing the way Go is played.  Here are some links to the videos by Redmond.

Two Randomly Selected Games from the series of 60 AlphaGo games played in January 2017


Match 1 – Google DeepMind Challenge Match: Lee Sedol vs AlphaGo
https://www.youtube.com/watch?v=vFr3K2DORc8


The algorithms used by AlphaGo (Deep Learning, Monte Carlo Tree Search, and convolutional neural nets) are similar to the algorithms that I used at Penn State for autonomous vehicle path planning in a dynamic environment.  These algorithms are not specific to Go.  Deep Learning and Monte Carlo Tree Search can be used in any game.  Google DeepMind has had a lot of success applying these algorithms to Atari video games where the computer learns strategy through self play.  Very similar algorithms created AlphaGo through self play and analysis of professional and amateur Go games.

I often wonder what we can learn about other board games from computers.  We will learn more about Go from AlphaGo in two weeks.  From May 23rd to 27th, AlphaGo will play against several top Go professionals at the “Future of Go Summit” conference.

Cheers,
Hein

Gettysburg College has a nice webpage on Counterfactual Regret.  Counterfactual Regret was used by the University of Alberta to solve the heads-up limit Hold’em poker game.  (See e.g. the pokernews.com article.)

The book “Deep Learning” by Ian Goodfellow, Yoshua Bengio, and Aaron Courville is available in HTML format.

http://www.deeplearningbook.org/

Wired has a nice article about the two most brilliant moves in the historic match between AlphaGo and Lee Sedol.

http://www.wired.com/2016/03/two-moves-alphago-lee-sedol-redefined-future/

In case you had not gotten the news yet, the Go playing program AlphaGo (developed by the DeepMind division of Google) has beaten Lee Sedol, who is among the top two or three Go players in the world.  Follow the link below for an informative, informal video describing AlphaGo and the victory.

http://www.theverge.com/2016/3/9/11184362/google-alphago-go-deepmind-result

Science magazine has a nice pregame report.

http://www.sciencemag.org/news/2016/03/update-why-week-s-man-versus-machine-go-match-doesn-t-matter-and-what-does

Hey, I just enrolled in Stanford’s General Game Playing Course.  General game playing programs are programs that try to play almost any game after being given the rules.  There is a yearly competition of general game playing programs at the AAAI conference.  If you join the course, send me an email so that we can exchange ideas or notes.  (My email address is hundal followed by “hh” at yahoo.com.)

Cheers, Hein
