This notebook explains how to use Scientific Python to solve the Gambler's Ruin problem, as an example for an absorbing Markov chain.
We start by importing pylab
, the Scientific Python extensions. Then we define the transition matrix $P$ as in the lecture notes.
from pylab import *
P = array([[ 1, 0, 0, 0, 0, 0],
[ 0, 1, 0, 0, 0, 0],
[.6, 0, 0,.4, 0, 0],
[ 0, 0,.6, 0,.4, 0],
[ 0, 0, 0,.6, 0,.4],
[ 0,.4, 0, 0,.6, 0]])
The brute-force way to find the absorbing probabilities is to simply raise the matrix $P$ to a high power, e.g., $P^{100}$. We find (rounded to 4 digits after the decimal point):
matrix_power(P,100).round(4)
We see that in the long run, we transition into an absorbing state (first two columns) with probability 1 and remain in a non-absorbing state with probability 0.
Let us know solve this problem using a direct linear algebra approach. Writing $$ P = \begin{pmatrix} I & 0 \\ R & Q \end{pmatrix} , $$ where the dimension of $I$ corresponds to the number of absorbing states and the dimension of $Q$ to the number of non-absorbing states, and the dimensions of $R$ and $0$ made to fit, we know (see lecture notes) that the matrix of absorbing probabilities is given by $$ A = N R $$ where $$ N = (I-Q)^{-1} $$ is the so-called fundamental matrix. (The entries of the fundamental matrix are the expected number of times that the chain is in the state of the column index when the initial state is given by the row index.)
From our given matrix $P$, we pick the respective submatrices as follows:
R = P[2:,:2];R
Q = P[2:,2:];Q
Now compute $N=(I-Q)^{-1}$:
N = inv(eye(4)-Q);N
Thus, the expected number of times that the chain is in any non-absorbing state is the sum of the elements in each row, which can be computed by right-multiplication with the vector whose elements are all $1$:
N @ ones(4)
E.g., if the gambler starts with $\$3000$, he will play on average $6$ rounds of Black-Jack.
Now, let's compute the matrix of absorbing probabilities $A=NR$:
A = N @ R;A
$A$ really is a probability matrix: Its row-sum is always equal to $1$:
A@ones(2)
We finally compute the expected payoff from the Gambler's Ruin chain. First, we need the payoffs $B$ corresponding to the absorbing probabilities $A$.
B = array([[-1000,4000],[-2000,3000],[-3000,2000],[-4000,1000]])
Then the expected payoff corresponding to initial state $i$ is given by
$$
\mathbb{E}[P_i] = \sum_{j=1}^{2} a_{ij} \, b_{ij}.
$$
In Python, this can be expressed as follows (note that *
denotes element-wise multiplication and @
denotes matrix multiplication):
(A*B)@ones(2)
Quite clearly, the house always wins...