In this notebook, we investigate some games. The rules of the first game are quite simple. You probably know this game: Each player selects secretly rock, paper or scissors, and both show their selection with hand signs at the same time. The win matrix for this game is the following.
Our selection is one of the columns, and the selection of the opponent is one of the rows. The outcomes are in the table, seen from our side. I.e., rock wins against scissors, scissors win against paper, and paper wins against rock (paper wraps the rock).
>shortformat; ... A := [0,1,-1;-1,0,1;1,-1,0]
0 1 -1 -1 0 1 1 -1 0
We develop a worst case tactics. So we select probabilities
for each item. With these we compute the minimal expected win under all possible selections of the opponent. This is the worst case for our selection of probabilities. We get
for all rows i, where h is the minimal win. The best strategy with the least worst case outcome maximizes h under the above conditions, and the additional condition
This can be translated into a simplex problem.
>M := A|-1; ... M := M_[1,1,1,0]
0 1 -1 -1 -1 0 1 -1 1 -1 0 -1 1 1 1 0
The variables are p1,p2,p3,h. The first three rows are the inequalities,
and the last one is the equality
The right hand side is 0 for the inequalities, and 1 for the equality.
>b := ones(rows(A),0)_1
0 0 0 1
The equation type is 1 (>=) or 0 (=).
>eq := [1,1,1,0]'
1 1 1 0
The target is to maximize h, the last variable.
>c := zeros(1,cols(A))|1
[ 0 0 0 1 ]
Now we can start the Simplex algorithm.
>simplex(M,b,c,eq=eq,restrict=[1,1,1,0],max=1,>check)
0.33333 0.33333 0.33333 0
The result is, that we have to select each item with the same probility. The worst case outcome is 0. I have put this all into the function game(A) contained in the file game.e.
>load game.e;
This function computes the optimal strategy p, and the worst case result.
>{p,res}=game(A); fracprint(p),
1/3 1/3 1/3
The worst case.
>res
0
There is a modification of the game, with an additional item "well", which wins against scissors and rock (fall into well). The matrix for this is the following.
>A1 := [0,1,-1,1;-1,0,1,-1;1,-1,0,1;-1,1,-1,0]
Column 1 to 3: 0 1 -1 -1 0 1 1 -1 0 -1 1 -1 Column 4: 1 -1 1 0
The old optimal probabilities can no longer be used, since the opponent can beat it by selecting well all the time. In fact, we get a different solution. We play paper, scissors, and well with equal probability.
>{p,res}=game(A1); fracprint(p)
0 1/3 1/3 1/3
The worst case is still 0. The game is fair. It must be fair, since it is absolutely symmetric. The game matrix for the opponent is -A'.
>res
0
The most simple Poker game goes like this: Both players get a random number from 1 to n. - Both players put 1$ at stake. - The first player has the choice to pass (loses 1$) or set another 1$. - The second player now has the choice to pass (loses 1$), or hold with 1$. In the latter case the player with the larger number gets the staked amount (wins 2$). With the same numbers, the outcome is 0.
>//
The first player has three tactics: 1. pass all the time 2. hold with 1, pass with 2 3. pass with 1, hold with 2 4. hold with 1 and 2 2. is clearly inferior to 4., so we do cancel it from the list. The tactics remaining can be labelled 1. -- 3. -+ 4. ++ The second player has a similar list. The outcome is somewhat difficult to compute, since it depends on the probablities of 1 and 2. E.g., if player 1 selects -+ and player 2 selects -+, player 1 will loose 1$ when he gets a 1, and win 1/2$ on average when he gets a 2, since the other player in this case will pass with 1 and hold with 2. Thus the expected win of tactics -+ against -+ is
for the first player. This is of course an expected loss of 1/4. E.g., let us compute -+ against ++. For 1, the first player gets -1$. For a 2 he wins 2$ with chance 1/2, yielding 1$ on average. The total average is 0$. We fill out the matrix carefully.
>B1 := [-1,0,1;-1,-1/4,0;-1,0,0]; fracprint(B1)
-1 0 1 -1 -1/4 0 -1 0 0
The following optimization says, that the first player has to select ++ all the time. If he doesn't the second player can punish him with -+.
>game(B1)
0 0 1
We can compute the optimal strategy of the second player using the transposed matrix with reverted outcomes. He can select -+ all the time. But it is clear that he can also select ++.
>game(-B1')
0 0 1
We can try to expand this method to more than 2 values. Let us assume the players get 1 to n randomly with equal probability 1/n. We consider only the tactic i: "Pass with less than i" for both players. E.g., i=1 means: "Pass never". One can see easily that this is the best tactic, which passes for i-1 card values. First we build a matrix, which shows the outcome, if player 1 selects j, and player 2 selects i, for each card value 1 to n of player 1, and 1 to n of player 2.
>function T (i,j,n) ...
in=1:n; jn=in'; H=2*(jn<in)-2*(jn>in); if j>1 then H[1:n,1:j-1]=-1; endif; if i>1 then H[1:i-1,j:n]=1; endif; return H endfunction
In the following example, player 2 does not pass with cards greater or equal 3, and player 1 does not pass with cards greater or equal 2. Here n=4. Consequently, player 1 will lose 1$ with a 1, and player 2 will lose 1$, if player 1 has 2 or more, and he has less than 3. In all other cases, we can just compare cards for a loss or win of 2$.
>T(3,2,4)
Column 1 to 3: -1 1 1 -1 1 1 -1 -2 0 -1 -2 -2 Column 4: 1 1 2 0
Now we can compute the expected value for strategies i and j.
>function map expect (i,j,n) := totalsum(T(i,j,n))/n^2
The matrix A(n) in the following function is our game matrix.
>function A(n) := expect((1:n)',1:n,n)
The case n=2 is the lower right 2x2 matrix of above, but with columns and rows inverted, since ++ is now i=1 resp. j=1.
>A(2)
0 0 0 -0.25
Here is the 5x5 case.
>A(5)
Column 1 to 3: 0 0.12 0.08 -0.12 -0.04 -0.04 -0.08 -0.12 -0.16 0.12 -0.04 -0.2 0.48 0.2 -0.08 Column 4 to 5: -0.12 -0.48 -0.2 -0.52 -0.28 -0.56 -0.36 -0.6 -0.36 -0.64
Now we can compute the optimal strategy for player 1.
>{p,res}=game(A(5)); fracprint(p')
[ 2/3 1/3 0 0 0 ]
This result shows that a random bluff is a good move. Bidding with card value 1 looks unnatural at first sight. But we have to do it with probability 2/3. The game is no longer fair. It is a loss for player 1.
>res
-0.0933333333333
If we forbid player 1 to bid with card 1, he loses more.
>M5=A(5); M5[:,1]=-1; >{p,res}=game(M5); fracprint(p'), res
[ 0 1 0 0 0 ] -0.12
Let us look at the strategy for player 2. It turns out, that this player has to be more reasonable.
>{p,res}=game(-A(5)'); fracprint(p'), res
[ 0 1/3 2/3 0 0 ] 0.0933333333333
Let us try to disallow holding with 1 or 2 for player 2. The result is worse.
>M5=-A(5)'; M5[:,1:2]=-1; >{p,res}=game(M5); fracprint(p'), res
[ 0 0 1 0 0 ] 0.08
The case n=100 takes about 5 seconds on my computer. This is due to the generation of the the matrix A(n), which could be improved using a more direct formula. The Simplex algorithm is rather quick.
>A100=A(100); {p,res}=game(A100);
The result is: Never pass in 66% of the games, and pass in 0.34% of the games, if you have less than 34. You will lose 0.11 on average.
>nonzeros(p), p[nonzeros(p)]', res
[ 1 34 ] [ 0.656565656566 0.343434343434 ] -0.111066666667
Let us check, if this is indeed the worst case for the strategy v.
>min((A100.p)')
-0.111066666667
The strategy for the second player can be summarized as to bid with 34 all the time.
>{p,res}=game(-A100'); >nonzeros(p), p[nonzeros(p)]', res
[ 34 35 ] [ 0.666666666667 0.333333333333 ] 0.111066666667
In the continuous case, both players get a number between 0 and 1 with equal distribution in [0,1]. From our experiments, we think that we should bid always in 2/3 of the games, and bid with 1/3 or better in the rest 1/3 of the games. We like to compute the worst case loss for this strategy. The equivalent of our matrix is a function
If the first player has strategy y ("pass with less than y"), and the second player strategy x ("pass with less than x"), then for y>x - the first player loses 1 if he gets less than y, - he wins 1, if he gets more than y, but the second player gets less than x - he wins 2, if he gets more than y, but the second player gets less than y Other cases balance out. Carefully computing the chances for y>x and y<=x leads to the following function.
>function f(x,y) ...
if y>x then return -y+(1-y)*x+2*(y-x)*(1-y) else return -y+(1-y)*x-2*(x-y)*(1-x) endfunction
>plot3d("f",a=0,b=1,c=0,d=1,angle=120°,contour=true):
The result is as expected. For our strategy, the worst case occurs for x=1/3.
>xopt := fmin("2/3*f(x,0)+1/3*f(x,1/3)",0,1)
0.333333336154
And it has the value -1/9.
>2/3*f(xopt,0)+1/3*f(xopt,1/3)
-0.111111111111
We claim that the optimal strategies are For the first player: - 2/3 "bid all the time", - 1/3 "bid with more than 1/3". For the second player: - always "bid with more than 1/3". We already checked the outcome of the strategy of the first player. It is an average loss of -1/9. Let us check the outcome for the second player.
>yopt := fmax("f(1/3,x)",0,1)
0.0896659076396
Indeed, the outcome is 1/9.
>-f(1/3,yopt)
0.111111111111
This proves that both strategies are optimal. The proof is rather obvious. But it is also a case of a duality relation in optimization. Let us simulate the game with both players playing the optimal strategy. The following function gets a random number for both players and a random number for the first player, so that he can select his strategy.
>function map onegame (x,y,xrand) ...
if xrand<1/3 and x<1/3 then return -1; endif if y<1/3 then return 1; endif; if x>y then return 2; else return -2; endif endfunction
The simulation should be close to -1/9 as expected.
>n=100000; sum(onegame(random(1,n),random(1,n),random(1,n))/n)
-0.10907