iconEuler Examples

Rock-Paper-Scissors and Poker Bluffs

In this notebook, we investigate some games. 

The rules of the first game are quite simple. You probably know this
game: Each player selects secretly rock, paper or scissors, and both
show their selection with hand signs at the same time. The win matrix
for this game is the following.

Rock-Paper-Scissors and Poker Bluff

Our selection is one of the columns, and the selection of the opponent
is one of the rows. The outcomes are in the table, seen from our side.

I.e., rock wins against scissors, scissors win against paper, and
paper wins against rock (paper wraps the rock).
>shortformat; ...
>A := [0,1,-1;-1,0,1;1,-1,0]
            0             1            -1 
           -1             0             1 
            1            -1             0 
We develop a worst case tactics. So we select probabilities

Rock-Paper-Scissors and Poker Bluff

for each item. With these we compute the minimal expected win under
all possible selections of the opponent. This is the worst case for
our selection of probabilities. We get

Rock-Paper-Scissors and Poker Bluff

for all rows i, where h is the minimal win.

The best strategy with the least worst case outcome maximizes h under
the above conditions, and the additional condition

Rock-Paper-Scissors and Poker Bluff

This can be translated into a simplex problem.
>M := A|-1; ...
>M := M_[1,1,1,0]
            0             1            -1            -1 
           -1             0             1            -1 
            1            -1             0            -1 
            1             1             1             0 
The variables are p1,p2,p3,h. The first three rows are the
inequalities, 

Rock-Paper-Scissors and Poker Bluff

and the last one is the equality

Rock-Paper-Scissors and Poker Bluff

The right hand side is 0 for the inequalities, and 1 for the equality.
>b := ones(rows(A),0)_1
            0 
            0 
            0 
            1 
The equation type is 1 (>=) or 0 (=).
>eq := [1,1,1,0]'
            1 
            1 
            1 
            0 
The target is to maximize h, the last variable.
>c := zeros(1,cols(A))|1
[ 0  0  0  1 ]
Now we can start the Simplex algorithm.
>{p,flag,r}=simplex(M,b,c,eq=eq,restrict=[1,1,1,0],max=1);
The result is, that we have to select each item with the same
probility. The worst case outcome is 0.
>p
     0.333333 
     0.333333 
     0.333333 
            0 
If the flag is 0, we have found an optimal strategy.
>flag
0
I have put this all into the function game(A).
>load game.e;
This function computes the optimal strategy p, and the worst case
result.
>{p,res}=game(A); fracprint(p),
      1/3 
      1/3 
      1/3 
The worst case.
>res
0

Rock-Paper-Scissors-Well

There is a modification of the game, with an additional item "well",
which wins against scissors and rock (fall into well). The matrix for
this is the following.
>A1 := [0,1,-1,1;-1,0,1,-1;1,-1,0,1;-1,1,-1,0]
            0             1            -1             1 
           -1             0             1            -1 
            1            -1             0             1 
           -1             1            -1             0 
The old optimal probabilities can no longer be used, since the
opponent can beat it by selecting well all the time.

In fact, we get a different solution. We play paper, scissors, and
well with equal probability.
>{p,res}=game(A1); fracprint(p)
        0 
      1/3 
      1/3 
      1/3 
The worst case is still 0. The game is fair.
>res
0

Simple Poker Game

The most simple Poker game goes like this:

Both players get a random number from 1 to n.

- Both players put 1$ at stake.
- The first player has the choice to pass (loses 1$) or set another
1$.
- The second player now has the choice to pass (loses 1$), or hold
with 1$.

In the latter case the player with the larger number gets the staked
amount (wins 2$). With the same numbers, the outcome is 0.
>//
The first player has three tactics:

1. pass all the time
2. hold with 1, pass with 2
3. pass with 1, hold with 2
4. hold with 1 and 2

2. is clearly inferior to 4., so we do cancel it from the list. The
tactics remaining can be labelled

1. --
3. -+
4. ++

The second player has a similar list. The outcome is somewhat
difficult to compute, since it depends on the probablities of 1 and 2.

E.g., if player 1 selects -+ and player 2 selects -+, player 1 will
loose 1$ when he gets a 1, and win 1/2$ on average when he gets a 2,
since the other player in this case will pass with 1 and hold with 2.
Thus the expected win of tactics -+ against -+ is

Rock-Paper-Scissors and Poker Bluff

for the first player. This is of course an expected loss of 1/4.

E.g., let us compute -+ against ++. For 1, the first player gets -1$.
For a 2 he wins 2$ with chance 1/2, yielding 1$ on average. The total
average is 0$.

We fill out the matrix carefully.

Rock-Paper-Scissors and Poker Bluff

>B1 := [-1,0,1;-1,-1/4,0;-1,0,0]; fracprint(B1)
       -1         0         1 
       -1      -1/4         0 
       -1         0         0 
The following optimization says, that the first player has to select
++ all the time. If he doesn't the second player can punish him with
-+.
>game(B1)
            0 
            0 
            1 
We can compute the optimal strategy of the second player using the
transposed matrix with reverted outcomes. He can select -+ all the
time. But it is clear that he can also select ++.
>game(-B1')
            0 
            1 
            0 

More Card Values

We can try to expand this method to more than 2 values. Let us assume
the players get 1 to n randomly with equal probability 1/n.

We consider only the tactic i: "Pass with less than i" for both
players. E.g., i=1 means: "Pass never". One can see easily that this
is the best tactic, which passes for i-1 card values.

First we build a matrix, which shows the outcome, if player 1 selects
j, and player 2 selects i, for each card value 1 to n of player 1, and
1 to n of player 2.
>function T (i,j,n) ...
  in=1:n; jn=in';
  H=2*(jn<in)-2*(jn>in);
  if j>1 then H[1:n,1:j-1]=-1; endif;
  if i>1 then H[1:i-1,j:n]=1; endif;
  return H
endfunction
In the following example, player 2 does not pass with cards greater or
equal 3, and player 1 does not pass with cards greater or equal 2.
Here n=4.

Consequently, player 1 will lose 1$ with a 1, and player 2 will lose
1$, if player 1 has 2 or more, and he has less than 3. In all other
cases, we can just compare cards for a loss or win of 2$.
>T(3,2,4)
           -1             1             1             1 
           -1             1             1             1 
           -1            -2             0             2 
           -1            -2            -2             0 
Now we can compute the expected value for strategies i and j.
>function map expect (i,j,n) := totalsum(T(i,j,n))/n^2
The matrix A(n) in the following function is our game matrix.
>function A(n) := expect((1:n)',1:n,n)
The case n=2 is the lower right 2x2 matrix of above, but with columns
and rows inverted, since ++ is now i=1 resp. j=1.
>A(2)
            0             0 
            0         -0.25 
Here is the 5x5 case.
>A(5)
            0          0.12          0.08         -0.12         -0.48 
        -0.12         -0.04         -0.04          -0.2         -0.52 
        -0.08         -0.12         -0.16         -0.28         -0.56 
         0.12         -0.04          -0.2         -0.36          -0.6 
         0.48           0.2         -0.08         -0.36         -0.64 
Now we can compute the optimal strategy for player 1.
>{p,res}=game(A(5)); fracprint(p')
[ 2/3  1/3  0  0  0 ]
This result shows that a random bluff is a good move. Bidding with
card value 1 looks unnatural at first sight. But we have to do it with
probability 2/3.

The game is no longer fair. It is a loss for player 1.
>res
-0.0933333
If we forbid player 1 to bid with card 1, he loses more.
>M5=A(5); M5[:,1]=-1;
>{p,res}=game(M5); fracprint(p'), res
[ 0  1  0  0  0 ]
-0.12
Let us look at the strategy for player 2. It turns out, that this
player has to be more reasonable.
>{p,res}=game(-A(5)'); fracprint(p'), res
[ 0  1/3  2/3  0  0 ]
0.0933333
Let us try to disallow holding with 1 or 2 for player 2. The result is
worse.
>M5=-A(5)'; M5[:,1:2]=-1;
>{p,res}=game(M5); fracprint(p'), res
[ 0  0  1  0  0 ]
0.08

A large Example

The case n=100 takes about 5 seconds on my computer. This is due to
the generation of the the matrix A(n), which could be improved using a
more direct formula. The Simplex algorithm is rather quick.
>A100=A(100); {p,res}=game(A100);
The result is: Never pass in 66% of the games, and pass in 0.34% of
the games, if you have less than 34. You will lose 0.11 on average.
>nonzeros(p), p[nonzeros(p)]', res
[ 1  34 ]
[ 0.656566  0.343434 ]
-0.111067
Let us check, if this is indeed the worst case for the strategy v.
>min((A100.p)')
-0.111067
The strategy for the second player can be summarized as to bid with 34
all the time.
>{p,res}=game(-A100');
>nonzeros(p), p[nonzeros(p)]', res
[ 34  35 ]
[ 0.666667  0.333333 ]
0.111067

The continuous case

In the continuous case, both players get a number between 0 and 1 with
equal distribution in [0,1]. From our experiments, we think, that we
should bid always in 2/3 of the games, and bid with 1/3 or better in
the rest 1/3 of the games.

We like to compute the worst case loss for this strategy. The
equivalent of our matrix is a function

Rock-Paper-Scissors and Poker Bluff

If the first player has strategy y ("pass with less than y"), and the
second player strategy x ("pass with less than x"), then for y>x

- the first player loses 1 if he gets less than y,
- he wins 1, if he gets more than y, but the second player gets less
than x
- he wins 2, if he gets more than y, but the second player gets less
than y

Other cases balance out. Carefully computing the chances for y>x and
y<=x leads to the following function.
>function f(x,y) ...
  if y>x then return -y+(1-y)*x+2*(y-x)*(1-y)
  else return -y+(1-y)*x-2*(x-y)*(1-x)
endfunction
>plot3d("f",a=0,b=1,c=0,d=1,angle=120°,contour=true); insimg;

Rock-Paper-Scissors and Poker Bluff

The result is as expected. For our strategy, the worst case occurs for
x=1/3.
>xopt := fmin("2/3*f(x,0)+1/3*f(x,1/3)",0,1)
0.333333
And it has the value -1/9.
>2/3*f(xopt,0)+1/3*f(xopt,1/3)
-0.111111
We claim that the optimal strategies are

For the first player: 

- 2/3 "bid all the time", 
- 1/3 "bid with more than 1/3".

For the second player:

- always "bid with more than 1/3".

We already checked the outcome of the strategy of first player. It is
an average loss of -1/9.

Let us check the outcome for the second player.
>yopt := fmax("f(1/3,x)",0,1)
0.0896659
Indeed, the outcome is 1/9. 
>-f(1/3,yopt)
0.111111
This proves that both strategies are optimal. The proof is rather
obvious. But it is also a case of a duality relation in optimization.

Let us simulate the game with both players playing the optimal
strategy. The following function gets a random number for both players
and a random number for the first player, so that he can select his
strategy.
>function map onegame (x,y,xrand) ...
if xrand<1/3 and x<1/3 then return -1; endif
if y<1/3 then return 1; endif;
if x>y then return 2;
  else return -2;
endif
endfunction
The simulation should be close to -1/9 as expected.
>n=100000; sum(onegame(random(1,n),random(1,n),random(1,n))/n)
-0.11426
>

Examples Homepage