










Game Theory Cheat Sheet
Before talking about games we should get some idea of how to represent them mathematically. We will focus on a certain class of games, called perfect information games, in which each player knows all moves that have taken place and the possible moves each player has at all times.
We’ll start off with some definitions:
We define a normal form (strategic form) game as follows. It consists of:
- a finite set of players N,
- for each player i ∈ N, a strategy set Si of the actions available to that player, and
- for each player i, a utility function ui (or preference relation) over the set of outcomes, where an outcome is a choice of one strategy per player, s ∈ S = S1 × S2 × · · · .
Let’s look at an example and get some practice working with the notation.
First we’ll consider a simple problem we can model as a normal form game. Consider the following situation: you want to meet three of your friends somewhere on MIT’s campus to go hacking. Each of you might go to one of four places: the Stata Center (C), the Green Building (G), the dome (D), or Kresge (K). Now that we have a “game,” we’ll analyze it using the definition above:
Players
To make things easier, we’ll name you and your friends, and call the set of players N = {You, Stavros, Helen, Paris}
Strategy Sets
In this situation, each player’s strategy set, SYou, SStavros, SHelen, SParis, is the same: {C, G, D, K}. It consists of the four possible meeting places.
Outcomes
Above we defined an “outcome” to be some s ∈ S, where S = SYou × SStavros × SHelen × SParis. First we’ll quickly review some mathematical notation. Also see the cheat sheet at the end of these notes, which might be useful.
The symbol × when used with sets is called a Cartesian Product. It means to take all ordered tuples consisting of one element from each set in the product. So for example if B = { 1 , 2 } and C = { 3 , 4 }, then A = B × C consists of the tuples (1, 3), (1, 4), (2, 3), and (2, 4). So the first item in each element of A comes from the set B, and the second comes from the set C.
Now we are ready to define the outcomes, S = SYou × SStavros × SHelen × SParis, in our game. The elements of S are (D, D, D, D), (K, K, K, K), (C, C, C, C), (G, G, G, G), (G, D, D, D), etc., including all 4^4 = 256 ordered combinations of K, C, D, and G.
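As a quick sketch (our own code, using Python’s standard library; the player and place names follow the example above), the outcome set can be enumerated directly:

```python
from itertools import product

# Each player's strategy set: the four meeting places.
places = ["C", "G", "D", "K"]
players = ["You", "Stavros", "Helen", "Paris"]

# S = S_You x S_Stavros x S_Helen x S_Paris: all ordered 4-tuples.
S = list(product(places, repeat=len(players)))

print(len(S))  # 4^4 = 256 outcomes
print(S[0])    # ('C', 'C', 'C', 'C')
```

`itertools.product` is exactly the Cartesian product from the definition: each tuple takes one element from each factor set, in order.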
BRi(s−i) = {si ∈ Si : ui(s′i, s−i) ≤ ui(si, s−i) ∀s′i ∈ Si}
Or, in English: the best response function gives the strategies si from player i’s strategy set Si that maximize player i’s utility, given that he knows all the other players’ strategies in s−i.
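A small sketch of this definition in code (our own; the utility function for the hacking game is an assumption, since the notes don’t pin one down):

```python
def best_response(u_i, S_i, s_minus_i):
    """All strategies in S_i that maximize player i's utility
    against the fixed profile s_minus_i of the other players."""
    best = max(u_i(s_i, s_minus_i) for s_i in S_i)
    return {s_i for s_i in S_i if u_i(s_i, s_minus_i) == best}

# Assumed utility for the hacking game: the number of friends
# at the place you choose.
def u_you(s_you, others):
    return sum(1 for s in others if s == s_you)

# If two friends are at the dome and one at the Green Building,
# your best response is to go to the dome.
print(best_response(u_you, {"C", "G", "D", "K"}, ("D", "D", "G")))  # {'D'}
```

Note that the best response is a set: several strategies can tie for the maximum.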
Now we will look at perhaps the most famous model problem in game theory, the prisoner’s dilemma. Two suspects are arrested in a crime investigation. For our sake, let’s say someone has stolen Pandora’s box, and both Helen and Stavros are accused and thrown into jail. If neither of them confesses, they will both spend three years in prison. If only one of them confesses, the confession will be used against him or her, earning a four-year sentence, while the other goes free. If both confess, the judge will go easy on them and give them both sentences of only one year. This situation can be modeled with the table below, where each entry gives the years in prison for the (row, column) players:
                 Don’t Confess   Confess
Don’t Confess        3,3            0,4
Confess              4,0            1,1
If the players cooperate and both confess, then they can both get off easy with only one year in prison. But a closer look shows that no rational player would confess: no matter whether one prisoner chooses to confess or not, the other is always better off not confessing. If Stavros confesses, then Helen spends 0 years in prison if she doesn’t confess versus 1 year if she does. If Stavros doesn’t confess, then Helen spends 3 years if she doesn’t confess versus 4 if she does. Therefore, if both players are rational (and for the moment we assume Stavros and Helen are, even though they were involved in that whole silly Trojan horse business back in the day and then had the foolishness to play with Pandora’s box), both will choose not to confess, and both will end up with three-year sentences.
Game theorists talk about this relationship among strategies in terms of dominance. We say that a strategy s′i is strongly dominated by another strategy si if: ∀s−i ∈ S−i, ui(si, s−i) > ui(s′i, s−i)
That is, for any combination of actions by all of the other players, the move si has a higher payoff than the move s′i. A strategy is weakly dominated if: ∀s−i ∈ S−i, ui(si, s−i) ≥ ui(s′i, s−i)
and for at least one choice of s−i the inequality is strict, that is, the ≥ turns into a >. In games with more than two actions, it is also possible for one action to be dominated not by a single other action, but by a combination (a mixture) of others.
For Stavros and Helen, confessing is strongly dominated, and so neither player has any reason to confess. Keep these games and definitions in mind as they will be useful to us in the next section.
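The dominance check can be written out directly (our own sketch, using the years-in-prison table above; utilities are negative years, since fewer years are better):

```python
def strongly_dominates(u, s, s_prime, opponent_strategies):
    """True if strategy s strongly dominates s_prime for this player:
    strictly higher payoff against every opponent choice."""
    return all(u(s, t) > u(s_prime, t) for t in opponent_strategies)

# Years in prison from the table: "D" = don't confess, "C" = confess.
years = {("D", "D"): 3, ("D", "C"): 0, ("C", "D"): 4, ("C", "C"): 1}
u = lambda s, t: -years[(s, t)]  # utility = negative years in prison

# Confessing is strongly dominated by not confessing.
print(strongly_dominates(u, "D", "C", ["D", "C"]))  # True
```

For weak dominance, the strict `>` would become `>=` plus a check that at least one comparison is strict.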
Another way to represent a game is in extensive form. This type of game notation represents games using game trees, which is how we will represent most of the two-player strategy games we look at. A game tree is a directed graph whose nodes are game board positions and whose edges are moves. Each “level” of the tree, representing a move by one player, is called a ply. The leaves of the tree are possible game outcomes. Each vertex, called a decision node, is labeled with the player that makes a decision at that point. Together the nodes represent every possible state of a game as it is played, starting from the initial node and ending at the terminal nodes, which assign final payoffs to the players.
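The tree structure just described can be sketched in code (a minimal version of our own; the node fields are assumptions, not notation from the notes):

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

@dataclass
class Node:
    """A node of a game tree. Internal (decision) nodes name the
    player to move; terminal nodes carry the final payoffs instead."""
    player: Optional[str] = None
    payoffs: Optional[Tuple[int, ...]] = None
    children: Dict[str, "Node"] = field(default_factory=dict)  # move -> node

# A tiny one-ply game: player 1 picks a move, then the game ends.
root = Node(player="1", children={
    "L": Node(payoffs=(1, 0)),  # leaf: outcome of playing L
    "R": Node(payoffs=(0, 1)),  # leaf: outcome of playing R
})
print(root.children["L"].payoffs)  # (1, 0)
```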
So an extensive form game is represented with the following items:
Nash Equilibrium
Now that we have learned how to define games, we will begin looking at how to “solve” them. The most commonly used concept in looking at game solutions is the Nash Equilibrium (NE). Conceptually, an NE is a state of the game in which no player, if given the chance to change strategies while all the other players’ strategies stay fixed, could profitably deviate from his current strategy. The first type of NE we examine assumes pure strategies, where every player chooses one strategy and sticks with it.
The NE for a normal form game 〈N, (Si), (≿i)〉 is a strategy profile s∗ ∈ S such that for each player i ∈ N we have:

(s∗−i, s∗i) ≿i (s∗−i, si) ∀si ∈ Si
So every player prefers his part of the profile s∗ to any other strategy, given all the other players’ strategies s∗−i.
We can also define the Nash equilibrium in terms of best response functions. If BRi(s−i) is player i’s best response to a strategy profile s−i, then the Nash equilibrium is a strategy profile s∗ for which:
s∗i ∈ BRi(s∗−i) ∀i ∈ N
So a game can be solved by finding the best response function of each player, and then finding a profile s∗ that lies in every BRi. This can be modeled as a system of |N| equations in |N| unknowns, and for some games (zero sum games in particular) it can be solved as a linear program.
Let’s find the NE for the games we have examined so far. In the original problem of meeting your friends to go hacking, can you find the equilibrium strategy profiles? There are several: (D, D, D, D); (C, C, C, C); (G, G, G, G); (K, K, K, K). As long as you all meet at the same place, no player can profitably deviate (i.e., increase the value of his utility function) by choosing a different strategy.
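Since this game is small, the pure strategy equilibria can also be found by brute force (our own sketch; the utility function, one point per friend at your chosen spot, is an assumption, since the notes don’t fix one):

```python
from itertools import product

places = ["C", "G", "D", "K"]
n = 4  # you, Stavros, Helen, and Paris

def u(i, s):
    # Assumed utility: the number of other players at player i's spot.
    return sum(1 for j in range(n) if j != i and s[j] == s[i])

def is_pure_nash(s):
    # No player can profitably deviate given the others' choices.
    for i in range(n):
        for alt in places:
            s_dev = s[:i] + (alt,) + s[i + 1:]
            if u(i, s_dev) > u(i, s):
                return False
    return True

equilibria = [s for s in product(places, repeat=n) if is_pure_nash(s)]
print(equilibria)  # the four profiles where everyone meets at one place
```

Checking all 256 profiles confirms that exactly the four all-at-one-place profiles survive.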
Consider the following game, called matching pennies: Stavros and Helen flip one coin each. If the coins match, Helen has to give Stavros a dollar, but if the coins differ, Helen gets to take Stavros’s dollar. We can represent this game in a table:

          Heads     Tails
Heads    (1,-1)    (-1,1)
Tails    (-1,1)    (1,-1)

Helen is the “column player” and Stavros is the “row player,” and each entry lists the row player’s payoff first. So when Helen flips tails and Stavros flips heads, we look at the entry in the first row of the second column, (-1,1): the row player, Stavros, loses a dollar while the column player, Helen, gains a dollar.
This type of game, in which what one player loses is exactly what the other player gains, is called a zero sum game (more generally, a constant sum game). You can think of the total sum of money between Stavros and Helen as staying the same the whole time, while the amount each individual player holds changes with game play.
What is the NE for matching pennies? (Let’s assume for a second that each player chooses whether his penny shows heads or tails.) In this game, there is no pure strategy NE. If Helen picks heads, Stavros will pick heads so the coins match and he wins. But given this strategy profile, Helen would rather pick tails so that the coins differ. There is no strategy profile from which no player wants to deviate. This is an example of a pure strategy game in which the players have diametrically opposed goals and there is no NE. The same situation happens in rock paper scissors. Can you see why?
Now look at the prisoner’s dilemma. We said earlier that no matter what the other player chooses, a player always does better by choosing not to confess. So this game has a unique NE, which is (Don’t Confess, Don’t Confess).
We have seen examples of pure strategy games with no NE, with more than one NE, and with a unique NE. It can be shown that if every player has a finite number of strategies and players may randomize over them according to a probability distribution (a mixed strategy), then an NE always exists. As a quick example, in matching pennies each player playing heads and tails with probability 1/2 each is a mixed strategy NE.
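To make the mixed strategy idea concrete, here is a small check of our own (not from the notes) that a 50/50 opponent in matching pennies leaves the other player indifferent, which is exactly the equilibrium condition:

```python
# Row player's payoff matrix for matching pennies, over (Heads, Tails).
M = [[1, -1],
     [-1, 1]]

def expected_row_payoff(p_row, p_col):
    """Expected payoff to the row player when each side randomizes
    with the given probabilities over (Heads, Tails)."""
    return sum(M[i][j] * p_row[i] * p_col[j]
               for i in range(2) for j in range(2))

# Against a 50/50 opponent, every row strategy earns the same: 0.
print(expected_row_payoff([1, 0], [0.5, 0.5]))  # 0.0
print(expected_row_payoff([0, 1], [0.5, 0.5]))  # 0.0
```

Since no pure deviation changes the expected payoff, neither player can profitably deviate from flipping fairly.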
Solving a Game
Many board games with seemingly simple rules have long stumped both humans and computers. Solving games depends on the decision complexity, or the difficulty in making decisions, and the space complexity, or the size of the search space over the possible moves and their consequences.
Good players look ahead to the possible game situations a move might lead to, which is no simple task when the search space reaches billions of billions of possibilities! There is much research in the field of artificial intelligence in creating computers that are able to look many branches ahead in a game tree to determine the best move.
There are three basic types of “solutions” to games:
- Ultra-weakly solved: the outcome of perfect play from the initial position is known, but not necessarily a strategy that achieves it.
- Weakly solved: a strategy is known that achieves the perfect play outcome from the initial position.
- Strongly solved: the best move is known from every position that can arise in the game.
By perfect play, we mean each player cannot do any better, given the other’s strategy. So to “solve” a game means to find its Nash equilibrium.
Iterated Dominance
The method of iterated dominance is one way to find the pure strategy NE (if it exists) for a normal form game.
Consider a game with the following payoffs (player 1 is the row player, player 2 is the column player):
        A       B       C       D
A     (5,2)   (2,6)   (1,4)   (0,4)
B     (0,0)   (3,2)   (2,1)   (1,1)
C     (7,0)   (2,2)   (1,5)   (5,1)
D     (9,5)   (1,3)   (0,2)   (4,8)
Now when player 1 chooses a move, he will not choose A, since it is weakly dominated by C. But player 2 knows player 1 is rational, and knows he will not choose A. So the choices left are:
        A       B       C       D
B     (0,0)   (3,2)   (2,1)   (1,1)
C     (7,0)   (2,2)   (1,5)   (5,1)
D     (9,5)   (1,3)   (0,2)   (4,8)
Now for player 2, column D dominates column A, so player 2 won’t choose A:
        B       C       D
B     (3,2)   (2,1)   (1,1)
C     (2,2)   (1,5)   (5,1)
D     (1,3)   (0,2)   (4,8)
Now for player 1, C dominates D (2 > 1, 1 > 0, and 5 > 4), so he won’t play D:

        B       C       D
B     (3,2)   (2,1)   (1,1)
C     (2,2)   (1,5)   (5,1)

For player 2, B now dominates D (2 > 1 and 2 > 1), so column D is eliminated. After that, for player 1, B dominates C (3 > 2 and 2 > 1), and finally, for player 2, B dominates C:

        B
B     (3,2)

Iterated dominance leaves the single profile (B, B), which is the pure strategy NE of this game.
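The elimination procedure can be mechanized. The sketch below (our own code, not from the notes) removes strictly dominated strategies only, which avoids the order-dependence that weak dominance elimination can introduce; on this game it still whittles the table down to the single profile (B, B):

```python
def eliminate_dominated(rows, cols, u1, u2):
    """Iteratively remove strictly dominated pure strategies.
    u1[r][c], u2[r][c] are the payoffs to players 1 and 2."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in rows[:]:  # a row is dominated if some other row
            if any(all(u1[r2][c] > u1[r][c] for c in cols)  # beats it
                   for r2 in rows if r2 != r):              # everywhere
                rows.remove(r); changed = True
        for c in cols[:]:  # symmetrically for columns and player 2
            if any(all(u2[r][c2] > u2[r][c] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c); changed = True
    return rows, cols

# The 4x4 game from the text, payoffs indexed by strategy name.
u1 = {"A": {"A": 5, "B": 2, "C": 1, "D": 0},
      "B": {"A": 0, "B": 3, "C": 2, "D": 1},
      "C": {"A": 7, "B": 2, "C": 1, "D": 5},
      "D": {"A": 9, "B": 1, "C": 0, "D": 4}}
u2 = {"A": {"A": 2, "B": 6, "C": 4, "D": 4},
      "B": {"A": 0, "B": 2, "C": 1, "D": 1},
      "C": {"A": 0, "B": 2, "C": 5, "D": 1},
      "D": {"A": 5, "B": 3, "C": 2, "D": 8}}

print(eliminate_dominated("ABCD", "ABCD", u1, u2))  # (['B'], ['B'])
```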
Exercises
Consider the following game: two players take turns, and at each turn the player to move can either continue the game, or stop. A player gets a higher payoff by stopping at turn t than if the other player stops the game at time t + 1, but gets an even higher payoff if neither of these events happens (i.e., the game continues). (This is the centipede game.)
Appendix
Complexity
We will often pose questions about solving games in terms of some n, usually the size of a game board or a similar measure, and ask how the number of steps needed to find the solution grows with n. A couple of points about complexity are worth mentioning here, since we will refer to them later on.
Problems are often classified using the following terms:
A chess board, for example, might be represented as an 8 by 8 array:

[[wC wN wB wQ wK wB wN wC]
 [wP wP wP wP wP wP wP wP]
 [e  e  e  e  e  e  e  e ]
 [e  e  e  e  e  e  e  e ]
 [e  e  e  e  e  e  e  e ]
 [e  e  e  e  e  e  e  e ]
 [bP bP bP bP bP bP bP bP]
 [bC bN bB bQ bK bB bN bC]]
It is also common to use numbers to represent pieces: 0 for empty, 1 for white pawn, -1 for black pawn, 2 for white castle, -2 for black castle, etc.
A Hex board (we will meet the game of Hex next week) might be represented as an 11 by 11 array, with entries 0 for empty, 1 for black, and -1 for white.
Bit Boards
One interesting scheme used widely in computer chess is the bit board. This representation uses a 64-bit string for each piece type, with a 1 if a piece of that type occupies the square and a 0 if not. So the bit board for white pawns in the initial chess game state is:
[[0 0 0 0 0 0 0 0] [1 1 1 1 1 1 1 1] [0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0]]
As one example of how this notation is useful, white can compute all the squares occupied by black by bitwise ORing the bitboards for each of black’s pieces. More resources on this method, including a great tutorial applying it to checkers, can be found online at http://www.3dkingdoms.com/checkers/bitboards.htm.
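As a sketch of that idea in code (our own example, treating Python integers as 64-bit strings; only three black piece types are shown, laid out as in the starting position above):

```python
# One integer per piece type: bit k is set when a piece of that
# type occupies square k (squares numbered 0..63).
black_pawns   = 0b11111111 << 48        # eight pawns on their rank
black_rooks   = (1 << 56) | (1 << 63)   # corner squares
black_knights = (1 << 57) | (1 << 62)

# All squares occupied by (these) black pieces: bitwise OR.
black_occupied = black_pawns | black_rooks | black_knights

print(bin(black_occupied).count("1"))  # 12 occupied squares
```

Set-membership, union, and intersection of squares all become single machine-word operations, which is what makes bit boards fast.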
Graphs
While not used as widely, it can make sense to represent some game boards as graphs. A good graph library for Python is networkx, but there are plenty out there. You can create a graph by connecting adjacent cells with edges and naming the nodes by their position on the board. We’ll talk more about graph theory next week.
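A dependency-free sketch of the idea (our own helper; networkx offers the same thing ready-made as grid_2d_graph): a square board stored as an adjacency dict, with cells named by position and edges between orthogonal neighbors.

```python
def grid_graph(rows, cols):
    """Adjacency dict for a rows x cols board: (r, c) -> neighbors."""
    graph = {}
    for r in range(rows):
        for c in range(cols):
            graph[(r, c)] = [
                (r + dr, c + dc)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
    return graph

board = grid_graph(3, 3)
print(len(board[(1, 1)]))  # 4: the center cell touches four neighbors
print(len(board[(0, 0)]))  # 2: a corner cell touches two
```

For Hex, the neighbor offsets would be the six hexagonal directions instead of the four orthogonal ones.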
There will be a full tutorial online about making game boards.
Symbols