
Introduction to Game Theory

Game Representations and Nash Equilibrium

February 10, 2009

Representing Games

Before talking about games we should get some idea of how to represent them mathematically. We will focus on a certain class of games, called perfect information games, in which each player knows all moves that have taken place and the possible moves each player has at all times.

We’ll start off with some definitions:

Normal Form Game

We define a normal form (strategic form) game as follows:

  • A finite set of players N = {1, 2, ..., n}
  • Each player i ∈ N chooses actions from a strategy set Si
  • Outcomes are defined by strategy profiles, which consist of all the strategies chosen by individual players. Mathematically, the set of strategy profiles is defined as S = S1 × S2 × ... × Sn
  • Players have preferences over the outcomes (note, NOT preferences over their actions), denoted as ≿i. Usually these preferences are defined by a utility function, ui : S → R, that maps each outcome to a real number. (This is sometimes called a payoff function.)

So an instance of a game might be represented by the tuple (N, (Si), (≿i)), containing the players, strategy sets, and preferences.
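As a concrete sketch, the tuple just described can be written down directly in code. The players, strategies, and payoffs below are invented purely for illustration (they are not from the notes), and the abstract preference relations are replaced by utility functions:

```python
# A hypothetical two-player normal form game as the tuple (N, (Si), (ui)).
N = [1, 2]                              # the set of players
S = {1: ["a", "b"], 2: ["x", "y"]}      # a strategy set Si for each player
# ui : S -> R, one dictionary per player, keyed by strategy profile
u = {
    1: {("a", "x"): 2, ("a", "y"): 0, ("b", "x"): 1, ("b", "y"): 3},
    2: {("a", "x"): 1, ("a", "y"): 2, ("b", "x"): 0, ("b", "y"): 1},
}
game = (N, S, u)
```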

Let’s look at an example and get some practice working with the notation.

Example

First we'll consider a simple problem we can model as a normal form game. Consider the following situation: you want to meet three of your friends somewhere on MIT's campus to go hacking. Each of you might go to one of four places: the Stata Center (C), the Green Building (G), the dome (D), or Kresge (K). Now that we have a "game," we'll analyze it using the definition above:

Players

To make things easier, we’ll name you and your friends, and call the set of players N = {You, Stavros, Helen, Paris}

Strategy Sets

In this situation, each player's strategy set, SYou, SStavros, SHelen, SParis, is the same: {C, G, D, K}. It consists of the four possible meeting places.

Outcomes

Above we defined an "outcome" to be some s ∈ S, where S = SYou × SStavros × SHelen × SParis. First we'll quickly review some mathematical notation. Also see the cheat sheet at the end of these notes, which might be useful.

The symbol × when used with sets is called a Cartesian Product. It means to take all ordered tuples consisting of one element from each set in the product. So for example if B = {1, 2} and C = {3, 4}, then A = B × C consists of the tuples (1, 3), (1, 4), (2, 3), and (2, 4). So the first item in each element of A comes from the set B, and the second comes from the set C.

Now we are ready to define the outcomes, S = SYou × SStavros × SHelen × SParis, in our game. The elements in S are: (D, D, D, D), (K, K, K, K), (C, C, C, C), (G, G, G, G), (G, D, D, D), etc., including all 4^4 = 256 combinations and possible orderings of K, C, D, and G.
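In code, the Cartesian product and the enumeration of all 256 strategy profiles can be sketched with Python's itertools (the variable names are ours, not from the notes):

```python
from itertools import product

# The Cartesian product example from the text: B = {1, 2}, C = {3, 4}
B, C = [1, 2], [3, 4]
A = list(product(B, C))          # [(1, 3), (1, 4), (2, 3), (2, 4)]

# The strategy profiles S = S_You x S_Stavros x S_Helen x S_Paris
places = ["C", "G", "D", "K"]
S = list(product(places, repeat=4))
print(len(S))                    # 4**4 = 256 profiles
```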

  • We define s−i to be the strategy choices of all players except i. So if Stavros has chosen G, Helen C, and Paris D, then s−You = (G, C, D).
  • The best response function for a player i ∈ N is defined as

BRi(s−i) = {si ∈ Si : ui(s′i, s−i) ≤ ui(si, s−i) ∀s′i ∈ Si}

Or, in English: given that player i knows all the other players' strategies in s−i, the best response function gives the strategies si from his possible strategy set Si that maximize player i's utility function.
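A small sketch of the best response function, assuming strategy profiles are tuples ordered by player index (the helper names and the toy payoffs are hypothetical):

```python
# Sketch of BRi: the strategies in Si maximizing player i's utility, given
# the other players' choices s_minus_i. `u` maps a full strategy profile
# (a tuple ordered by player index) to player i's payoff.
def best_response(i, s_minus_i, S_i, u):
    def profile(s_i):
        # splice player i's choice into position i of the profile
        return s_minus_i[:i] + (s_i,) + s_minus_i[i:]
    best = max(u[profile(s)] for s in S_i)
    return {s for s in S_i if u[profile(s)] == best}

# toy example: player 0 best-responds to the other player choosing "x"
u0 = {("a", "x"): 2, ("b", "x"): 1}
print(best_response(0, ("x",), ["a", "b"], u0))   # {'a'}
```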

Now we will look at perhaps the most famous model problem in game theory, the prisoner's dilemma. Two suspects are arrested in a crime investigation. For our sake, let's say someone has stolen Pandora's box, and both Helen and Stavros are accused and thrown into jail. If neither of them confesses, they will both spend three years in prison. If only one of them confesses, the other will be freed and used as a witness against the confessor, who will then have to spend four years in jail. If both confess, the judge will go easy on them and give them both sentences of only one year. This situation can be modeled with the table below (each entry lists the years in prison for the row player, then the column player):

                 Don't Confess    Confess
Don't Confess    3, 3             0, 4
Confess          4, 0             1, 1

If the players cooperate and both confess, then they can both get off easy with only one year in prison. But after investigating things further it is clear that no rational player would confess: no matter whether one prisoner chooses to confess or not, the other is always better off not confessing. If Stavros confesses, then Helen gets 0 years in prison if she doesn't confess versus 1 year in prison if she confesses. If Stavros doesn't confess, then Helen gets 3 years in prison if she doesn't confess versus 4 if she does. Therefore, if both players are rational (and for the moment we are assuming Stavros and Helen are, even though they were involved in that whole silly Trojan horse business back in the day and then had the foolishness to play with Pandora's box), both will choose not to confess and will therefore end up with three-year sentences.

Game theorists talk about this relationship among strategies in terms of dominance. We say that a strategy s′i is strongly dominated by another strategy si if: ∀s−i ∈ S−i, ui(si, s−i) > ui(s′i, s−i)

That is, for any combination of actions by all of the other players, the strategy si has a higher payoff than the strategy s′i. A strategy is weakly dominated if: ∀s−i ∈ S−i, ui(si, s−i) ≥ ui(s′i, s−i)

and for at least one choice of s−i the inequality is strict, that is, the ≥ turns into a >. In games where there are more than two actions, it is also possible to have one action dominated not by a single other action, but by a combination of others.

For Stavros and Helen, confessing is strongly dominated, and so neither player has any reason to confess. Keep these games and definitions in mind as they will be useful to us in the next section.
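The dominance argument for the prisoner's dilemma can be checked mechanically. Since the table entries are years in prison, this sketch negates them to turn them into utilities (fewer years = higher utility); the dictionary encoding and helper names are our own:

```python
# The prisoner's dilemma table: (row action, column action) -> years in prison
years = {
    ("D", "D"): (3, 3), ("D", "C"): (0, 4),
    ("C", "D"): (4, 0), ("C", "C"): (1, 1),
}

def row_utility(s_row, s_col):
    # negate years so that a shorter sentence means higher utility
    return -years[(s_row, s_col)][0]

# "D" (don't confess) strongly dominates "C" (confess) for the row player:
dominates = all(row_utility("D", s) > row_utility("C", s) for s in ("D", "C"))
print(dominates)   # True
```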

Extensive Form

Another way to represent a game is in extensive form. This type of game notation represents games using game trees, which is how we will represent most of the two-player strategy games we look at. A game tree is a directed graph whose nodes are game board positions and whose edges are moves. Each "level" of the tree, representing a move by one player, is called a ply. The leaves of the tree are possible game outcomes. Each internal vertex, called a decision node, is labeled with the player that makes a decision at that point. A decision node represents a possible state of the game as it is played; play starts from the initial node and ends at a terminal node, which assigns final payoffs to the players.

So an extensive form game is represented with the following items:

  • players N = {1, 2, 3, ..., n}
  • a set H of sequences (the histories) of the game, defined by:
    - ∅ is a member of H
    - if (sk)k=1...K ∈ H and L < K, then (sk)k=1...L ∈ H, where sk is the action taken at step k of the game
    - if an infinite sequence (sk)∞k=1 has (sk)k=1...L ∈ H for all integers L, then (sk) ∈ H
    The set of histories in H that are infinite or terminal (there is no action sK+1 with (sk)k=1...K+1 ∈ H) is denoted as Z.
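As a toy sketch, histories can be modeled as tuples of actions, with the empty tuple playing the role of ∅. The two-move game and its action names below are invented for illustration:

```python
# Histories of a toy 2-move game as tuples; () stands for the empty history.
H = {(), ("L",), ("R",), ("L", "l"), ("L", "r"), ("R", "l"), ("R", "r")}

def is_prefix_closed(H):
    # prefix closure: every prefix of a history is itself a history
    return all(h[:k] in H for h in H for k in range(len(h)))

# Terminal histories Z: those with no continuation anywhere in H
Z = {h for h in H if not any(g[:len(h)] == h and len(g) > len(h) for g in H)}
print(is_prefix_closed(H), sorted(Z))
```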

Nash Equilibrium

Now that we have learned how to define games, we will begin looking at how to "solve" games. The most commonly used concept in looking at game solutions is the Nash Equilibrium (NE). Conceptually, the NE is a state of the game in which, even if each player were given the chance to change strategies given all of the other players' strategies, no player could profitably deviate from his current strategy. The first type of NE we examine assumes pure strategies, where every player chooses one strategy and sticks with it.

Normal Form NE

The NE for a normal form game 〈N, (Si), (≿i)〉 is a strategy profile s∗ ∈ S of actions such that for each player i ∈ N we have:

(s∗−i, si) ≾i (s∗−i, s∗i) ∀si ∈ Si

So every player prefers the equilibrium profile s∗ to playing any other strategy, given all the other players' strategies s∗−i.

We can also define the Nash equilibrium in terms of best response functions. If BRi(s−i) is player i's best response to a strategy profile s−i, then the Nash Equilibrium is a strategy profile s∗ for which:

s∗i ∈ BRi(s∗−i) ∀i ∈ N

So a game could be solved by finding the best response function of each player, and then finding a profile s∗ that is in each BRi. This can be modeled as a set of n equations with n unknowns and, for some classes of games, solved as a linear program.
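For small finite games, the search for pure-strategy Nash equilibria can also simply be done by brute force over all profiles. This sketch (function and variable names are ours) applies the idea to a two-player version of the meeting game, with payoff 1 if the players meet and 0 otherwise:

```python
from itertools import product

# Brute force: a profile is an NE if no player can gain by deviating
# unilaterally. u[i] maps a full strategy profile to player i's payoff.
def pure_nash(S, u):
    equilibria = []
    for profile in product(*S):
        def can_deviate(i):
            return any(
                u[i][profile[:i] + (s,) + profile[i + 1:]] > u[i][profile]
                for s in S[i]
            )
        if not any(can_deviate(i) for i in range(len(S))):
            equilibria.append(profile)
    return equilibria

# a two-player version of the meeting game: payoff 1 if you meet, else 0
places = ("C", "G", "D", "K")
u_meet = {p: (1 if p[0] == p[1] else 0) for p in product(places, repeat=2)}
ne = pure_nash([places, places], {0: u_meet, 1: u_meet})
print(ne)   # the four profiles where both players pick the same place
```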

Let's find the NE for the games we have examined so far. In the original problem of meeting your friends to go hacking, can you find the equilibrium strategy profiles? Well, there are several: (D, D, D, D); (C, C, C, C); (G, G, G, G); (K, K, K, K). As long as you all meet at the same place, no player can profitably deviate (i.e., increase the value of their utility function) by choosing a different strategy.

Consider the following game, called matching pennies: Stavros and Helen flip one coin each. If the coins match, Helen has to give Stavros a dollar, but if the coins differ, Helen gets to take Stavros’s dollar. We can represent this game in a table:

        T        H
T       1, -1    -1, 1
H       -1, 1    1, -1

Helen is the "column player" and Stavros is the "row player." So when Helen flips tails and Stavros flips heads, this corresponds to the entry in the second row of the first column, or (-1, 1), meaning the row player, Stavros, loses a dollar while the column player, Helen, gains a dollar.

This type of game, in which what one player loses is exactly what the other player gains, is called a zero sum game. Another name is constant sum. You can think of it as the entire sum of money between Stavros and Helen staying the same the whole time, but the amount of money each individual player has changes with game play.
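The zero sum property is easy to verify from the table: every cell's payoffs sum to zero. The dictionary encoding below is our own:

```python
# Matching pennies: (row flip, column flip) -> (Stavros's payoff, Helen's)
pennies = {
    ("T", "T"): (1, -1), ("T", "H"): (-1, 1),
    ("H", "T"): (-1, 1), ("H", "H"): (1, -1),
}
# zero sum: in every outcome the two payoffs cancel exactly
is_zero_sum = all(a + b == 0 for a, b in pennies.values())
print(is_zero_sum)   # True
```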

What is the NE for matching pennies? (Let's assume for a moment that each player chooses whether his or her penny shows heads or tails.) In this game, there is no NE. If Helen picks heads, Stavros will pick heads so the coins are the same and he wins. But given this strategy profile, Helen would pick tails so that the coins differ. There is no strategy profile in which each player would not want to deviate. This is an example of a pure strategy game in which the players have diametrically opposed goals, and there is no NE. The same situation happens in rock paper scissors. Can you see why?

Now look at prisoner’s dilemma. We said earlier that no matter what another player chooses, a player always does better by choosing not to confess. So this game has a unique NE, which is (Don’t Confess, Don’t Confess).

We have seen examples of pure strategy games that have no NE, more than one NE, and one unique NE. It can be shown that if each player has a finite number of strategies to choose from and can randomize, choosing among strategies according to a probability distribution (a mixed strategy), then an NE does exist.
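As a hint of how mixed strategies restore equilibrium in matching pennies: if Helen mixes 50/50, Stavros's expected payoff is the same for either choice, so he has no profitable deviation (a sketch with our own variable names):

```python
# Helen mixes 50/50 over tails and heads
p_helen = {"T": 0.5, "H": 0.5}
# Stavros's (row player's) payoffs from the matching pennies table
row_payoff = {("T", "T"): 1, ("T", "H"): -1, ("H", "T"): -1, ("H", "H"): 1}

# Stavros's expected payoff for each pure choice against Helen's mix
expected = {s: sum(p_helen[t] * row_payoff[(s, t)] for t in p_helen)
            for s in ("T", "H")}
print(expected)   # {'T': 0.0, 'H': 0.0} -- Stavros is indifferent
```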

Solving Games


Many board games with seemingly simple rules have long stumped both humans and computers. Solving games depends on the decision complexity, or the difficulty of making decisions, and the space complexity, or the size of the search space over the possible moves and their consequences.

Good players look ahead to the possible game situations a move might lead to, which is no simple task when the search space reaches billions of billions of possibilities! There is much research in the field of artificial intelligence in creating computers that are able to look many branches ahead in a game tree to determine the best move.

What it means to solve a game

There are three basic types of “solutions” to games:

  1. Ultra-weak: the result of perfect play by each side is known, but a strategy achieving it is not known specifically.
  2. Weak: the result of perfect play and a strategy achieving it from the start of the game are both known.
  3. Strong: the result and strategy are computed for all possible positions.

By perfect play, we mean each player cannot do any better, given the other's strategy. So to "solve" a game means to find its Nash equilibrium.

Iterated Dominance

The method of iterated dominance is one way to find the pure strategy NE (if it exists) for a normal form game.

Consider a game with the following payoffs (player 1 is the row player, player 2 is the column player):

        A        B        C        D
A       (5,2)    (2,6)    (1,4)    (0,4)
B       (0,0)    (3,2)    (2,1)    (1,1)
C       (7,0)    (2,2)    (1,5)    (5,1)
D       (9,5)    (1,3)    (0,2)    (4,8)

Now when player 1 chooses a move, it will not choose A, since it is weakly dominated by C. But player 2 knows player 1 is rational, and knows he will not choose A. So the choices left are:

        A        B        C        D
B       (0,0)    (3,2)    (2,1)    (1,1)
C       (7,0)    (2,2)    (1,5)    (5,1)
D       (9,5)    (1,3)    (0,2)    (4,8)
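The elimination step just performed can be verified mechanically. This sketch (our own helper) checks that C weakly dominates A for player 1 in the original table:

```python
# Player 1's payoffs from the 4x4 table above: row -> {column -> payoff}
p1 = {
    "A": {"A": 5, "B": 2, "C": 1, "D": 0},
    "B": {"A": 0, "B": 3, "C": 2, "D": 1},
    "C": {"A": 7, "B": 2, "C": 1, "D": 5},
    "D": {"A": 9, "B": 1, "C": 0, "D": 4},
}

def weakly_dominates(s, s_prime, payoffs):
    # s weakly dominates s_prime: at least as good everywhere, strictly
    # better against at least one opposing choice
    at_least = all(payoffs[s][c] >= payoffs[s_prime][c] for c in payoffs[s])
    strict = any(payoffs[s][c] > payoffs[s_prime][c] for c in payoffs[s])
    return at_least and strict

print(weakly_dominates("C", "A", p1))   # True
```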

Now for player 2, D dominates A, so player 2 won't choose A.

        B        C        D
B       (3,2)    (2,1)    (1,1)
C       (2,2)    (1,5)    (5,1)
D       (1,3)    (0,2)    (4,8)

Now for player 1, B dominates both C and D, so he won’t play either of those:

        B        C        D
B       (3,2)    (2,1)    (1,1)

Player 2 has B dominating both C and D:

  • If there is a node immediately below the current node colored the same as the player who moves at the current node, color the current node with that player's color (the player can simply move to the win).
  • If all immediately lower nodes are colored with the opposite player's color, color the current node with the opposite player's color.
  • Else color the node as a tie (we will use gray for a tie).
  1. Move up the tree, repeating this for each ply. (In this case, we already completed the tree with step 2.)
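The coloring procedure is backward induction. Here is a sketch on a tiny hand-made tree, assuming the standard rule that the player to move takes a win if one exists, settles for a tie next, and otherwise loses (labels: 1 and 2 for the players, 0 for a tie; the tree is invented):

```python
# A leaf is its winner label (1, 2, or 0 for a tie); an internal node is a
# list of child nodes. `player` is the player who moves at this node.
def color(node, player):
    if isinstance(node, int):
        return node                    # leaf: outcome already known
    opponent = 3 - player
    results = [color(child, opponent) for child in node]
    if player in results:              # some move wins for the mover
        return player
    if 0 in results:                   # otherwise settle for a tie
        return 0
    return opponent                    # every move loses

# player 1 to move; one branch lets player 2 force a win, the other is a tie
print(color([[1, 2], 0], 1))   # 0 -- player 1's best option is the tie
```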

Exercises

  • Begin drawing a tic-tac-toe game tree. See how many branches you can color as wins for each player.
  • Consider the "centipede" game, with the following rules: two players alternate taking turns. On each turn, the player can decide to either continue the game or stop. A player gets a higher payoff from stopping at turn t than from the other player stopping the game at turn t + 1, but gets an even higher payoff if neither of these events happens (i.e., the game continues).

  • Define the player function for this game.
  • Draw a possible tree for this game (there are many), remembering to label nodes with players and terminal nodes with the payoffs for each player.
  • Write out the possible histories for this game (going only up to period T = 6).
  • What are the possible subgame equilibria?

Appendix

Additional Topics

Complexity

We will often form questions about solving games in terms of some n, usually the size of a game board or a similar measure, and want to know how the number of steps needed to find the solution grows with n. It is worth mentioning a couple of things about complexity here that we will refer to later on.

Problems are often classified using the following terms:

  • P (polynomial) The problem is solvable in polynomial time. That is, the number of steps needed to find a solution is a polynomial in n.
  • NP (non-deterministic polynomial) These problems are in general very difficult to solve, but once a solution is found it can be verified in polynomial time. All P problems are also NP problems, so P ⊆ NP, but you can earn a million dollars if you can prove or disprove that P = NP. One argument for believing P = NP is that the existence of a good algorithm for verifying a solution to an NP problem might somehow imply that there is a good algorithm for finding one.
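A quick sketch of the verification idea: for subset sum (a classic NP problem), checking a proposed subset (the certificate) takes one pass over it, even though finding one may require exponential search. The instance below is made up:

```python
# Verifying an NP certificate in polynomial time: is `certificate` a subset
# of `numbers` that sums to `target`? (Checking is cheap; searching is not.)
def verify_subset_sum(numbers, target, certificate):
    return all(x in numbers for x in certificate) and sum(certificate) == target

print(verify_subset_sum([3, 9, 8, 4, 5, 7], 15, [3, 4, 8]))   # True
```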

[[wC wN wB wQ wK wB wN wC]
 [wP wP wP wP wP wP wP wP]
 [e  e  e  e  e  e  e  e ]
 [e  e  e  e  e  e  e  e ]
 [e  e  e  e  e  e  e  e ]
 [e  e  e  e  e  e  e  e ]
 [bP bP bP bP bP bP bP bP]
 [bC bN bB bQ bK bB bN bC]]

It is also common to use numbers to represent pieces: 0 for empty, 1 for white pawn, -1 for black pawn, 2 for white castle, -2 for black castle, etc.

A Hex Board (we will meet the game of Hex next week) might be repre- sented as an 11 by 11 array:

[[0 0 0 0 0 1 0 0 0 0 0]

[0 0 0 0 1 1 0 0 0 0 0]

[0 0 0 0 0 0 0 0 0 0 0]

[0 0 0 0 1 0 0 0 0 0 0]

[0 0 0 0 -1 0 0 0 -1 0 -1]

[0 0 0 0 1 0 -1 0 0 0 0]

[0 0 -1 0 -1 0 -1 0 0 0 0]

[0 0 0 1 0 0 0 0 0 0 0]

[-1 0 0 0 0 0 0 0 0 0 0]

[0 0 0 0 0 0 0 0 0 0 0]

[0 0 0 0 0 0 0 0 0 0 0]]

with entries 0 for empty, 1 for black, and -1 for white.

Bit Boards

One interesting scheme used widely in computer chess is the bit board. This representation uses a 64-bit string for each piece type, with a 1 if such a piece occupies the square and a 0 if not. So the bit board for white pawns in the initial chess game state is:

[[0 0 0 0 0 0 0 0]
 [1 1 1 1 1 1 1 1]
 [0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0]]

As one example of how this notation can be useful, white can compute all the squares occupied by black by bitwise ORing the bitboards for each of black's piece types. More resources on this method, and a great tutorial applying it to checkers, can be found online at http://www.3dkingdoms.com/checkers/bitboards.htm.
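In Python, bitboards can be sketched as 64-bit integers, with bit k standing for square k; ORing black's per-piece boards gives black's occupied squares as just described. The square numbering and the partial position below are our own:

```python
# Bitboards as 64-bit ints: bit k = square k (0-63). Only black's pawns and
# rooks are filled in here, for brevity.
black_pawns = 0b11111111 << 48          # eight pawns on squares 48-55
black_rooks = (1 << 56) | (1 << 63)     # rooks in the two corners of rank 8

# bitwise OR combines the per-piece boards into one occupancy board
black_occupied = black_pawns | black_rooks
print(bin(black_occupied).count("1"))   # 10 occupied squares
```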

Graphs

While not used as widely, it can make sense to represent some game boards as graphs. A good graph library for Python is networkx, but there are plenty out there. You can create a graph by connecting adjacent cells with edges and naming the nodes by their position on the board. We'll talk more about graph theory next week.
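A minimal sketch of the idea without any library: name cells by (row, column) and connect orthogonal neighbours (with networkx, the same edges would go into a Graph via add_edge):

```python
# A board as a graph: nodes are (row, col) cells, edges join orthogonal
# neighbours. A plain adjacency dict is enough for small boards.
def grid_graph(rows, cols):
    adjacency = {(r, c): [] for r in range(rows) for c in range(cols)}
    for (r, c) in adjacency:
        for nr, nc in ((r + 1, c), (r, c + 1)):   # look down and right
            if (nr, nc) in adjacency:
                adjacency[(r, c)].append((nr, nc))
                adjacency[(nr, nc)].append((r, c))
    return adjacency

board = grid_graph(3, 3)
print(len(board[(1, 1)]))   # the centre cell of a 3x3 board has 4 neighbours
```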

There will be a full tutorial online about making game boards.

Cheat Sheet

Symbols

  • ∀ - "for all"
  • ∈ - "is an element of"
  • ∅ - "the empty set"