Games, Strategies, and GTO Strategies

This is Part 1 of 6 of an adaptation of my chapter “Game Theory Optimal Strategies: What Are They Good For?” from Excelling at No-Limit Hold’em edited by Jonathan Little.

Much of the reason I wrote Expert Heads Up NLHE was to explain the ideas of game theory, poorly understood in the community at the time, to the average poker player. Heads up no limit (HUNL) is my game of choice personally, so it made sense to use it as the primary example. However, HUNL is something of a simple case, and there’s a bit more to be said about how game theory applies to other games. In this chapter, I’ll give a quick introduction to game theory as it applies to a variety of common poker formats. We’ll see when it’s useful, and more importantly, when it’s not – when it’s appropriate to use game theory-inspired strategies, and when it just can’t really guide our play. I promise to cover a practical skill or two as well.

So what is game theory optimal (GTO) play? First of all, people tend to get hung up on the word optimal, so I want to dispel some common misconceptions. Imagine this – there’s some mathematician. She’s made up some potentially useful concept with a moderately complicated definition, and she wants to discuss it with other people. What does she do? Well, first, she probably needs to give her concept a name. That way, she can just say, “Suppose I have a continuous function $f(x)$” instead of “Suppose I have a function $f(x)$ such that, at every point $a$ in its domain, the limit of $f(x)$ as $x$ approaches $a$ in the domain equals $f(a)$.” Much easier, right? Now don’t worry – you don’t need to know anything about functions, continuous or otherwise, to read this chapter. The point is that the word “continuous” wasn’t made up from scratch – it was a pre-existing word in spoken English that means something only vaguely related to what the mathematician actually wants you to think about when you hear “continuous.”

The “O” in GTO is like that. There’s a very specific technical definition for “GTO strategy” which we’ll get to shortly. We could have decided to call these strategies crunchy or yellow or Vulcan, but hopefully game theory optimal is a little more evocative of what we mean, even if it isn’t perfect. So please forget any preconceived bias you have about the word optimal. In this chapter, GTO means exactly the following, no more and no less.

Ok so suppose you have some players playing a game, and you have a set of strategies (one for each player) such that no player can improve his EV by changing his strategy. Then, we say that any one of those players’ strategies is a GTO strategy for that player in that game. Great. In a minute, we’ll tease out some consequences of that definition: what special properties such a strategy has, etc. But first, if you’re paying attention, you might feel like you’ve been cheated! I told you that “GTO” has a very specific technical meaning, but then I gave you a definition that relies on more fuzzy terms: game and strategy. As you may guess, we mean something specific by those terms as well. Let’s talk about those ideas and then come back to GTO. We’ll say something more about EV in the future as well.

I’m going to tweak the next couple of definitions a little bit to make them more useful for poker. For us, a “game” will correspond more or less to a single hand. It is composed of the following four things:
\begin{itemize}
\item A set of players
\item Starting ranges for each player
\item A decision tree that describes all the possible sequences of actions that the players (and Nature, i.e., random chance) can take
\item Payoffs that describe how much money or chips or value each player has at the end of the hand, for every way the hand can end
\end{itemize}
When we describe a game, we’ll also usually want to specify the starting pot and stack sizes of each player, although presumably we could find them by starting at the bottom of a decision tree (at the end of the hand) and working back up the series of actions to the beginning, tallying bets as we go.

A player’s range tells us the different hands he can hold as well as how likely each of them is. A player’s starting range is his range at the beginning of the game. Of course, a player’s possible holdings at the beginning of a holdem hand are well-known, so we often won’t need to specify them. However, we’ll sometimes find it convenient to set up sort of artificial games that describe play over just part of a hand. For example, we could draw a decision tree that describes play on just a single river. In that case, we’ll need to specify the ranges of each player at the start of river play to fully describe the situation.

I should say what a decision tree is! A picture is best. Check out the figure below. This picture corresponds to a game with 3 players, named BU, SB, and BB. There are two components in the diagram: circles and lines. Each circle in the tree represents a spot where a player has to make a decision – we call them decision points. More specifically, each decision point corresponds to a distinct set of public information – the information you’d have available if you were a third party watching the game (with no hole card cam) – basically everything except the hole cards. I’ve labelled each point with the name of the player who owns it, i.e., who gets to make a decision there. Each arrow leaving a point represents an action the player can choose, and when he takes an action, the game moves to the point indicated by the arrow.

Larger game tree example

The game begins at the top of the tree. (Here, I’ve neglected to draw actions for posting blinds, but they’re implied.) Then, BU can fold, call, or raise. If he folds, the SB also has the options to fold, call, or raise. If the SB calls, the action moves to a point owned by the BB. And so on. Points all the way at the bottom of the tree (which are reached at the end of a hand, i.e. at showdown or after all but one player folds) are known as the leaves of the tree. (Get it?) A tree describing all of the possible lines, including all future streets and so on, would be a bit unwieldy, so I’ve left dangling arrows to indicate places where much more lies below, undrawn. You can imagine how it would go.
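If it helps to see the idea in code, here is a minimal sketch of a decision tree like the one described above. The `Node` class, the `lines` helper, and the `undrawn` stand-in for dangling arrows are all names I made up for illustration, and the tree only covers the few preflop branches mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A decision point (owner is set) or a leaf (no children)."""
    owner: Optional[str] = None  # the player who acts at this point
    children: Dict[str, "Node"] = field(default_factory=dict)  # action -> next point

    def is_leaf(self) -> bool:
        return not self.children


def lines(node: Node, prefix: tuple = ()) -> List[tuple]:
    """List every sequence of actions from this point down to a leaf."""
    if node.is_leaf():
        return [prefix]
    result = []
    for action, child in node.children.items():
        result.extend(lines(child, prefix + (f"{node.owner} {action}",)))
    return result


def undrawn() -> Node:
    """Hypothetical stand-in for a dangling arrow: more tree lies below."""
    return Node()


# BU folds, SB calls: the action moves to a point owned by the BB.
bb_after_sb_call = Node("BB", {"check": undrawn(), "raise": undrawn()})
# BU folds: the SB can fold (hand over), call, or raise.
sb_after_bu_fold = Node(
    "SB", {"fold": Node(), "call": bb_after_sb_call, "raise": undrawn()}
)
# The top of the tree: BU can fold, call, or raise.
root = Node("BU", {"fold": sb_after_bu_fold, "call": undrawn(), "raise": undrawn()})

for line in lines(root):
    print(" -> ".join(line))
```

In this sketch a leaf is just a node with no outgoing actions, so the dangling arrows look like leaves too. A fuller version would keep growing the tree below them and attach payoffs to the true leaves.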

So that’s a game. Strategy is another word that has an English meaning that’s close to but not quite the same as its technical definition. For us, a strategy for a player is something that tells him exactly how to make every decision he could face in the game. Practically, it tells him, for every one of his decision points and every hole card combination that doesn’t conflict with the board, how he will choose between each of the options available to him there. Now, we could imagine some fairly convoluted decision making processes, but we’ll generally restrict ourselves to one of the two following types. If a player takes one action all the time (with a particular hand at a particular point) we say he’s playing a pure strategy there, and if he chooses randomly between multiple options with certain probabilities (say fold $30\%$ and call $70\%$ ), then he’s playing a mixed strategy.
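One way to picture this definition is as a lookup table from (decision point, hand) pairs to action probabilities. The sketch below assumes exactly that representation. The decision point label, the two hands, and the helper names `is_pure` and `choose_action` are invented for illustration, and a real strategy would need an entry for every point and every hand.

```python
import random
from typing import Dict, Tuple

# (decision point, hole cards) -> {action: probability}
Strategy = Dict[Tuple[str, str], Dict[str, float]]

strategy: Strategy = {
    # Pure: this hand takes one action all the time at this point.
    ("BB facing river bet", "AhKh"): {"call": 1.0},
    # Mixed: this hand folds 30% and calls 70%, as in the text.
    ("BB facing river bet", "7c6c"): {"fold": 0.3, "call": 0.7},
}


def is_pure(action_probs: Dict[str, float]) -> bool:
    """True if exactly one action is taken with positive probability."""
    return sum(1 for p in action_probs.values() if p > 0) == 1


def choose_action(strategy: Strategy, point: str, hand: str,
                  rng: random.Random) -> str:
    """Pick an action at random according to the strategy's probabilities."""
    probs = strategy[(point, hand)]
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded only so that runs are repeatable
print(choose_action(strategy, "BB facing river bet", "7c6c", rng))
```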

Now, if we know a player’s strategy, we can find his range at any point in the game. We have his starting range, and then at each of his decision points, he splits the range with which he arrives there. He chooses an action to take for each component of his range. If we know a player’s range for taking each action, we can often more or less work out his strategy. For example, if we know he arrives at a point with $20\%$ of a hand, and his range for taking one action includes $15\%$ of the hand and the other includes $5\%$ , then we can reason that at that point, his strategy involves taking the first action three-quarters of the time and the second one-quarter of the time. However, if a player arrives at a point with $0\%$ of a hand (because his strategy is such that he never gets to this spot with this hand), then all of his subsequent action ranges must also contain $0\%$ of the hand. His strategy, by definition, must dictate his play here, but we can’t use his ranges to figure out his frequencies.

So if we know a strategy, we can find the ranges, and if we know ranges, we can work out parts of the strategy – those that we might consider most important – the parts that describe play in spots the players can actually get to when they play their strategies. For practical purposes, when we describe players’ strategies, we’ll usually talk about their ranges, but to be clear, they’re not exactly the same thing.
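The back-and-forth between strategies and ranges can be sketched in a few lines. Here I assume a range is stored as a mapping from hand to the percentage of that hand still in the range. The function names and the "bet"/"check" labels are my own, chosen to mirror the $20\%$, $15\%$, $5\%$ example above.

```python
from typing import Dict, Optional


def split_range(arriving: Dict[str, float],
                strategy_at_point: Dict[str, Dict[str, float]]
                ) -> Dict[str, Dict[str, float]]:
    """Strategy -> ranges: split the arriving range into one range per action."""
    action_ranges: Dict[str, Dict[str, float]] = {}
    for hand, weight in arriving.items():
        for action, prob in strategy_at_point[hand].items():
            action_ranges.setdefault(action, {})[hand] = weight * prob
    return action_ranges


def recover_frequencies(arriving_weight: float,
                        action_weights: Dict[str, float]
                        ) -> Optional[Dict[str, float]]:
    """Ranges -> strategy: work out how often a hand takes each action.

    Returns None when the hand arrives 0% of the time, since the ranges
    then tell us nothing about the frequencies.
    """
    if arriving_weight == 0:
        return None
    return {a: w / arriving_weight for a, w in action_weights.items()}


# A hand arrives 20% of the time and takes the first action three-quarters
# of the time (weights are in percent).
ranges = split_range({"AA": 20.0}, {"AA": {"bet": 0.75, "check": 0.25}})
print(ranges)

# Going the other way: 15% and 5% out of 20% give back the frequencies.
print(recover_frequencies(20.0, {"bet": 15.0, "check": 5.0}))

# A hand that never reaches this point: the ranges can't reveal the strategy.
print(recover_frequencies(0.0, {"bet": 0.0, "check": 0.0}))
```

The `None` case is the one to keep in mind: the strategy still has to say what to do there, but you can't read it off the ranges.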

Great, now we’re ready to revisit GTO in full force. So again, a set of strategies is GTO if no player can unilaterally deviate and increase his average profit. An equivalent way to put this is to say that every player is playing maximally exploitatively (i.e. as profitably as possible), given his opponents’ strategies. So, if all players but one in a game are playing strategies from a GTO set, then the last player can do no better than to also play his strategy from the set. A set of GTO strategies is also called an equilibrium or a Nash equilibrium, and if all players are playing their strategy from an equilibrium, we say we’re at equilibrium.
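To make the "no player can unilaterally deviate" test concrete, here is a rough sketch for a game much simpler than poker: two-player, zero-sum rock-paper-scissors, written as a payoff matrix for the row player. The game choice and the function names are my own assumptions. The check only tries pure deviations, which is enough because a player's EV is a weighted average of the EVs of his pure options.

```python
from typing import List

# Row player's payoff; the column player gets the negative (zero-sum).
# Order of actions for both players: rock, paper, scissors.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def expected_value(payoff: List[List[float]], row: List[float],
                   col: List[float]) -> float:
    """Row player's average payoff when both play the given mixed strategies."""
    return sum(row[i] * col[j] * payoff[i][j]
               for i in range(len(row)) for j in range(len(col)))


def is_equilibrium(payoff: List[List[float]], row: List[float],
                   col: List[float], tol: float = 1e-9) -> bool:
    """True if neither player can gain by unilaterally changing strategy."""
    ev = expected_value(payoff, row, col)
    # Best the row player can do against the column player's fixed strategy.
    best_row = max(sum(col[j] * payoff[i][j] for j in range(len(col)))
                   for i in range(len(row)))
    # Best the column player can do is to push the row player's EV lowest.
    best_col = min(sum(row[i] * payoff[i][j] for i in range(len(row)))
                   for j in range(len(col)))
    return best_row <= ev + tol and best_col >= ev - tol


uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

print(is_equilibrium(RPS, uniform, uniform))      # both mix evenly
print(is_equilibrium(RPS, always_rock, uniform))  # column can switch to paper
```

Notice that in the second case the rock-only player isn't losing anything against an even mix. The set fails the test because his opponent could do better by deviating.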

Let’s take a look at one consequence of these definitions that many players find counterintuitive. This isn’t super important in and of itself, but it’ll help us to become more familiar with the concepts. A GTO strategy can involve folding the nuts, even on the river. Suppose we’re at equilibrium. No player has any incentive to change his strategy. Imagine taking Hero’s strategy in a spot that play never reaches and tweaking it so that it folds the nuts a small amount of the time. By “small” here, I mean that we don’t start playing poorly enough that our opponents actually can improve their EV by switching up their play to arrive at that spot. Well then the tweaked strategy is still GTO, since it’s still the case no player can increase his EV by unilaterally deviating. Folding the nuts on the river doesn’t affect our EV if it’s in a spot we never get to at equilibrium. However, if we did get there (perhaps because Villain played a non-GTO strategy), we could find ourselves folding the nuts despite playing a GTO strategy.

This is a pretty good example of how the normal English meaning of “optimal” conflicts with our definition. Few people would call folding the nuts on the river optimal, but such play is consistent with a GTO strategy. By the way, notice that in the previous paragraph, we imagined constructing two distinct strategies for a player, and we said both were GTO. Indeed, there is no reason to think that GTO strategies are unique, and they’re often not. This point will become important for us shortly.
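The folding-the-nuts argument can be checked numerically in a toy river spot. Everything in this sketch is an assumption made for illustration: Hero holds the nuts, the pot is 1 chip, Villain either checks back or bets 1 chip, and Hero's payoff is the chips he wins from that point on (Villain's is the negative). At equilibrium Villain always checks, so Hero's bet-facing decision is never reached.

```python
POT, BET = 1.0, 1.0  # made-up chip amounts for the toy game


def hero_ev(villain_bet_prob: float, hero_fold_prob: float) -> float:
    """Hero's average winnings, holding the nuts."""
    ev_when_checked = POT  # showdown: Hero wins the pot
    # Facing a bet, Hero wins pot + bet when he calls and nothing when he folds.
    ev_when_bet = (1 - hero_fold_prob) * (POT + BET)
    return ((1 - villain_bet_prob) * ev_when_checked
            + villain_bet_prob * ev_when_bet)


def is_equilibrium(villain_bet_prob: float, hero_fold_prob: float,
                   tol: float = 1e-9) -> bool:
    """True if neither player gains by unilaterally deviating."""
    ev = hero_ev(villain_bet_prob, hero_fold_prob)
    # EV is linear in each player's own choice, so pure deviations suffice.
    hero_best = max(hero_ev(villain_bet_prob, f) for f in (0.0, 1.0))
    villain_best = min(hero_ev(b, hero_fold_prob) for b in (0.0, 1.0))
    return hero_best <= ev + tol and villain_best >= ev - tol


print(is_equilibrium(0.0, 0.0))  # Hero never folds the nuts
print(is_equilibrium(0.0, 0.1))  # Hero folds the nuts 10% of the time
print(is_equilibrium(0.0, 0.8))  # folds too often: Villain should start betting
print(hero_ev(0.0, 0.0), hero_ev(0.0, 0.1))  # same EV when Villain checks
print(hero_ev(1.0, 0.0), hero_ev(1.0, 0.1))  # the fold costs if Villain bets
```

With these particular numbers, Hero can fold up to half the time before Villain prefers betting, so "small" is fairly generous here. The first two strategy sets both pass the check, which is the non-uniqueness point in miniature.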

The next section, GTO Play in Cash Games and Tournaments, will be posted eventually, and the full book is available now: Excelling at No-Limit Hold’em.