Watson’s wagering strategies

Editor’s note: This guest post from IBM Researcher Dr. Gerald Tesauro is the third article in a three-part series about how Watson plays America’s favorite quiz show®.

Daily Doubles and Final Jeopardy! are often the most critical junctures of a Jeopardy! game; the amount wagered can make a big difference in a player’s overall chances to win. How does Watson decide on the amount?

Daily Double wagering

In principle, to compute the best Daily Double (DD) bet, a player must answer two basic questions:

(1) How likely am I to answer the DD clue correctly?

(2) How much will a given bet increase or decrease my winning chances when I get the DD right or wrong?

Humans can at best make crude estimates of these quantities. By contrast, Watson uses mathematical models that answer both questions with far greater precision than humans can achieve.

Match Play

The Watson-Jeopardy Challenge is spread over two games, with combined totals determining the winner. This style of play requires different strategies than a typical game. Final Jeopardy! of game one is analogous to “half time,” so all competitors need a different strategy there than in game two, which is the last chance to win.

To address the first question, Watson uses an “in-category DD confidence” model. Based on thousands of tests on historical Jeopardy! categories containing DDs, the model estimates Watson’s DD accuracy, given the number of previously seen clues in the category that Watson got right and wrong.
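As a rough illustration of how such an estimate could respond to in-category evidence, the sketch below pulls the right/wrong counts toward a baseline accuracy. The function name, the 0.7 baseline, and the prior weight are all hypothetical. The actual model was fit to historical categories, and its form is not described in this article.

```python
def dd_confidence(right, wrong, prior_accuracy=0.7, prior_weight=4.0):
    """Toy estimate of DD accuracy from in-category results so far.

    Assumption: simple Beta-style smoothing, NOT Watson's real model.
    prior_accuracy and prior_weight are made-up illustrative values.
    """
    if right < 0 or wrong < 0:
        raise ValueError("counts must be non-negative")
    # Blend the baseline (worth prior_weight pseudo-clues) with the
    # clues actually seen in this category.
    return (prior_accuracy * prior_weight + right) / (prior_weight + right + wrong)
```

With no clues seen, the estimate equals the baseline. Each correct answer in the category pushes it up, and each miss pushes it down.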

Watson tackles the second question by using a Game State Evaluator (GSE), a complex regression model that estimates Watson’s winning chances at any stage of the game, given the information set that describes the current game state (for example, the scores of the three players, the number of remaining clues, the value of remaining clues, and the number of remaining DDs).

The GSE was trained over millions of simulated Jeopardy! contests pitting Watson against two simulated human opponents. The human opponent models in these simulations capture important statistical profiles of human contestants, such as how often they attempt to buzz in, how often they are right when they win the buzz, and their accuracy on DDs and Final Jeopardy! clues.

Optimal wagering

By combining the GSE with the in-category DD confidence, Watson can compute an overall expected chance to win the game for any given DD bet. This analysis runs for every legal betting amount – from the $5 DD minimum, to its entire bankroll for a True Daily Double – to come up with an optimal amount. The calculation also uses risk analytics to trade off expected winning chances against the risk of a particular bet.
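The search over bets can be sketched in a few lines. The version below is a minimal illustration, not Watson's implementation: `toy_win_probability` is a hypothetical stand-in for the GSE (a logistic curve on Watson's lead, with an arbitrary scale), and the risk adjustment mentioned above is left out, so the loop simply maximizes expected winning chance.

```python
import math

def toy_win_probability(watson, opp_a, opp_b):
    """Hypothetical stand-in for the Game State Evaluator.

    Assumption: win chance is a logistic function of Watson's lead over
    the stronger opponent. The real GSE uses many more features.
    """
    lead = watson - max(opp_a, opp_b)
    return 1.0 / (1.0 + math.exp(-lead / 4000.0))

def best_dd_bet(watson, opp_a, opp_b, p_correct, min_bet=5):
    """Try every legal bet and return (bet, expected win chance)."""
    best_bet, best_value = None, -1.0
    for bet in range(min_bet, watson + 1):
        win_if_right = toy_win_probability(watson + bet, opp_a, opp_b)
        win_if_wrong = toy_win_probability(watson - bet, opp_a, opp_b)
        # Weight the two outcomes by the DD confidence estimate.
        value = p_correct * win_if_right + (1.0 - p_correct) * win_if_wrong
        if value > best_value:
            best_bet, best_value = bet, value
    return best_bet, best_value
```

Because every dollar amount is evaluated, nothing in the search favors round numbers, which is one way to see why the chosen bet need not be a multiple of $100.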

Watson’s resulting bet might seem unusual: it is often far more aggressive, or far more conservative, than typical human bets. The amount may also be a non-round value (that is, not an exact multiple of $100). Such values may make the arithmetic a little harder for the human contestants when they compute their own bets.

Final Jeopardy! wagering

In calculating a Final Jeopardy! (FJ) wager, Watson first needs to know if it is playing a single game or a two-game match (see call-out box: Match Play). In the latter case, Watson will use very different strategies for game one and game two. The analysis for game one is similar to Daily Double analysis: Watson uses a statistical model of likely human bets, human FJ accuracy, and Watson’s FJ accuracy to calculate its expected winning chances for every legal bet. It then selects the bet giving the best risk-adjusted chance to win the match.

While there are no previously revealed clues in the FJ round, Watson does obtain evidence of its likely FJ accuracy from the category title. Given the title, Watson first computes several salient features via Natural Language Processing analysis. It then consults a “FJ prior accuracy” regression model, based on Watson’s performance on thousands of historical FJ categories, to predict Watson’s accuracy given the category features.

Wagering in game two of a match is similar to FJ in ordinary games. The predominant consideration is score positioning (first, second or third place). In some cases, contestants need the kind of strategic reasoning used in games like Rock-Paper-Scissors: each must predict the opponents’ bets while taking into account that the opponents are trying to predict that contestant’s own bet in turn.

Watson has been programmed with a library of known FJ strategy rules, such as Two-Thirds Betting and Shore’s Conjecture. The research team also added novel rules for some special situations that we discovered.[1]

Depending on the situation, Watson will either bet according to a suitable strategy rule, or it will run a real-time simulation to calculate the best bet, among all legal bets. For the match with Ken and Brad, Watson will also take into account the prize values for second place ($300,000) and third place ($200,000), leading to a different objective than simply trying to win the match.
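To see how the runner-up prizes change the objective, consider the hedged sketch below. It scores a bet by expected prize money rather than by win probability alone. The function name is hypothetical, and the first-place prize is a required parameter because it is not stated above.

```python
def expected_prize(p_first, p_second, p_third, first_prize,
                   second_prize=300_000, third_prize=200_000):
    """Expected payout given the chance of each finishing position.

    The second- and third-place defaults come from the text; the
    first-place prize must be supplied by the caller (assumption).
    """
    if abs(p_first + p_second + p_third - 1.0) > 1e-9:
        raise ValueError("finishing probabilities must sum to 1")
    return (p_first * first_prize
            + p_second * second_prize
            + p_third * third_prize)
```

For example, assuming a first prize of $1,000,000, a bet with finishing chances (0.50, 0.10, 0.40) is worth less in expectation than one with (0.48, 0.40, 0.12), even though the first bet wins more often. The second bet gives up a little winning chance in exchange for a much lower risk of finishing third.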

[1] One such rule in ordinary FJ applies when the leader’s score exactly equals the sum of the other two players’ scores, for example, if Watson has $20,000 and the two humans have $13,000 and $7,000. Watson would normally bet $6,001, to win by $1 if the second-place player doubles her score. In this case, however, Watson will bet $6,000 and accept a tie for first place. The reason is that if Watson bets $6,001 and is wrong, the third-place player can double up and win by $1 ($14,000 to $13,999) if the second-place player is also wrong.
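The arithmetic in this footnote can be checked by enumerating all eight right/wrong combinations. The sketch below assumes that both human players bet their entire scores and, as the footnote does, counts a tie for first as not a loss. The function names are illustrative only.

```python
from itertools import product

def fj_outcomes(scores, bets):
    """Yield (correct_flags, final_scores) for every right/wrong combo."""
    for flags in product([True, False], repeat=len(scores)):
        finals = tuple(s + b if ok else s - b
                       for s, b, ok in zip(scores, bets, flags))
        yield flags, finals

def watson_losing_cases(watson_bet):
    """List the outcomes in which Watson finishes strictly behind someone."""
    scores = (20_000, 13_000, 7_000)
    # Assumption: both humans wager everything they have.
    bets = (watson_bet, 13_000, 7_000)
    return [(flags, finals)
            for flags, finals in fj_outcomes(scores, bets)
            if finals[0] < max(finals)]
```

Under these assumptions, a $6,001 bet loses in three of the eight combinations, including the one where only the third-place player is right ($13,999 against $14,000). A $6,000 bet loses in only two, because that same combination becomes a $14,000 tie.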