A03: Connect Four AI
This assignment helps you practice with adversarial search, specifically minimax with alpha-beta pruning. Your task is to improve a given Connect Four AI.
The game Connect Four is a classic example of zero-sum, deterministic, perfect information games. Just like tic-tac-toe, both players can see the entire state of the game, there is no randomness involved (no dice, for example), and only one player can win all the points (so it’s zero-sum).
We’ll use the following simple rules for our variation of the game:
- The game is played on a grid of 7 horizontal positions and 6 vertical positions.
- Players alternate adding one token to the grid. A token may be added to any column that is not full, and the token falls to the lowest free slot on that column.
- The first player to get four tokens in a row horizontally or vertically or diagonally with no gaps wins.
Because Connect Four is a zero-sum, deterministic, perfect information game, we can build an AI with minimax. Of course, since we’re not crazy, we’ll use alpha-beta pruning as well. Recall that alpha-beta pruning is an optimization that doesn’t change the best move that minimax would normally find.
I have implemented the Connect Four game and an AI. Find it on londo at
Analysis of AI performance
Your task is to improve the AI that’s coded in the Python script. You are not asked to change the fundamental algorithm, but rather make it search fewer states. In order to test the number of states searched, the Python script has additional code for running experiments.
connect-four.py code has two functions,
experiment_improved(), that generate various random starting configurations with different numbers of pre-established moves. The first function,
experiment(), runs vanilla minimax and minimax with alpha-beta pruning. We would expect that alpha-beta pruning would result in significantly fewer examined states. The other function,
experiment_improved(), runs minimax with alpha-beta pruning against your improved version of the AI (see the “Task” section below).
Regarding the minimax with/without alpha-beta experment: Note that if we allow the minimax-driven AI to search all possible moves (to maximum depth), the AI would produce perfect play. However, that’s not possible since there are about 4.5 quadrillion possible leaf nodes in the search tree. Limiting the depth of the search tree solves our problem and gives a good (though not perfect) AI.
A summary of the benefit of alpha-beta pruning is shown the plot below. Several experiments at different depth limits are shown. We see that different depth limits result in different kinds of savings with alpha-beta. Specifically, very deep search (six moves ahead) results in extreme savings: with no pre-established moves, alpha-beta pruning reduces the search space by about 99% (about 135k searched states vs. about 1300 searched states).
Your task is to improve the provided Connect Four AI implementation, which uses minimax plus alpha-beta pruning, so that the AI is even more efficient but also wins more games. Your improvements must reduce the number of states examined (with statistical significance) and win more than 50% of the games against the default minimax alpha-beta AI that’s already implemented.
- Examine the code in
connect-four.py. You’ll find three functions that effectively do nothing (each marked with TODO):
- Write new definitions for those three functions that produce a more efficient and successful search.
Test your improvements by running the
run-connect-four-improved-analysis.sh script. That script will run
connect-four.py and save the output to a CSV file; it will then produce a statistical summary of the results plus a plot. This script takes about three minutes to run. Here is an example of its output:
moves, improved_won, alphabeta, improved, pct_improvement, firstmove 0, 0, 329, 342, -3.95, ab 1, 0, 324, 356, -9.88, ab ... Running R...  "New strategy win percent (you want this > 50): 64.07%"  "Average improvement in efficiency (you want this > 0): 3.13%"  "p-value of efficiency differences (you want this < 0.05): 0.00"
Turn in the improved
Out of 5 points:
- 5 pts: Your improved functions result in fewer states checked (improved efficiency), with statistical significance (p-value < 0.05) and produce wins more than 50% of the time.
- 4 pts: Your improved functions result produce wins more than 50% of the time, even though they are not more efficient.
- 3 pts: Your improved functions result in fewer states checked (more efficient), but fail to produce wins.
- 2 pts: Your improved functions are neither more efficient nor produce more wins, but you have correctly implemented some good ideas in the improved functions (this is a judgment call).
- 1 pt: Regardless of the quality of your improvements, the code does not run to completion.
- 0 pts: No submission.
Consider the impact each of the following have on minimax and alpha-beta pruning:
- the order that “possible transitions” (next moves) are examined, depending on whether the search is minimizing or maximizing, and how alpha-beta pruning is affected by this order; the default
possible_transitionsfunction randomizes its output just to magnify any improvement you can think of making
- the accuracy of the utility function
- the accuracy of the estimated utility function
Also, consider modifying the last lines in the Python code to execute
game_loop() instead of the
experiment_improved() function. The game loop function will allow you to play the game interactively.