Neural Network Black Jack

For a final project in my Neural Networks course, my partner and I created a Black Jack game in C++ that uses a neural network trained by reinforcement learning. The game has three players: the dealer, neural network 1, and either a human player or neural network 2.

For those unfamiliar with the game, Black Jack (or Twenty-One) is a card game in which the player tries to beat the dealer by obtaining a sum of card values that is higher than the dealer's but no greater than 21. Each card has the same value as its index, except for the ace (which counts as 1 or 11, at the player's choice) and the face cards (which count as 10). At the beginning of the game, each player is dealt two cards, one face up and one face down. After looking at the first two cards, the player chooses either to draw another card (hit) or to stop drawing (stand). The player may take as many hits as desired without busting, i.e., as long as the sum of card values in the hand does not exceed 21. Once all the players have finished their hands, the dealer reveals the face-down card and draws until reaching a total of 17 or above (the standard dealer rule in casino Black Jack).
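The card-value rules above can be sketched in a few lines of C++. This is only an illustration, not the project's code: the function names `handValue` and `isBust` and the rank encoding (1 for an ace, 11 to 13 for face cards) are assumptions made for the example.

```cpp
#include <vector>

// Hypothetical rank encoding for this sketch:
// 1 = ace, 2-10 = number cards, 11-13 = jack, queen, king.
int handValue(const std::vector<int>& ranks) {
    int total = 0;
    int aces = 0;
    for (int rank : ranks) {
        if (rank == 1) {
            ++aces;
            total += 1;       // count every ace as 1 first
        } else if (rank >= 11) {
            total += 10;      // face cards count as 10
        } else {
            total += rank;    // number cards count as their index
        }
    }
    // Count one ace as 11 instead of 1 when that does not bust the hand.
    // (Two aces at 11 would already exceed 21, so one is the maximum.)
    if (aces > 0 && total + 10 <= 21) {
        total += 10;
    }
    return total;
}

// A hand is bust when its best possible value exceeds 21.
bool isBust(const std::vector<int>& ranks) {
    return handValue(ranks) > 21;
}
```

With this encoding, an ace and a king give `handValue({1, 13}) == 21`, while a ten, a six, and an ace give 17, because counting the ace as 11 would bust the hand.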

In our implementation of the game, the dealer plays by a flat rule. With a total below 17, the dealer hits. With 17 or more, the dealer stands, unless a player's hand beats the dealer's, in which case the dealer hits. This flat rule served as our benchmark. Techniques such as "splitting pairs" and "doubling down" were left out for the sake of simplicity: our goal was to use a neural network to play Black Jack, not to create a complex Black Jack simulation.
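The flat dealer rule fits in a single function. The sketch below is one reading of it, not the original source: the name `dealerShouldHit` is hypothetical, and it assumes that "a player has it beat" means some player who has not busted holds a higher total than the dealer.

```cpp
// Returns true if the dealer should draw another card.
// dealerTotal:     the dealer's current hand value.
// bestPlayerTotal: the highest total among players who have not busted,
//                  or 0 if every player has busted (an assumed convention).
bool dealerShouldHit(int dealerTotal, int bestPlayerTotal) {
    if (dealerTotal > 21) {
        return false;                      // dealer has already busted
    }
    if (dealerTotal < 17) {
        return true;                       // below 17: always hit
    }
    return bestPlayerTotal > dealerTotal;  // 17 or more: hit only when beaten
}
```

Under these assumptions, a dealer holding 18 against a player's 20 keeps drawing, whereas a casino dealer bound by the plain 17 rule would stand.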

The learning equation for our neural network was:
To clear up any confusion with the equation (or possibly to confuse you even more), I'll take you through an example of our neural network learning. The reinforcement signal takes one of three values: -1 if the NN busts, +1 if the NN wins, and 0 otherwise. The learning rate α is 0.1, and the discount factor γ is 0.9. We also introduced two terminal states: s=21 (Q=1) for a perfect score, and s=-1 (Q=-1) for a bust.
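The quantities named here (a reinforcement signal r, a learning rate α, a discount factor γ, and Q values) are the ingredients of the standard Q-learning update, Q(s,a) ← Q(s,a) + α [r + γ max Q(s',a') − Q(s,a)]. Assuming that is the rule in question, a single learning step might look like the sketch below. It swaps the neural network for a plain lookup table to keep the example short, and the names `QTable`, `bestValue`, and `update` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <map>

enum Action { HIT = 0, STAND = 1 };

const double ALPHA = 0.1;  // learning rate, as given in the text
const double GAMMA = 0.9;  // discount factor, as given in the text

// Q values indexed by state (hand total) and action.
// Assumption: a lookup table stands in for the neural network.
using QTable = std::map<int, std::array<double, 2>>;

// Best achievable value from state s. The two terminal states from the
// text are pinned: s = 21 has Q = 1 and s = -1 (bust) has Q = -1.
double bestValue(const QTable& q, int s) {
    if (s == 21) return 1.0;
    if (s == -1) return -1.0;
    auto it = q.find(s);
    if (it == q.end()) return 0.0;  // unseen states start at 0
    return std::max(it->second[HIT], it->second[STAND]);
}

// One Q-learning step for taking action a in state s, receiving
// reinforcement r, and landing in state sNext. Returns the new Q(s,a).
double update(QTable& q, int s, Action a, double r, int sNext) {
    double& qsa = q[s][a];  // new entries are zero-initialized
    qsa += ALPHA * (r + GAMMA * bestValue(q, sNext) - qsa);
    return qsa;
}
```

Starting from an empty table, suppose the NN holds 15, hits, and draws a 6. The reinforcement is 0 and the next state is the terminal s=21, so `update(q, 15, HIT, 0.0, 21)` moves Q(15, hit) from 0 to 0.1 × (0 + 0.9 × 1 − 0) = 0.09.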

NN Black Jack Files:

All source material for © Roger Grayson. Contents may not be cited or reproduced, in whole or in part, without prior written consent of the author.