Developing a neural network to play a snake game usually consists of three steps.
- Training data generation
- Training neural network
- Testing
The full code can be found here
In this tutorial, I will show you how to generate the training data. To do this, we first need a working snake game, which you can build by following this blog.
Training data consists of inputs and their corresponding outputs. Here, I have used the following inputs and outputs.
The input consists of 7 nodes:
- Is the left blocked, i.e. is there an obstacle to the left (1 or 0)
- Is the front blocked, i.e. is there an obstacle in front (1 or 0)
- Is the right blocked, i.e. is there an obstacle to the right (1 or 0)
- Apple direction vector from snake (X)
- Apple direction vector from snake (Y)
- Snake’s current direction vector (X)
- Snake’s current direction vector (Y)
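Each row of the input data is therefore a list of 7 numbers. Note that the generation code later in this post stores the four direction components interleaved: apple X, snake X, apple Y, snake Y.
The output consists of 3 nodes:
- [1,0,0] turns the snake left
- [0,1,0] keeps the snake moving in the same direction
- [0,0,1] turns the snake right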
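To make the format concrete, here is a small sketch of a single training sample. The values are made up for illustration (a snake heading right, the apple straight ahead, an obstacle on the left), and the names `features`, `FEATURE_ORDER`, `sample_x` and `sample_y` are hypothetical, not part of the original code.

```python
# Hypothetical sample: the snake is heading right, the apple is straight ahead,
# and there is an obstacle on the snake's left.
features = {
    "left_blocked": 1,
    "front_blocked": 0,
    "right_blocked": 0,
    "apple_dir_x": 1.0,
    "apple_dir_y": 0.0,
    "snake_dir_x": 1.0,
    "snake_dir_y": 0.0,
}

# Order in which generate_training_data (later in this post) stores the values:
# the three blocked flags, then apple x, snake x, apple y, snake y.
FEATURE_ORDER = [
    "left_blocked", "front_blocked", "right_blocked",
    "apple_dir_x", "snake_dir_x", "apple_dir_y", "snake_dir_y",
]
sample_x = [features[name] for name in FEATURE_ORDER]

# One-hot output: index 0 = turn left, 1 = keep going, 2 = turn right.
sample_y = [0, 1, 0]
```

Whatever order you pick, the only hard requirement is that the same order is used when generating data and when feeding the trained network.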
Now the big question: how do we generate this data? You could sit and play as many games as you can, but it is much better to generate the data automatically. Let’s see how to do this.
Generating Training Data
Here I have generated the training data automatically, using the angle between the snake and the apple. Based on that angle, I decide in which direction the snake should move. First, let’s calculate the angle.
Calculating the angle between the snake and the apple:
To calculate the angle between the snake and the apple, we only need two parameters: the snake position and the apple position.
In the following code, I first calculate the snake’s current direction vector and the apple’s direction from the snake’s current position. The snake direction vector is obtained by subtracting the 1st element of the snake’s list from the 0th element (the head). To get the apple direction from the snake, subtract the 0th element of the snake’s list from the apple’s position.
Then normalize these direction vectors and calculate the angle with the help of the math library. The code is as follows:
    import math
    import numpy as np

    def angle_with_apple(snake_position, apple_position):
        # Vector from the snake's head to the apple
        apple_direction_vector = np.array(apple_position) - np.array(snake_position[0])
        # Current heading: head minus the segment right behind it
        snake_direction_vector = np.array(snake_position[0]) - np.array(snake_position[1])

        norm_of_apple_direction_vector = np.linalg.norm(apple_direction_vector)
        norm_of_snake_direction_vector = np.linalg.norm(snake_direction_vector)
        # Guard against division by zero when a vector has zero length
        if norm_of_apple_direction_vector == 0:
            norm_of_apple_direction_vector = 10
        if norm_of_snake_direction_vector == 0:
            norm_of_snake_direction_vector = 10

        apple_direction_vector_normalized = apple_direction_vector / norm_of_apple_direction_vector
        snake_direction_vector_normalized = snake_direction_vector / norm_of_snake_direction_vector
        # Signed angle from the snake's heading to the apple, scaled to [-1, 1]
        angle = math.atan2(
            apple_direction_vector_normalized[1] * snake_direction_vector_normalized[0]
            - apple_direction_vector_normalized[0] * snake_direction_vector_normalized[1],
            apple_direction_vector_normalized[1] * snake_direction_vector_normalized[1]
            + apple_direction_vector_normalized[0] * snake_direction_vector_normalized[0]) / math.pi
        # The vectors are returned too, because generate_training_data unpacks four values
        return angle, snake_direction_vector, apple_direction_vector_normalized, snake_direction_vector_normalized
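As a quick check of the sign convention, here is a minimal sketch of the same signed-angle formula on three hand-picked cases. The helper name `signed_angle` is hypothetical, and I am assuming screen coordinates where y grows downward. Normalization is skipped here because scaling both vectors does not change the result of `atan2`.

```python
import math
import numpy as np

def signed_angle(snake_dir, apple_dir):
    """Signed angle from snake_dir to apple_dir, divided by pi (range -1..1)."""
    s = np.asarray(snake_dir, dtype=float)
    a = np.asarray(apple_dir, dtype=float)
    cross = s[0] * a[1] - s[1] * a[0]  # sign tells which side the apple is on
    dot = s[0] * a[0] + s[1] * a[1]    # positive when the apple is ahead
    return math.atan2(cross, dot) / math.pi

# Snake heading right (+x); y grows downward on the screen.
print(signed_angle([10, 0], [50, 0]))   # apple straight ahead -> 0.0
print(signed_angle([10, 0], [0, 30]))   # apple below, i.e. on the snake's right -> 0.5
print(signed_angle([10, 0], [0, -30]))  # apple above, i.e. on the snake's left -> -0.5
```

A positive value means the apple is to the snake's right, a negative value means it is to the left, and 0 means it is straight ahead.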
After calculating the angle, the next step is to decide in which direction the snake should move.
Calculating direction according to the angle:
If the angle calculated above is > 0, the apple is on the right side of the snake, so the snake should move to the right. If it is < 0, the snake should move left, and = 0 means it should continue in the same direction. I have used 1, -1 and 0 for right, left and front respectively.
I have used the following steps to get the correct button direction (up, down, right, left, or 3, 2, 1, 0 respectively) for the snake’s next step.
- First, I calculate the snake’s current direction.
- Then, to turn the snake left or right, I calculate the left or right direction vector from the snake’s current direction vector.
- Finally, I convert that direction vector into the button direction.
    def generate_next_direction(snake_position, angle_with_apple):
        # 1 = turn right, -1 = turn left, 0 = keep going straight
        direction = 0
        if angle_with_apple > 0:
            direction = 1
        elif angle_with_apple < 0:
            direction = -1
        else:
            direction = 0

        # Current heading and its 90-degree rotations (screen coordinates, y grows downward)
        current_direction_vector = np.array(snake_position[0]) - np.array(snake_position[1])
        left_direction_vector = np.array([current_direction_vector[1], -current_direction_vector[0]])
        right_direction_vector = np.array([-current_direction_vector[1], current_direction_vector[0]])

        new_direction = current_direction_vector
        if direction == -1:
            new_direction = left_direction_vector
        if direction == 1:
            new_direction = right_direction_vector

        button_direction = generate_button_direction(new_direction)
        return direction, button_direction

    def generate_button_direction(new_direction):
        # Button codes: 0 = left, 1 = right, 2 = down, 3 = up
        button_direction = 0
        if new_direction.tolist() == [10, 0]:
            button_direction = 1
        elif new_direction.tolist() == [-10, 0]:
            button_direction = 0
        elif new_direction.tolist() == [0, 10]:
            button_direction = 2
        else:
            button_direction = 3
        return button_direction
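The left and right vectors above are just 90-degree rotations of the current heading. As a sketch, the snippet below lists what a left or right turn means for each of the four headings, assuming a step size of 10 and a screen where y grows downward. The names `turn` and `BUTTON_FOR_VECTOR` are hypothetical helpers for this illustration only.

```python
# Button codes used in this post: 0 = left, 1 = right, 2 = down, 3 = up.
BUTTON_FOR_VECTOR = {(-10, 0): 0, (10, 0): 1, (0, 10): 2, (0, -10): 3}

def turn(current, direction):
    """Rotate a heading by 90 degrees: -1 = left, 1 = right, 0 = keep going."""
    x, y = current
    if direction == -1:
        return (y, -x)
    if direction == 1:
        return (-y, x)
    return (x, y)

# Screen coordinates: x grows to the right, y grows downward.
headings = [("right", (10, 0)), ("down", (0, 10)), ("left", (-10, 0)), ("up", (0, -10))]
for name, heading in headings:
    left_button = BUTTON_FOR_VECTOR[turn(heading, -1)]
    right_button = BUTTON_FOR_VECTOR[turn(heading, 1)]
    print(f"heading {name}: left turn -> button {left_button}, right turn -> button {right_button}")
```

For example, a snake heading right that turns left ends up moving up (button 3), while the same snake turning right ends up moving down (button 2).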
Now, for every step, the angle and the corresponding next direction are calculated and the snake moves accordingly. For each step, the inputs and outputs are also calculated and appended to the training data lists. To generate training data, we need to keep a record of the 7 inputs and 3 outputs for every step the snake takes. First, let’s see how I have calculated the inputs for every step.
- To check if the direction is blocked, we look one step ahead in each direction.
- Snake direction vector = Snake’s Head (0th index) – Snake’s 1st index
- Apple direction from the snake = Apple’s position – Snake’s head position (See the figure below)
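The look-ahead check from the first bullet can be sketched as follows. This is not the game's own `blocked_directions` function; it is a simplified stand-in that assumes a 500x500 board, a step of 10, and a snake stored as a list of `[x, y]` lists with the head first. `BOARD_SIZE`, `is_blocked` and `blocked_flags` are hypothetical names.

```python
import numpy as np

BOARD_SIZE = 500  # assumed board width and height in pixels
STEP = 10         # assumed size of one snake step

def is_blocked(snake_position, direction_vector):
    """Return 1 if moving the head by direction_vector hits a wall or the body."""
    next_head = (np.array(snake_position[0]) + np.array(direction_vector)).tolist()
    hits_wall = not (0 <= next_head[0] < BOARD_SIZE and 0 <= next_head[1] < BOARD_SIZE)
    hits_body = next_head in snake_position[1:]
    return int(hits_wall or hits_body)

def blocked_flags(snake_position):
    """Look one step to the left, front and right of the current heading."""
    current = np.array(snake_position[0]) - np.array(snake_position[1])
    left = np.array([current[1], -current[0]])
    right = np.array([-current[1], current[0]])
    return (is_blocked(snake_position, left),
            is_blocked(snake_position, current),
            is_blocked(snake_position, right))

# Snake moving right along the top edge: the wall is on its left.
print(blocked_flags([[10, 0], [0, 0]]))  # (1, 0, 0)
```

The three flags returned here correspond to the first three input nodes.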
For every step, the output is generated by first calculating the direction for the given snake and apple positions, using the angle between them. We then need to convert our direction (-1, 0 or 1) to the output (Y), a one-hot vector. For every predicted direction, we check whether that direction is blocked; if it is, we fall back to a direction that is free and create the output (Y) accordingly. The code given below is a bit long, but all it does is calculate the training data output (Y).
    def generate_training_data_y(snake_position, angle_with_apple, button_direction, direction, training_data_y,
                                 is_front_blocked, is_left_blocked, is_right_blocked):
        # direction_vector(...) is not shown in this post; it is called here to get the
        # (direction, button_direction) pair for the direction passed as its last argument.
        if direction == -1:
            # Preferred move is left
            if is_left_blocked == 1:
                if is_front_blocked == 1 and is_right_blocked == 0:
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, 1)
                    training_data_y.append([0, 0, 1])
                elif is_front_blocked == 0 and is_right_blocked == 1:
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, 0)
                    training_data_y.append([0, 1, 0])
                elif is_front_blocked == 0 and is_right_blocked == 0:
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, 1)
                    training_data_y.append([0, 0, 1])
            else:
                training_data_y.append([1, 0, 0])
        elif direction == 0:
            # Preferred move is straight ahead
            if is_front_blocked == 1:
                if is_left_blocked == 1 and is_right_blocked == 0:
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, 1)
                    training_data_y.append([0, 0, 1])
                elif is_left_blocked == 0 and is_right_blocked == 1:
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, -1)
                    training_data_y.append([1, 0, 0])
                elif is_left_blocked == 0 and is_right_blocked == 0:
                    training_data_y.append([0, 0, 1])
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, 1)
            else:
                training_data_y.append([0, 1, 0])
        else:
            # Preferred move is right
            if is_right_blocked == 1:
                if is_left_blocked == 1 and is_front_blocked == 0:
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, 0)
                    training_data_y.append([0, 1, 0])
                elif is_left_blocked == 0 and is_front_blocked == 1:
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, -1)
                    training_data_y.append([1, 0, 0])
                elif is_left_blocked == 0 and is_front_blocked == 0:
                    direction, button_direction = direction_vector(snake_position, angle_with_apple, -1)
                    training_data_y.append([1, 0, 0])
            else:
                training_data_y.append([0, 0, 1])
        # If all three directions are blocked, nothing is appended for this step
        return direction, button_direction, training_data_y
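If the nested branches are hard to follow, the same decision rule can be sketched as a lookup table: try the preferred direction first, then the fallbacks in a fixed order. This is only a compact restatement read off the branches above, not the code used in the project. `ONE_HOT`, `FALLBACKS` and `choose_direction` are hypothetical names.

```python
# One-hot labels: -1 = left, 0 = front, 1 = right.
ONE_HOT = {-1: [1, 0, 0], 0: [0, 1, 0], 1: [0, 0, 1]}

# Fallback order when the preferred direction is blocked,
# read off the branches of generate_training_data_y.
FALLBACKS = {-1: [1, 0], 0: [1, -1], 1: [-1, 0]}

def choose_direction(preferred, is_left_blocked, is_front_blocked, is_right_blocked):
    """Return (direction, one_hot_label), or (None, None) if the snake is boxed in."""
    blocked = {-1: is_left_blocked, 0: is_front_blocked, 1: is_right_blocked}
    for candidate in [preferred] + FALLBACKS[preferred]:
        if not blocked[candidate]:
            return candidate, ONE_HOT[candidate]
    return None, None

# The apple is to the left, but the left and front cells are blocked: turn right.
print(choose_direction(-1, 1, 1, 0))  # (1, [0, 0, 1])
```

The `(None, None)` case matches the situation where the original function appends nothing because every direction is blocked.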
Here, I have used 1000 games to generate the training data, each consisting of at most 2000 steps. Two empty lists, one for the input training data (X) and another for the output training data (Y), are created once and will hold the whole training data. For every game, I re-initialize the snake position, apple position, and score. The code is as follows:
    from tqdm import tqdm

    def generate_training_data(display, clock):
        training_data_x = []
        training_data_y = []
        training_games = 1000
        steps_per_game = 2000

        # starting_positions, apple_distance_from_snake, blocked_directions and play_game
        # are not shown in this post; they are assumed to come from the snake game code.
        for _ in tqdm(range(training_games)):
            snake_start, snake_position, apple_position, score = starting_positions()
            prev_apple_distance = apple_distance_from_snake(apple_position, snake_position)

            for _ in range(steps_per_game):
                angle, snake_direction_vector, apple_direction_vector_normalized, snake_direction_vector_normalized = angle_with_apple(snake_position, apple_position)
                direction, button_direction = generate_next_direction(snake_position, angle)
                current_direction_vector, is_front_blocked, is_left_blocked, is_right_blocked = blocked_directions(snake_position)
                # Pass the computed angle, not the angle_with_apple function itself
                direction, button_direction, training_data_y = generate_training_data_y(snake_position, angle, button_direction, direction, training_data_y, is_front_blocked, is_left_blocked, is_right_blocked)

                # The snake is boxed in: no label was appended, so end this game
                if is_front_blocked == 1 and is_left_blocked == 1 and is_right_blocked == 1:
                    break

                # Stored order: blocked flags, then apple x, snake x, apple y, snake y
                training_data_x.append([is_left_blocked, is_front_blocked, is_right_blocked,
                                        apple_direction_vector_normalized[0], snake_direction_vector_normalized[0],
                                        apple_direction_vector_normalized[1], snake_direction_vector_normalized[1]])

                snake_position, apple_position, score = play_game(snake_start, snake_position, apple_position, button_direction, score, display, clock)

        return training_data_x, training_data_y
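Before using the generated lists, it can be worth checking that they line up. The sketch below is one way to do that, assuming the lists have the 7-input / 3-output layout described in this post. `to_arrays` is a hypothetical helper and the two sample rows are made up.

```python
import numpy as np

def to_arrays(training_data_x, training_data_y):
    """Convert the generated lists to NumPy arrays and run basic sanity checks."""
    x = np.array(training_data_x, dtype=float).reshape(-1, 7)
    y = np.array(training_data_y, dtype=float).reshape(-1, 3)
    # Every input row needs exactly one label
    assert len(x) == len(y), "X and Y must have the same number of rows"
    # Every label must be a one-hot vector
    assert np.all(y.sum(axis=1) == 1), "each output row must contain a single 1"
    return x, y

# Two made-up rows, just to show the expected shapes.
x, y = to_arrays(
    [[1, 0, 0, 1.0, 1.0, 0.0, 0.0], [0, 0, 1, 0.0, 1.0, -1.0, 0.0]],
    [[0, 1, 0], [1, 0, 0]],
)
print(x.shape, y.shape)  # (2, 7) (2, 3)
```

A length mismatch here would usually mean a step appended an input without a label, or the other way around.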
You should now have a feel for how training data is generated for the snake game with deep learning. In the next blog, we will use this data to train and test our neural network. I hope you enjoyed reading.
If you have any doubts or suggestions, please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.