snake game with deep learning

Before moving forward, let’s first summarize what we have done till now. In the first blog, we created a snake game using pygame. Then in the next blog, using backpropagation, we let the neural network learn how to play snake game.

In this blog, we will let the genetic algorithm (GA) and neural network(NN) play the snake game (if you are new to genetic algorithm please refer to this blog).

But let’s first clear some doubts which are obvious to come in mind.

What is the advantage of using GA + NN?

We don’t need any training data.

Note: I am not saying this is better than backpropagation but for snake game problem creating a valid training data is a tough task so this is a good option other than reinforcement learning, which we will see later.

Where we will use GA in NN?

Instead of backpropagation, we will update weights using GA.

How we will use GA in NN?

This whole process can be easily summarized in 7 steps:

Creating a snake game and deciding neural network architecture.
Creating an initial population.
Deciding the fitness function.
Play a game for each individual in the population and sort each individual in the population based on the fitness function score.
Select a few top individuals from the population and create the remaining population from these top selected individuals using Crossover and mutation.
The new population is created (meaning the next generation).
Go to step 4 and repeat until the stopping criteria are not satisfied.

Now, I hope most of the things might be clear to you. Let’s see these steps in more detail

Creating a snake game and deciding neural network architecture

Snake game is created with pygame and network architecture is same as that of the previous blog with 7 units in the input layer, 3 units in the output layer with ‘softmax’ and used 2 hidden layers one of 9 units and other of 15 units with ‘relu’ as shown below.

Creating Initial Population

Here, I have chosen 50 individuals in the population and each individual is an array of weights of the neural network. Randomly initialize these individuals. See code below

# n_x   no. of input units
# n_h   no. of units in hidden layer 1
# n_h2  no. of units in hidden layer 2
# n_y   no. of output units

# The population will have sol_per_pop chromosome where each chromosome has num_weights genes.
sol_per_pop = 50
num_weights = n_x*n_h + n_h*n_h2 + n_h2*n_y

# Defining the population size.
pop_size = (sol_per_pop,num_weights)
#Creating the initial population.
new_population = np.random.choice(np.arange(-1,1,step=0.01),size=pop_size,replace=True)

# n_x no. of input units

# n_h no. of units in hidden layer 1

# n_h2 no. of units in hidden layer 2

# n_y no. of output units

# The population will have sol_per_pop chromosome where each chromosome has num_weights genes.

sol_per_pop = 50

num_weights = n_x*n_h + n_h*n_h2 + n_h2*n_y

# Defining the population size.

pop_size = (sol_per_pop,num_weights)

#Creating the initial population.

new_population = np.random.choice(np.arange(-1,1,step=0.01),size=pop_size,replace=True)

Deciding the Fitness Function

You can use any fitness function. Here, I have used the following fitness function

“On every grasp of food, I have given 5000 reward points and if it collides with the boundary or itself, I have awarded a penalty of 150 points“

You can find this inside the run_game_with_ML() code (Last line).

Evaluating the population (Step – 4)

For each individual, a game is played and the fitness function is calculated which is then appended in the list as shown in the code below

def cal_pop_fitness(pop):
    fitness = []
    for i in range(pop.shape[0]):
        fit = run_game_with_ML(display,clock,pop[i])
        print('fitness value of chromosome '+ str(i) +' :  ', fit)
        fitness.append(fit)
    return np.array(fitness)

fitness = cal_pop_fitness(new_population)

def cal_pop_fitness(pop):

fitness = []

for i in range(pop.shape[0]):

fit = run_game_with_ML(display,clock,pop[i])

print('fitness value of chromosome '+ str(i) +' : ', fit)

fitness.append(fit)

return np.array(fitness)

fitness = cal_pop_fitness(new_population)

Selection, Crossover, and Mutation (Step – 5)

Selection: Now, according to fitness value, some best individuals will be selected from the population and are stored in the ‘parents’ array as shown in the code below

def select_mating_pool(pop, fitness, num_parents):
    # Selecting the best individuals in the current generation as parents for producing the offspring of the next generation.
    parents = np.empty((num_parents, pop.shape[1]))
    for parent_num in range(num_parents):
        max_fitness_idx = np.where(fitness == np.max(fitness))
        max_fitness_idx = max_fitness_idx[0][0]
        parents[parent_num, :] = pop[max_fitness_idx, :]
        fitness[max_fitness_idx] = -99999999
    return parents

parents = select_mating_pool(new_population, fitness, num_parents_mating)

def select_mating_pool(pop, fitness, num_parents):

# Selecting the best individuals in the current generation as parents for producing the offspring of the next generation.

parents = np.empty((num_parents, pop.shape[1]))

for parent_num in range(num_parents):

max_fitness_idx = np.where(fitness == np.max(fitness))

max_fitness_idx = max_fitness_idx[0][0]

parents[parent_num, :] = pop[max_fitness_idx, :]

fitness[max_fitness_idx] = -99999999

return parents

parents = select_mating_pool(new_population, fitness, num_parents_mating)

Crossover: To produce children for the next generation, the crossover is used. First, two individuals are randomly selected from the best, then I randomly choose some values from first and some from the second individual to produce new offspring. This process is repeated until the total population size is not achieved as shown below

def crossover(parents, offspring_size):
    offspring = np.empty(offspring_size)

    for k in range(offspring_size[0]):

        while True:
            parent1_idx = random.randint(0, parents.shape[0] - 1)
            parent2_idx = random.randint(0, parents.shape[0] - 1)
            if parent1_idx != parent2_idx:
                for j in range(offspring_size[1]):
                    if random.uniform(0, 1) < 0.5:
                        offspring[k, j] = parents[parent1_idx, j]
                    else:
                        offspring[k, j] = parents[parent2_idx, j]
                break
    return offspring

# Generating next generation using crossover.
offspring_crossover = crossover(parents, offspring_size=(pop_size[0] - parents.shape[0], num_weights))

def crossover(parents, offspring_size):

offspring = np.empty(offspring_size)

for k in range(offspring_size[0]):

while True:

parent1_idx = random.randint(0, parents.shape[0] - 1)

parent2_idx = random.randint(0, parents.shape[0] - 1)

if parent1_idx != parent2_idx:

for j in range(offspring_size[1]):

if random.uniform(0, 1) < 0.5:

offspring[k, j] = parents[parent1_idx, j]

else:

offspring[k, j] = parents[parent2_idx, j]

break

return offspring

# Generating next generation using crossover.

offspring_crossover = crossover(parents, offspring_size=(pop_size[0] - parents.shape[0], num_weights))

Mutation: Then, some variations are being added to the newly formed offspring. Here, for each child, I randomly selected 25 weights and mutated them by adding some random value as shown in the code below

def mutation(offspring_crossover):

    for idx in range(offspring_crossover.shape[0]):
        for _ in range(25):
            i = randint(0,offspring_crossover.shape[1]-1)

        random_value = np.random.choice(np.arange(-1,1,step=0.001),size=(1),replace=False)
        offspring_crossover[idx, i] = offspring_crossover[idx, i] + random_value

    return offspring_crossover

offspring_mutation = mutation(offspring_crossover)

def mutation(offspring_crossover):

for idx in range(offspring_crossover.shape[0]):

for _ in range(25):

i = randint(0,offspring_crossover.shape[1]-1)

random_value = np.random.choice(np.arange(-1,1,step=0.001),size=(1),replace=False)

offspring_crossover[idx, i] = offspring_crossover[idx, i] + random_value

return offspring_crossover

offspring_mutation = mutation(offspring_crossover)

New Population Created

With the help of fitness function, crossover and mutation, new population for the next generation is created. Now, next thing is to replace the previous population with this newly formed. Below is the code for this

# Creating the new population based on the parents and offspring.
new_population[0:parents.shape[0], :] = parents
new_population[parents.shape[0]:, :] = offspring_mutation

# Creating the new population based on the parents and offspring.

new_population[0:parents.shape[0], :] = parents

new_population[parents.shape[0]:, :] = offspring_mutation

Now we will repeat this process until our target for certain game score is not achieved. In this problem, I have used 100 generations for training and 2500 steps in a game, which was able to achieve a maximum score of 40. You can choose more number of steps per game and can achieve more score.

The full code can be found here.

In the next blog, we will see how the genetic algorithm can be applied for neural network architecture search. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

TheAILearner

Mastering Artificial Intelligence

Tag Archives: snake game with deep learning

Snake Game with Genetic Algorithm