This notebook contains a simple implementation of a genetric algorithm (GA) to solve the Travelling Salesman Problem.
GAs belong to the family of evolutionary metaheuristics, which are based on the "survival of the fittest". Each solution is represented by a chromosome, which consists of a sequence of genes that represent a solution to the problem.
GAs operate upon a population of solutions, which are manipulated over several iterations, which are called generations.
Better solutions are progressively identified, as the algorithm pairs parent chromosomes to produce offspring, or applies random mutations on previously generated chromosomes.
There are 5 main steps to the GA process after we initialise the parameters:
When the population has converged (where the offspring are not significantly different to the parents), the algorithm can be terminated.
We will be using Distributed Evolutionary Algorithms in Python (DEAP) library.
!pip install deap
Requirement already satisfied: deap in c:\users\hc6118\anaconda3\envs\tsenv\lib\site-packages (1.3.1) Requirement already satisfied: numpy in c:\users\hc6118\anaconda3\envs\tsenv\lib\site-packages (from deap) (1.19.2)
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
from sklearn.datasets import make_blobs
%matplotlib inline
We will first create a map of the 15 homes with two centres using make_blobs. The function of make_blob from sklearn (or scikit-learn) allows to create spatially normally distributed points with given number of centres.
For this example, we will set two clusters and the standard deviation of the cluster to be 20 based on the random generator seed of 2. By setting a value for the random_state parameter, the results can be reproduced for future comparison of the results.
plot_size = 15
plot_width = 16
plot_height = 8
params = {'legend.fontsize': 'large',
'figure.figsize': (plot_width,plot_height),
'axes.labelsize': plot_size,
'axes.titlesize': plot_size,
'xtick.labelsize': plot_size*0.75,
'ytick.labelsize': plot_size*0.75,
'axes.titlepad': 25}
plt.rcParams.update(params)
plt.rcParams.update(params)
num_homes = 15
center_box = (100, 200)
homes_coord, _ = make_blobs(n_samples=num_homes,
centers=2,
cluster_std=20,
center_box=center_box,
random_state = 2)
homes_names = [i for i in range(num_homes)]
homes_coord_dict = {name: coord for name,coord in zip(homes_names, homes_coord)}
plt.scatter(homes_coord[:, 0], homes_coord[:, 1], s=plot_size*2, cmap='viridis');
We will then calculate the distances between the homes within the created map. The Euclidean distance gives us the length of the line segment between two points. These distances will form as an input to our genetic algorithm.
from scipy.spatial import distance
dist_matrix = distance.cdist(homes_coord, homes_coord, 'euclidean')
We now have:
Using the DEAP library, we will solve our travelling salesman problem using genetic algorithm.
from deap import base, creator, tools
We will create the necessary functions to solve our problem.
Firstly, we will need to create our population of chromosomes. The function to achieve this, is the following:
import copy
np.random.seed(3) # we are setting the same random seed to be able to reproduce the answers.
def chromo_create(_homes_names):
chromo = copy.deepcopy(_homes_names)
np.random.shuffle(chromo)
return chromo
We also need a generic function that will evaluate the overall distance the travelling salesperson will be covering based on the current route to be able to compare between each iteration.
def chromo_eval(_dist_matrix, _chromo):
dist = 0
for p in range(len(_chromo) - 1):
_i = _chromo[p]
_j = _chromo[p+1]
dist += _dist_matrix[_i][_j]
dist += dist_matrix[_chromo[-1], _chromo[0]]
return dist,
The second step of GA is to evaluate the fitness of all chromosomes in the population. We can use the embedded functions from the DEAP library.
tb = base.Toolbox()
creator.create('Fitness_Func', base.Fitness, weights=(-1.0,))
creator.create('Individual', list, fitness=creator.Fitness_Func)
Now that we have set up the necessary functions, we will start with our solver.
Before we start, we need to initialise the parameters for the genetic algorithm to function. We are specifying the solution population size, number of generations, the crossover probability, and the mutation probability.
num_population = 200
num_generations = 1000
prob_crossover = .4
prob_mutation = .6
Step 1: We will generate our population. Using the function chromo_create we wrote above, we will create the population based on the home IDs we created in the beginning. We then will need to register all this information to our Toolbox.
tb.register('indexes', chromo_create, homes_names)
tb.register('individual', tools.initIterate, creator.Individual, tb.indexes)
tb.register('population', tools.initRepeat, list, tb.individual)
tb.register('evaluate', chromo_eval, dist_matrix)
tb.register('select', tools.selTournament)
tb.register('mate', tools.cxPartialyMatched)
tb.register('mutate', tools.mutShuffleIndexes)
population = tb.population(n=num_population)
Step 2: We will calculate the fitness of all chromosomes in the population.
fitness_set = list(tb.map(tb.evaluate, population))
for ind, fit in zip(population, fitness_set):
ind.fitness.values = fit
We create a for-loop to iterate for the number of generations we set in our initialisation.
best_fit_list = []
best_sol_list = []
best_fit = np.Inf
for gen in range(0, num_generations):
if (gen % 50 == 0):
print(f'Generation: {gen:4} | Fitness: {best_fit:.2f}' ) # print the generation and their fitness level
offspring = tb.select(population, len(population), tournsize=3)
offspring = list(map(tb.clone, offspring))
for child1, child2 in zip(offspring[0::2], offspring[1::2]):
if np.random.random() < prob_crossover:
tb.mate(child1, child2)
del child1.fitness.values
del child2.fitness.values
for chromo in offspring:
if np.random.random() < prob_mutation:
tb.mutate(chromo, indpb=0.01)
del chromo.fitness.values
invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
fitness_set = map(tb.evaluate, invalid_ind)
for ind, fit in zip(invalid_ind, fitness_set):
ind.fitness.values = fit
population[:] = offspring
curr_best_sol = tools.selBest(population, 1)[0]
curr_best_fit = curr_best_sol.fitness.values[0]
if curr_best_fit < best_fit:
best_sol = curr_best_sol
best_fit = curr_best_fit
best_fit_list.append(best_fit)
best_sol_list.append(best_sol)
Generation: 0 | Fitness: inf Generation: 50 | Fitness: 327.09 Generation: 100 | Fitness: 327.09 Generation: 150 | Fitness: 327.09 Generation: 200 | Fitness: 327.09 Generation: 250 | Fitness: 327.09 Generation: 300 | Fitness: 327.09 Generation: 350 | Fitness: 327.09 Generation: 400 | Fitness: 327.09 Generation: 450 | Fitness: 327.09 Generation: 500 | Fitness: 327.09 Generation: 550 | Fitness: 327.09 Generation: 600 | Fitness: 327.09 Generation: 650 | Fitness: 327.09 Generation: 700 | Fitness: 327.09 Generation: 750 | Fitness: 327.09 Generation: 800 | Fitness: 327.09 Generation: 850 | Fitness: 327.09 Generation: 900 | Fitness: 327.09 Generation: 950 | Fitness: 327.09
We will plot the fitness level for each generation to see if the population has converged after the limited number of generations we have set.
plt.plot(best_fit_list)
plt.show()
We can see the population has converged!
So we will now print the best solution for the tour route ofor Santa Claus to cover all 15 homes the quickest and return to his origin.
print(best_sol)
[1, 14, 2, 6, 9, 8, 13, 11, 12, 0, 3, 4, 7, 5, 10]
We will create a plot to see what such route would look like.
final_sol = best_sol + best_sol[0:1]
plt.scatter(homes_coord[:, 0],
homes_coord[:, 1],
s=plot_size*2,
cmap='viridis',
zorder = 10000);
for i, txt in enumerate(homes_names):
plt.annotate(txt, (homes_coord[i, 0]+1, homes_coord[i, 1]+1))
lines = []
for p in range(len(final_sol) - 1):
i = final_sol[p]
j = final_sol[p+1]
colour = 'black'
plt.arrow(homes_coord[i][0],
homes_coord[i][1],
homes_coord[j][0] - homes_coord[i][0],
homes_coord[j][1] - homes_coord[i][1],
color=colour)