Genetic Algorithms

The genetic algorithm is a probabilistic search algorithm that iteratively transforms a set (called a population) of mathematical objects (typically fixed-length binary character strings), each with an associated fitness value, into a new population of offspring objects using the Darwinian principle of natural selection and using operations that are patterned after naturally occurring genetic operations, such as crossover (sexual recombination) and mutation.)

In this section, we will demonstrate some basic implementations of Simple Genetic Algorithms (SGA), Real-valued GA, as well as the use of GA to generate permutations. At the end, we’ll show how real-valued GA can be applied to a shortest-path search.

Example: Bike pricing using Simple Genetic Algorithm

Assume that a company wants to design a new bicycle, and needs to consider the following costs:

  • $700,000 for manufacturing setup, marketing, etc.

  • $110 for each bike produced

Based on similar bikes, the company expects sales to follow this demand curve:

\(Q= 70,000 - 200P\),
where \(P\) is the price, and \(Q\) is the number of units sold at that price

For example:

  • At $0, the company would give away 70,000 bikes for free

  • At $300, the company would sell \(70,000-200*300=10,000\) bikes

  • At $350, the company wouldn’t sell at anything at all.

Profit would be calculated as following:


Simple Genetic Algorithms represent genes as binary values, and mutates those genes through binary operations. Essentially, they replicate the behaviour of genotypes in genetics, where genotypes (actual DNA coding) produce phenotypes (genetic traits).

import random
import math
from tqdm.notebook import tqdm
from copy import copy
import matplotlib.pyplot as plt

Initial Population

# Generate an initial random population
def init_pop(pop_size, chromosome_length):
    ints = [random.randint(0,350) for i in range(pop_size)]
    strs = [bin(n)[2:].zfill(chromosome_length) for n in ints]
    bins = [[int(x) for x in n] for n in strs]
    return bins

Fitness Function

# For a given population, calculate the fitness of each element in the population
def fitness_score(population):
    fitness_values = []
    num = []
    for i in range(len(population)):
        num.append(int("".join(str(x) for x in population[i]), base=2))  # convert binary to decimal
        fitness_values.append(-200 * math.pow(num[i],2) + 92000 * num[i] - 8400000)
    tuples = zip(*sorted(zip(fitness_values,population),reverse=True))
    fitness_values, population = [list(t) for t in tuples]
    return fitness_values, population

This fitness function essentially determines how “good” a particular offspring is. It first converts each unit in the population to a binary number (the genotype), and then returns the “best” offspring.

Parent Selection, Crossover, and Mutation

# Random parent selection
def select_parent(population, num_parents):
    parents=random.sample(population, num_parents)
    return parents
# Apply 1-point crossover
def crossover(parents, crossover_prob): 
    chromosome_length = len(parents[0]) 
    if crossover_prob > random.random():
        cross_point = random.randint(0,chromosome_length)
        parents+= tuple([(parents[0][0:cross_point +1] +parents[1][cross_point+1:(chromosome_length+1)])])
        parents+= tuple([(parents[1][0:cross_point +1] +parents[0][cross_point+1:(chromosome_length+1)])])
    return parents

The crossover function here first determines if a crossover takes place (using the crossover_prob) and then selects a random crossover point in the chromosome. Using 1-point crossover, the bottom halves are switched the parents.

# Alter each gene independently with a probability mutation_prob
def mutation(population, mutation_prob) :
    chromosome_length = len(population[0])
    for i in range(len(population)-1) :
        for j in range(chromosome_length-1) :
            if mutation_prob > random.random():
                if population[i][j]==1:
    return population

Solution Function

def SGA(population, num_gen, num_parents, crossover_prob, mutation_prob, use_tqdm = False):
    states = []
    best_solution = []
    best_score = 0
    if use_tqdm: pbar = tqdm(total=num_gen)
    for _ in range(num_gen):
        if use_tqdm: pbar.update()
        # Get population fitness
        scores, population = fitness_score(population)
        current_best_score = scores[0]
        current_best_solution = population[0]
        if current_best_score > best_score: 
            best_score = current_best_score
            best_solution = int("".join(str(x) for x in copy(current_best_solution)), base=2)
        parents = select_parent(population, num_parents)
        parents = crossover(parents, crossover_prob)
        population = mutation(population,mutation_prob)
    return best_solution, best_score, states


# SGA Parameters
num_gen = 10000
pop_size = 5
crossover_prob = 0.7
mutation_prob = 0.3
num_parents = 2

# Solution representation
chromosome_length = 9
best_score = -100000 

# Initialize the "best solution"
population = init_pop(pop_size, chromosome_length)

Results and Visualization

best_solution, best_score, states = SGA(
    population, num_gen, num_parents, crossover_prob, mutation_prob, use_tqdm=True

print(f"Best Solution: {best_solution}")
print(f"Best Score: {best_score}")
Best Solution: 230
Best Score: 2180000.0

We can also easily verify that the solution is indeed 230, as the maximum of the profit function (solution to the derivative) is 230:

\(\frac{dP}{dProfit}=-400P + 92000\)
Set \(\frac{dP}{dProfit}=0\)
\(400P = 92000\)

Example: Bike pricing using Real-valued Genetic Algorithm

We can also run the above example using real-values instead. For this example, we’ll use a solver to handle the population generation, mutation, and child creation.

import numpy as np
from ypstruct import structure
import math
import matplotlib.pyplot as plt
from tqdm.notebook import tqdm
# run Genetic Algorithm
def run(problem, params):
    # Problem Information
    costfunc = problem.costfunc
    nvar = problem.nvar
    varmin = problem.varmin
    varmax = problem.varmax

    # Parameters
    maxit = params.maxit
    npop = params.npop
    beta = params.beta
    pc = params.pc
    nc = int(np.round(pc*npop/2)*2)
    gamma = params.gamma
    mu =
    sigma = params.sigma

    # Empty Individual Template
    empty_individual = structure()
    empty_individual.position = None
    empty_individual.cost = None

    # Best Solution Ever Found
    bestsol = empty_individual.deepcopy()
    bestsol.cost = np.inf

    # Initialize Population
    pop = empty_individual.repeat(npop)
    for i in range(npop):
        pop[i].position = np.random.uniform(varmin, varmax, nvar)
        pop[i].cost = costfunc(pop[i].position)
        if pop[i].cost < bestsol.cost:
            bestsol = pop[i].deepcopy()

    # Best Cost of Iterations
    bestcost = np.empty(maxit)
    # Main Loop
    for it in tqdm(range(maxit), total=maxit):

        costs = np.array([x.cost for x in pop])
        avg_cost = np.mean(costs)
        if avg_cost != 0:
            costs = costs/avg_cost
        probs = np.exp(-beta*costs)

        popc = []
        for _ in range(nc//2):

            # Select Parents
            #q = np.random.permutation(npop)
            #p1 = pop[q[0]]
            #p2 = pop[q[1]]

            # Perform Roulette Wheel Selection
            p1 = pop[roulette_wheel_selection(probs)]
            p2 = pop[roulette_wheel_selection(probs)]
            # Perform Crossover
            c1, c2 = crossover(p1, p2, gamma)

            # Perform Mutation
            c1 = mutate(c1, mu, sigma)
            c2 = mutate(c2, mu, sigma)

            # Apply Bounds
            apply_bound(c1, varmin, varmax)
            apply_bound(c2, varmin, varmax)

            # Evaluate First Offspring
            c1.cost = costfunc(c1.position)
            if c1.cost < bestsol.cost:
                bestsol = c1.deepcopy()

            # Evaluate Second Offspring
            c2.cost = costfunc(c2.position)
            if c2.cost < bestsol.cost:
                bestsol = c2.deepcopy()

            # Add Offsprings to popc

        # Merge, Sort and Select
        pop += popc
        pop = sorted(pop, key=lambda x: x.cost)
        pop = pop[0:npop]

        # Store Best Cost
        bestcost[it] = bestsol.cost

    # Output
    out = structure()
    out.pop = pop
    out.bestsol = bestsol
    print("Best solution: ", bestsol.position)
    out.bestcost = bestcost
    return out
# perform single-point crossover
def crossover(p1, p2, gamma=0.1):
    c1 = p1.deepcopy()
    c2 = p1.deepcopy()
    alpha = np.random.uniform(-gamma, 1+gamma, *c1.position.shape)
    c1.position = alpha*p1.position + (1-alpha)*p2.position
    c2.position = alpha*p2.position + (1-alpha)*p1.position
    return c1, c2
# apply mutation
def mutate(x, mu, sigma):
    y = x.deepcopy()
    flag = np.random.rand(*x.position.shape) <= mu
    ind = np.argwhere(flag)
    y.position[ind] += sigma*np.random.randn(*ind.shape)
    return y
# apply boundary constraints
def apply_bound(x, varmin, varmax):
    x.position = np.maximum(x.position, varmin)
    x.position = np.minimum(x.position, varmax)
# roulette wheel selection
def roulette_wheel_selection(p):
    c = np.cumsum(p)
    r = sum(p)*np.random.rand()
    ind = np.argwhere(r <= c)
    return ind[0][0]
# Fitness function
def bike_pricing(X):
    return 200*math.pow(X,2) - 92000*X + 8400000
# Problem Definition
problem = structure()
problem.costfunc = bike_pricing
problem.nvar = 1
problem.varmin = [50]
problem.varmax = [350]

# Solver GA Parameters
params = structure()
params.maxit = 6000
params.npop = 5
params.beta = 1
params.pc = 1
params.gamma = 0.1 = 0.01
params.sigma = 0.1
# Run GA
out = run(problem, params)
Best solution:  [212.35900406]
# Results
# plt.semilogy(out.bestcost)
plt.xlim(0, params.maxit)
plt.ylabel('Best Cost')
plt.title('Genetic Algorithm (GA)')

Example: Permutation Genetic Algorithm for TSP

As you know by now, the Travelling Salesman Problem is essentially about generating the least costly permutation of all vertices in a graph. Assuming the graph is fully connected (every permutation is feasible), we can consider the following set of operators:

Crossover operators

  1. Adjacency-based operators

    • Partially mapped crossover

    • Edge crossover

  2. Order-based operators

    • Order 1 crossover

    • Cycle crossover

Mutation Operators

  1. Insert mutation

  2. Swap mutation

  3. Inversion mutation

  4. Scramble mutation

import networkx as nx
import matplotlib.pyplot as plt
import random
from tqdm.notebook import tqdm
from smart_mobility_utilities.common import random_tour, cost_tour

Let’s generate a random connected graph, so that any permutation of vertices is feasible.

G = nx.complete_graph(60)
for (u, v) in G.edges():
    G.edges[u,v]['weight'] = random.randint(0,40)

nx.draw(G, with_labels=True)

Let’s assume we can generate two random tours of the graph, and we now want to generate another permutation from these two parents. We can’t simply cut and paste or swap randomly, as that will likely generate a set that is not an actual and admissible permutation. For this reason, we need to consider the following crossover types that guarantee an acceptable permutation:

Partially-mapped crossover

This algorithm is lightweight and disrupts a lot of the genes, which is a desirable effect for GA.

def PMX_crossover(firstPermutation, secondPermutation):
    # we need to know the length of either permutation
    # they must be equal in size
    length = len(firstPermutation)
    # (1) choosing the two crossover points
    #     by randomly select a point from the
    #     first half and another point from the second half
    first_Cross = random.randint(0, length // 2)
    second_Cross = random.randint(length // 2 + 1, length - 1) + 1
    # (2) initializing the two equal sized segments
    #     and create another array with the same size of 
    #     any permutation to be the child
    child = [None] * length
    subP1 = firstPermutation[first_Cross:second_Cross]
    subP2 = secondPermutation[first_Cross:second_Cross]
    # (3) copy the elements in the segment from the first permutation
    #    into the same segment in the child
    child[first_Cross:second_Cross] = subP1
    # (4) finding common elements in the segment from the
    #     the first permutation and the second permutation
    #     and get its mirror from first permutation to second
    pairs = list()
    for element in subP2:
        if element not in subP1:
            pairs.append((element, subP1[subP2.index(element)]))
    # (5) copying into the child all the elements in the segment
    #.    that are present in the first permutation segment but
    #.    aren't present in the second permutation segment.
    #.    if not we need to copy that element in place outside
    #     the segment in a place where we are sure that would
    #.    result into inadmissible permutations.
    for pair in pairs:
        second = pair[1]
        if second not in subP2:
            index = secondPermutation.index(second)
            child[index] = pair[0]
            # when there is an element from the segment of the first
            # permutation in the segment of the second permutation 
            reflect = firstPermutation[secondPermutation.index(second)]
            # bouncing back and forth between the two arrays indices
            # to get out of second permutation segment
            while reflect in subP2:
                bounce = reflect
                reflect = firstPermutation[secondPermutation.index(bounce)]
            child[secondPermutation.index(reflect)] = pair[0]
    # (6) go through all the elements that have not been assigned
    #     yet in the child array and assign them with the second permutation
    #     elements
    for i in range(length):
        if child[i] == None:
            child[i] = secondPermutation[i]
    return child

Edge recombination crossover

The offspring of this crossover depends on the edges and connections between nodes in each permutation.

def ERO_crossover(G, firstPermutation, secondPermutation):
    # constructing edge table
    edgeTable = dict()
    elements = firstPermutation[:]
    length = len(elements)
    # just like adjacency list of nodes
    # in a given graph, but it is actually
    # the result of union between the two given
    # adjacency lists of a certain parent from both graphs
    for source in elements:
        edgeTable[source] = list()
        firstPermutationAdj = G[firstPermutation.index(source)]
        secondPermutationAdj = G[secondPermutation.index(source)]
        adjList = list(set().union(firstPermutationAdj, secondPermutationAdj))
        edgeTable[source]= adjList

    child = list()
    parent = random.choice(elements)
    # terminate when the length of the child is the same
    # as the length of their parent
    while len(child) < length:

        # remove the parent from all the adjacency lists
        for adjList in edgeTable.values():

        parentAdjList = edgeTable[parent][:]
        del edgeTable[parent]

        if len(parentAdjList) == 0: continue
        parent = min(parentAdjList, key = lambda parent : len(edgeTable[parent]))

    return child

Order 1 crossover

This is much simpler crossover technique. It involves copying a random segement of nodes from the first parent, and completes the child with any nodes in the second parent not already present in the child.

def ordOne_crossover(firstPermutation, secondPermutation):
    length = len(firstPermutation)
    # choose the start and the end of the segment 
    # to be copied from the first parent
    start_Segment = random.randint(0, length // 2)
    end_Segment = random.randint(length // 2 + 1, length - 1) + 1
    # create a child
    child = list()
    # add the randomaly selected segment from the first parent
    child.extend(firstPermutation[start_Segment: end_Segment])
    # add what is left from the second parent that wasn't added from the first parent
    residueFromSegment = list(set(secondPermutation) - set(firstPermutation[start_Segment: end_Segment]))
    return child

Insert Mutation

This mutation selects two random genes and moves the second to follow the first directly. This ensures a valid permutation.

def insert_mutation(permutation):
    # copying the list so we don't mess with the original
    child = permutation[:]
    # choose two random genes and make sure that they are different
    first_gene = random.choice(child)
    second_gene = random.choice(child)
    while first_gene == second_gene:
        first_gene = random.choice(child)
        second_gene = random.choice(child)
    # removing the second gene from the list and insert it just after the first
    geneNewIndex = child.index(first_gene) + 1
    child.insert(geneNewIndex, second_gene)
    return child

Swap Mutation

This mutation simply exchanges the positions of two nodes in a permutation.

def swap_mutation(permutation):
    # copying the list so we don't mess with the original
    child = permutation[:]

    length = range(len(child))

    # choose two random gene position so they could be swaped
    first_gene_pos, second_gene_pos = random.sample(length, 2)

    # swapping 
    child[first_gene_pos], child[second_gene_pos] =\
    child[second_gene_pos], child[first_gene_pos] 

    return child


For this example, we’ll use Partially-mapped crossovers and Insert mutations, but others could be substituted as well.

# remember that ERO takes the graph as input
# not like PMX or Order 1
crossover = PMX_crossover
mutate = insert_mutation

size_of_population = 100
ngen = 500
best_at_gen = [] # for keeping track of the best tour at a given generation

# Initialize population
pool = [*random_tour(G.nodes, number_of_perms=size_of_population)]
for generation in tqdm(range(ngen)):
    # 1- crossover every consecutive pair of routes
    # 2- replace the weakest of the two parent with the product of the crossover
    # 3- mutate the whole pool
    # 4- repeat and save the value best tour in that generation
    # 1-
    for parent1, parent2 in zip(pool, pool[1:]):
        child = crossover(parent1, parent2)
        # 2-
        if cost_tour(G, parent1) > cost_tour(G, parent2):
            pool[pool.index(parent2)] = child
            pool[pool.index(parent1)] = child
    # 3-
    for i in range(size_of_population):
        pool[i] = mutate(pool[i])
    # 4-
    best_at_gen.append(cost_tour(G, min(pool, key = lambda tour : cost_tour(G, tour))))
tour = min(pool, key = lambda route : cost_tour(G, route)) # result
tour_cost = cost_tour(G, tour) #result cost

plt.xlabel("number of generation")
plt.ylabel("cost of the shortest tour (meters)")

As you may notice, there is not much convergence in the costs of the tours in this GA implementation. This is largely because the solutions of the problem are very close to each other (there is only a limited range of edge weights).

Example: Permutation Genetic Algorithm for Routing Problem

For the examples in this section, we will move past our University of Toronto search space in the previous section in favour of a more complex environment. This is the kind of network which really benefits from Genetic Search Algorithms.

Let’s consider the road network in Vaughan, a municipality to the north of Toronto:

import osmnx

G = osmnx.graph_from_address('vaughan', dist=1400)

origin = 29658954
destination = 701446851

highlighted = [origin, destination]

nc = ['r' if node in highlighted else '#336699' for node in G.nodes()]
ns = [50 if node in highlighted else 8 for node in G.nodes()]
fig, ax = osmnx.plot_graph(G, node_size=ns, node_color=nc, node_zorder=2)

For this example, we’ll be attempting to search for the shortest route between two points highlighted in red.

The Algorithm

There are many ways to design a genetic algorithm; the below is only one of them.

  1. We will need a fitness function to determine which offspring to keep in each generation. This will be the length of the route generated (we want to minimize this).

  2. As we will be doing a real-value (non-binary) GA, out phenotype is the same as our genotype.

  3. Our GA implementation is “kind of” steady state: we probabilistically choose the best routes to be the parents of the next generation. More on this later.

  4. We will be doing 1-point crossover.

  5. Route mutations take place by deleting a random number of genes/nodes and trying to stitch that gap. Keep in mind this is time-intensive.

  6. Selection is based on fitness-proportionate criteria.

The actual algorithm (in pseudo-code) looks like this:

GENETIC_ALGORITHM(source,destination, num_of_generations. pool_size) return a route
poolrandom-routes between source and destination

for num_of_generations do
parents_first_genselect number of best routes from pool
parents_second_gencrossover parents_first_gen
parentsmutate parents_second_gen
pooloffspring of parents
remove duplicates from pool and add random routes to compensate any removal
route ← best route in pool
return route


Our algorithm “mutates” a child (route) by deleting a node between two nodes in the child, and attempting to fill in the gap with alternate nodes. If you recall the way we generated “children” for Beam Search and Hill climbing in previous sections, this uses the same idea.

from smart_mobility_utilities.children import shortest_path_with_failed_nodes_single
import random
import math

def mutate(G, route):
    source = route[0]
    destination = route[-1]

    failed = random.choice(route)

    path = shortest_path_with_failed_nodes_single(G, route, [failed])

    # This method could fail because of a lot of factors relating to the graph structure
    # Check the documentation fo the shortest_path_with_failed_nodes to learn more
    while path == math.inf:
        failed = random.choice(route)
        path = shortest_path_with_failed_nodes_single(G, route, [failed])
    return path


Rather than selecting a random point to “crossover”, we look for a common node between the two routes.

import itertools
from smart_mobility_utilities.common import probability

def cross_over(route_1, route_2):
    origin = route_1[0]
    destination = route_1[-1]

    intersection = [*itertools.filterfalse(\
                    lambda element : element in [origin, destination] ,\
                    list(set(route_1) & set(route_2)))]
    if len(intersection) == 0: return route_1 # if there is not common node, just return the first route

    cross_over_point = random.choice(intersection)
    first_point = route_1.index(cross_over_point)
    second_point = route_2.index(cross_over_point)

    if probability(0.50):
        return route_1[:first_point] + route_2[second_point:]
        return route_2[:second_point] + route_1[first_point:]


Our offspring will likely result in many duplicates, as most routes will closely ressemble each other. To account for this, we remove any duplicates and replace them with new randomly generated routes. This also ensures population diversity.

The code

from smart_mobility_utilities.common import randomized_search

# Configure some parameters
n_gen = 100 # Anything above 20 will take some time but yield much better results
pool_size = 12 # Number of routes in each generation
parents_num = 4 # needs to be a factor of pool size

# Initialize the pool
pool = [randomized_search(G,origin,destination) for _ in range(pool_size)]

# Plot the pool on a map
random_hexa = lambda: random.randint(0,255) # generate random hexadecimal color
rc = ['#%02X%02X%02X' % (random_hexa(),random_hexa(),random_hexa()) for _ in range(pool_size)]
fig, ax = osmnx.plot_graph_routes(G, pool, route_colors=rc, route_linewidth=6, node_size=0)

The following code will take quite a bit of time to run.

from tqdm.notebook import tqdm
from smart_mobility_utilities.common import cost, flatten
from smart_mobility_utilities.children import get_children
import heapq

states = []

def select_best(pool, num_of_choices, probability_dist):
        return random.choices(population=pool, weights=probability_dist, k= num_of_choices)

for gen in tqdm(range(n_gen)):
    weights = [cost(G, route) for route in pool]
    parents_1 = select_best(pool, parents_num, weights)
    parents_2 = [cross_over(route_1, route_2) for route_1, route_2 in itertools.combinations(parents_1, r = 2)]
    pool.extend([mutate(G, route) for route in parents_2])
    pool = [*map(list, list(set(map(tuple, pool))))]
    num_removed = pool_size - len(pool) + 1
    pool.extend([randomized_search(G, origin, destination) for _ in range(num_removed)])
    pool = heapq.nsmallest(pool_size,pool,key=lambda x: cost(G,x))
    m = cost(G, min(pool, key = lambda route : cost(G, route)))

# Retrieve the final best route
route = min(pool, key=lambda route: cost(G,route))
from smart_mobility_utilities.viz import draw_route
import matplotlib.pyplot as plt
print("Cost of the route:",cost(G,route))
ax = plt.plot(states)
Cost of the route: 2765.76
draw_route(G,route, zoom=13)