As with search algorithms, I found learning these a little tricky so I’ve put my notes and the code I generated in interactive lessons in one place for easy reference.

#### Bubble

Bubble sort compares adjacent values in a dataset and then swaps them if required, iterating across the values in full until no more swaps are necessary (the data is sorted). This can be time/resource intensive with large datasets. Bubble sorting can be faster when applied to data which is already mostly ordered (‘a quick tidy up’).

After each iteration of (ascending value) Bubble Sort, the highest number will always be at the end of the dataset. Therefore, the code can be optimised (below) by disregarding this final comparison in the next iteration, making the number of comparisons per iteration decrementally fewer.

def bubble_sort(arr): for i in range(len(arr)): for idx in range(len(arr) - i - 1): if arr[idx] > arr[idx + 1]: arr[idx], arr[idx + 1] = arr[idx + 1], arr[idx]

#### Merge

Merge sorting essentially breaks the data down into individual elements and then reconstructs as a sorted list. It does this by splitting the dataset in half, repeating the split until left with single elements (lists with only one entry!). It then compares pairs of elements, reconstructing them into sorted lists of two elements. Pairs of these two-element lists are then reconstructed into sorted lists of four elements, and so on until a sorted list of the original length is reconstructed.

The Merge code below features two functions, the sorter and the merger.

def merge_sort(items): if len(items) <= 1: return items middle_index = len(items) // 2 left_split = items[:middle_index] right_split = items[middle_index:] left_sorted = merge_sort(left_split) right_sorted = merge_sort(right_split) return merge(left_sorted, right_sorted) def merge(left, right): result = [] while (left and right): if left[0] < right[0]: result.append(left[0]) left.pop(0) else: result.append(right[0]) right.pop(0) if left: result += left if right: result += right return result

#### Quicksort

Similar to merge, but takes in a specified (or random) ‘pivot’ value to generate the ‘split point’ for each iteration. The algorithm must track the position of the pivot, and the point in the iterating dataset where all preceding values are lower than the pivot (for an ascending sort).

from random import randrange, shuffle def quicksort(list, start, end): if start >= end: return # select random element to be pivot pivot_idx = randrange(start, end + 1) pivot_element = list[pivot_idx] # swap random element with last element in sub-lists list[end], list[pivot_idx] = list[pivot_idx], list[end] # tracks all elements left (less than) pivot less_than_pointer = start for i in range(start, end): if list[i] < pivot_element: # swap element to the right-most portion of lesser elements list[i], list[less_than_pointer] = list[less_than_pointer], list[i] # move less than pointer less_than_pointer += 1 # move pivot element to the right-most portion of lesser elements list[end], list[less_than_pointer] = list[less_than_pointer], list[end] # recursively sort left and right sub-lists quicksort(list, start, less_than_pointer - 1) quicksort(list, less_than_pointer + 1, end)

#### Radix

Unlike the other searching algorithms, Radix does not perform any comparisons. It is a numeric sort, and operates by siloing data according to their least significant digit (or most, depending on sort). Having done that, the data can now be siloed again according to the next significant digit and so on until data has been siloed according to the most significant digit of the longest number in the data. At this point, the data will have been sorted!

def radix_sort(to_be_sorted): maximum_value = max(to_be_sorted) max_exponent = len(str(maximum_value)) #operate on a copy of the original list as sort will clear data! being_sorted = to_be_sorted[:] for exponent in range(max_exponent): position = exponent + 1 index = -position digits = [[] for i in range(10)] for number in being_sorted: number_as_a_string = str(number) try: digit = number_as_a_string[index] digit = int(digit) except IndexError: digit = 0 digits[digit].append(number) being_sorted = [] for numeral in digits: #appends every value of one list to another list being_sorted.extend(numeral) return being_sorted

#### Bubble vs Quicksort

The lesson project featured coding the sort algorithms to provide different sort orders. The dataset provided in the lesson demonstrated the noticeable difference in processing time for Bubble vs Quick sorts on a bulky dataset. There are three scripts.

**utils.py** – for importing the data

import csv def load_books(filename): bookshelf = [] with open(filename) as file: shelf = csv.DictReader(file) for book in shelf: book['author_lower'] = book['author'].lower() book['title_lower'] = book['title'].lower() bookshelf.append(book) return bookshelf

**sorts.py** – providing the core logic for the sort algorithms

import random def bubble_sort(arr, comparison_function): swaps = 0 sorted = False while not sorted: sorted = True for idx in range(len(arr) - 1): if comparison_function(arr[idx],arr[idx + 1]): sorted = False arr[idx], arr[idx + 1] = arr[idx + 1], arr[idx] swaps += 1 print("Bubble sort: There were {0} swaps".format(swaps)) return arr def quicksort(list, start, end, comparison_function): if start >= end: return pivot_idx = random.randrange(start, end + 1) pivot_element = list[pivot_idx] list[end], list[pivot_idx] = list[pivot_idx], list[end] less_than_pointer = start for i in range(start, end): if comparison_function(pivot_element, list[i]): list[i], list[less_than_pointer] = list[less_than_pointer], list[i] less_than_pointer += 1 list[end], list[less_than_pointer] = list[less_than_pointer], list[end] quicksort(list, start, less_than_pointer - 1, comparison_function) quicksort(list, less_than_pointer + 1, end, comparison_function)

**script.py** – where the comparator functions were defined

import utils import sorts bookshelf = utils.load_books('books_small.csv') long_bookshelf = utils.load_books('books_large.csv') def by_title_ascending(book_a, book_b): return book_a['title_lower'] > book_b['title_lower'] def by_author_ascending(book_a, book_b): return book_a['author_lower'] > book_b['author_lower'] def by_total_length(book_a, book_b): return (len(book_a['title']) + len(book_a['author'])) > (len(book_b['title']) + len(book_b['author']))