CMSC 351
Algorithms
Alex Reustle
Dr. Clyde Kruskal Spring 2016 University of Maryland
http://www.cs.umd.edu/class/spring2016/cmsc351-0101
Last Revision: May 9, 2016
Table of Contents

1 Maximum Subarray Problem
  Brute Force solution
  Analysis of the Algorithm
  Methods to improve integer list summation algorithm
2 Timing analysis
  Classes of algorithms
    Θ(n²)
    Θ(n log n)
    Others
3 Bubble Sort Analysis
  Average Case Analysis
4 Insertion Sort
  Analysing Number of Comparisons
    Best Case Comparisons
    Worst Case Comparisons
    Average Case Comparisons
  Analyzing Exchanges
    Best Case Exchanges
    Worst Case Exchanges
    Average Case Exchanges
5 Insertion Sort without Sentinel
  Best Case Comparisons
  Worst Case Comparisons
  Average Case Comparisons
  Exchanges
6 Selection Sort
  Best Case Comparisons
  Exchanges
7 Merge Sort
  Merge Comparison analysis
    Equal length arrays
    Different length arrays
  Mergesort Comparison Analysis
  Merge Sort Recurrence
8 Integer arithmetic
  Addition
  Problem size
  Multiplication
  Better Multiplication
9 Recursion Master Formula
  Applying to known algorithms
    Multiplication
    Better Multiplication
    Mergesort
    Implementation details
10 Heapsort
  Create Heap
    Heap Creation Analysis
  Finish
  Implementation Details
  Optimization
  Pseudocode
11 Finding Bounds to summation functions
  Mathematical Preliminaries
    Geometric Series
    Gauss's Sum
    Harmonic Sum
  Using continuous math to solve discrete math
    Gauss's Sum
    Harmonic Sum
12 Order Notation
13 Quicksort
  Comparison Analysis
    Quicksort Worst case comparisons
    Quicksort Best case comparisons
  Proof by Constructive Induction
    Quicksort Worst Case from Recurrence
    Quicksort Best Case from Recurrence
    Average case analysis of quicksort from approximate recurrence
    Average case analysis of quicksort from exact recurrence
    New Book method
  Is Quicksort In place?
  Randomized pivot selection
  Median-of-3 Pivot
  Small Size Optimization
14 Algorithm Strengths and Weaknesses
  Memory Hierarchy
15 Linear Sorting Algorithms
  Re-evaluating Sort
    Find biggest element in array
    Find second largest element in array
    Find k largest elements in array
    Natural approach: Hold a Tournament
    Finding Best and Worst
16 Counting Sort
17 Radix Sort
  Analysis of running time
18 Bucket Sort
19 Selection
  Analyzing the recurrence
    Lower Bound
  Constrained case analysis
  Average case Analysis
20 Finding Median explicitly during selection
  Worst Case analysis
  Theory
21 Graph Algorithms
  Memory Usage
  Edge Lookup
  List all Edges
  Take aways
  Changing the number of nodes
    Adding Nodes
    Removing Nodes
  Depth First Search
    Applications
  Breadth First Search
  Identifying Connected Components
  Bridges of Köningsberg
22 Shortest Path Algorithms
  Dijkstra's Algorithm
    Correctness Proof
    Implementation details
23 NP-Completeness
  Decision Problem Theory
  The Class of NP problems
  Reduction
  Traveling Salesman Problem
    Theory
  Problems between P and NP-Complete
  Exam Materials
    Formula SAT
    Graph Coloring Problem
1 Maximum Subarray Problem
Given a sequence of integers (some of them negative), find the contiguous run of adjacent values whose sum is largest. For example, in

3  −5  4  3  −6  4  8  −5  3  −7  2  1

the best run is 4, 3, −6, 4, 8, whose sum is 13.
Brute Force solution
Brute force solution: try every possible contiguous sum. (A first guess at the cost is Θ(n³).) Keep a running maximum M, initialized to zero. We allow empty sums (no elements, sum = 0).
M ← 0
for i = 1 to n do
    for j = i to n do
        S ← 0
        for k = i to j do
            S ← S + A[k]
            M ← max(M, S)
        end
    end
end
Algorithm 1: Maximum Subarray Brute Force
Insight: loops in programs are like summations in mathematics. "for i = 1 to n" corresponds to $\sum_{i=1}^{n}$, and nested loops are nested summations. Counting executions of the innermost statement gives
$$\sum_{i=1}^{n} \sum_{j=i}^{n} \sum_{k=i}^{j} 1.$$
Simplify the summation using summation algebra, from the inside out:
$$\sum_{i=1}^{n} \sum_{j=i}^{n} (j - i + 1).$$
Tangent: since $j$ is the summation variable, $i$ and $1$ are constants, so the inner sum can be separated into two sums.
Analysis of the Algorithm
Actual progress: change a variable to manage the sum, just like substitution for integrals in calculus. Every technique in calculus for integration is matched by a similar technique for summations. Change of variable, by example:
$$\sum_{i=1}^{n} \sum_{j=i}^{n} \sum_{k=i}^{j} 1
= \sum_{i=1}^{n} \sum_{j=i}^{n} (j - i + 1) \qquad\text{adding 1, $j-i+1$ times (fencepost)}$$
$$= \sum_{i=1}^{n} \sum_{j=1}^{n-i+1} j \qquad\text{change of variable}$$
$$= \sum_{i=1}^{n} \frac{(n-i+1)(n-i+2)}{2} \qquad\text{Gauss's sum}$$
$$= \frac{1}{2}\sum_{i=1}^{n} (n-i+1)(n-i+2) \qquad\text{expanding is error prone; change variable instead}$$
$$= \frac{1}{2}\sum_{n-i+1=1}^{n} i(i+1) \qquad\text{substitute } i \to n-i+1$$
$$= \frac{1}{2}\sum_{i=1}^{n} i(i+1) \qquad\text{simplify bounds}$$
$$= \frac{1}{2}\cdot\frac{n(n+1)(n+2)}{3} \qquad\text{magic}$$
$$= \frac{n(n+1)(n+2)}{6}$$
This first method is Θ(n³).
Methods to improve integer list summation algorithm.
Remember previous sums, do not recalculate known data.
M ← 0
for i = 1 to n do
    S ← 0
    for j = i to n do
        S ← S + A[j]
        M ← max(M, S)
    end
end
Algorithm 2: Θ(n²) Subarray
$$\sum_{i=1}^{n}\sum_{j=i}^{n} 1 = \sum_{i=1}^{n}(n - i + 1) = \sum_{i=1}^{n} i = \frac{n(n+1)}{2}$$
This new method is Θ(n²).
Loosely speaking, the maximum sum ending at the current position is the previous such maximum plus the current value, reset to zero whenever it would go negative.
M ← 0, S ← 0
for i = 1 to n do
    S ← max(S + A[i], 0)
    M ← max(M, S)
end
Algorithm 3: Linear Subarray
Correctness follows by induction on the loop invariant: after processing A[i], S is the largest sum of a (possibly empty) subarray ending at position i, and M is the largest sum seen anywhere so far; S = max(S + A[i], 0) maintains this invariant.
Counting the loop body: $\sum_{i=1}^{n} 1 = n$, so this method is Θ(n).
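As a concrete illustration, here is a minimal Python sketch of the linear-time scan above; the function name and the test array are my own, not from the notes.

```python
def max_subarray_sum(A):
    """Largest sum of a contiguous (possibly empty) subarray of A."""
    M = 0  # best sum seen so far (empty sum allowed, so never below 0)
    S = 0  # best sum of a subarray ending at the current position
    for x in A:
        S = max(S + x, 0)  # extend the run, or restart if it went negative
        M = max(M, S)
    return M

print(max_subarray_sum([3, -5, 4, 3, -6, 4, 8, -5]))  # prints 13
```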
2 Timing analysis
Lower Bound
No algorithm can run faster than this boundary; it limits the best conceivable behavior and is generally very hard to prove in the general case.
Upper Bound
Some concrete algorithm actually attains this running time; it comes from a worst-case analysis of that algorithm.
Classes of algorithms
Θ(n²): Bubble Sort, Insertion Sort, Selection Sort
Θ(n log n): Merge Sort, Heap Sort, Quicksort
Others: Radix Sort
3 Bubble Sort Analysis
1 for i = n downto 2 do
2 for j = 1 to i-1 do
3 if A[j] > A[j + 1] then
4 swap (A[j],A[j+1]);
5 end
6 end
7 end
Algorithm 4: Bubble Sort
The dominant operations in bubble sort are comparisons and exchanges. The number of comparisons does not depend on the input, since the test A[j] > A[j+1] is executed unconditionally for every pair the loops visit:
$$\sum_{i=2}^{n}\sum_{j=1}^{i-1} 1 = \sum_{i=2}^{n}(i-1) = \sum_{i=1}^{n-1}\bigl[(i+1)-1\bigr] = \sum_{i=1}^{n-1} i = \frac{(n-1)\,n}{2}$$
Comparisons will always occur, regardless of state of list.
Best Case Exchanges occur when the list is sorted. Zero exchanges.
Worst Case Exchanges occur when the list is reverse sorted. Same number of exchanges as comparisons.
Definition 3.1 (Random). Each permutation (ordering) of the list is equally likely.
The set of rotations of a single permutation places every element in every position with equal probability, but it is not random in this sense, because it ignores all the other possible permutations.
Average Case Analysis
Count transpositions: pairs of elements that are out of order relative to each other.
In the best case there are no transpositions.
In the worst case (reverse sorted) every element is out of order with every element that follows it, so the number of transpositions is the number of distinct pairs, n choose 2.
For a random permutation each pair is equally likely to be in order or out of order, so the expected number of transpositions, and hence of exchanges, is half the worst-case count:
$$\frac{n(n-1)}{4}.$$
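A quick way to sanity-check the n(n−1)/4 figure is to count inversions of random permutations; this small Python experiment (my own, not from the notes) does exactly that.

```python
import random

def inversions(A):
    """Count pairs (i, j) with i < j and A[i] > A[j], i.e. transpositions."""
    return sum(1 for i in range(len(A))
                 for j in range(i + 1, len(A)) if A[i] > A[j])

n, trials = 20, 2000
avg = sum(inversions(random.sample(range(n), n)) for _ in range(trials)) / trials
print(avg, n * (n - 1) / 4)   # the two numbers should be close (expected value 95)
```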
4 Insertion Sort
At the start of iteration i, the prefix A[1..i−1] is sorted. The ith value is stored in tmp; values in the sorted prefix that are larger than tmp are shifted one position to the right, one at a time, until an element with A[j] ≤ tmp is found.
A[0] ← −∞
for i = 2 to n do
    t ← A[i]
    j ← i − 1
    while A[j] > t do
        A[j + 1] ← A[j]
        j ← j − 1
    end
    A[j + 1] ← t
end
Algorithm 5: Insertion Sort
Analysing Number of Comparisons
Best Case Comparisons
In the case where the array is already sorted, there is only one comparison for each iteration of the outer for loop: the while test immediately evaluates false. So the best-case count is just the number of for-loop iterations:
$$\sum_{i=2}^{n} 1 = (n - 2) + 1 = n - 1$$
Worst Case Comparisons
In the worst case the array is reverse sorted, and every element must be moved. The while loop always decrements j
to zero to compare against the sentinel value.
$$\sum_{i=2}^{n} i = \left(\sum_{i=1}^{n} i\right) - 1 = \frac{n(n+1)}{2} - 1 = \frac{(n+2)(n-1)}{2}$$
Average Case Comparisons
In the average case we need the probability that a given element moves a given distance, and then the expected value $\sum_{x\in X} P(x)\,V(x)$. Element A[i] is equally likely to belong in any of the i positions of the prefix A[1..i], so each destination j has probability 1/i; if it belongs at position j, the inner loop performs i − j + 1 comparisons.
$$\sum_{i=2}^{n}\sum_{j=1}^{i} \frac{1}{i}\,(i - j + 1)
= \sum_{i=2}^{n}\frac{1}{i}\sum_{j=1}^{i}(i - j + 1) \qquad\text{pull out } 1/i$$
$$= \sum_{i=2}^{n}\frac{1}{i}\sum_{j=1}^{i} j
= \sum_{i=2}^{n}\frac{1}{i}\cdot\frac{i(i+1)}{2}
= \sum_{i=2}^{n}\frac{i+1}{2}
= \frac{1}{2}\left[\sum_{i=2}^{n} i + \sum_{i=2}^{n} 1\right]$$
$$= \frac{1}{2}\left[\frac{n(n+1)}{2} - 1 + (n-1)\right]
= \frac{1}{2}\left[\frac{n^2 + n - 2}{2} + \frac{2n - 2}{2}\right]
= \frac{n^2 + 3n - 4}{4}
= \frac{(n+4)(n-1)}{4}$$
Analyzing Exchanges
Best Case Exchanges
Best-case number of moves is 2n − 1: one move for the −∞ sentinel assignment at the top, then two moves per iteration of the for loop (A[i] into t, and t back into the same spot): 1 + 2(n − 1) = 2n − 1.
Worst Case Exchanges
Worst-case number of moves: two per iteration of the outer loop, plus the worst-case number of comparisons, minus the one comparison per iteration that evaluates false at the end of the inner loop (it causes no move), plus 1 for the top sentinel assignment: 2(n − 1) + COMP − (n − 1) + 1.
Note the shortcut: each iteration does exactly one more move than comparison. So take the value already derived for worst-case comparisons, add the number of loop iterations, plus 1 for the sentinel assignment:
$$\frac{(n+2)(n-1)}{2} + n$$
Average Case Exchanges
In the average case the same relationship holds: every comparison except the one false test per iteration comes with a move, so the analysis mirrors the worst case but uses the average comparison count already derived:
$$\frac{(n+4)(n-1)}{4} + n$$
Remark 4.1.
An important trick is to recognize that the moves are related to the comparisons 1 to 1 and that there
will be 1 time when the comparison evaluates false each iteration, so you subtract that from the final analysis.
5 Insertion Sort without Sentinel
Add another test to the loop so the A[0] sentinel is unnecessary, by checking that j has not run past index 1.
for i = 2 to n do
    t ← A[i]
    j ← i − 1
    while j > 0 and A[j] > t do
        A[j + 1] ← A[j]
        j ← j − 1
    end
    A[j + 1] ← t
end
Algorithm 6: Insertion Sort w/o Sentinel
A note about comparisons. We don’t count index value comparisons, because index values can easily be stored in
registers and won’t have the same cost to check as array value comparisons.
Best Case Comparisons
In the best case, the array is already sorted, so there will just be 1 comparison per iteration of the outer loop.
$$\sum_{i=2}^{n} 1 = n - 1$$
Worst Case Comparisons
In the worst case the array is reverse sorted, and every ith iteration of the loop must compare against all previous
(i 1) elements.
$$\sum_{i=2}^{n}\sum_{j=1}^{i-1} 1 = \sum_{i=2}^{n}(i-1)
= \sum_{i=2}^{n} i - \sum_{i=2}^{n} 1
= \left[\frac{n(n+1)}{2} - 1\right] - (n-1)
= \frac{n(n+1)}{2} - n
= \frac{n^2 - n}{2}
= \frac{n(n-1)}{2}$$
Average Case Comparisons
Remark 5.1. The HARMONIC SERIES and the brick problem: how far out can you stack bricks on top of one another over an edge? Arbitrarily far out (but not infinitely far out).
$$H_n = \sum_{i=1}^{n}\frac{1}{i} = 1 + \frac12 + \frac13 + \frac14 + \frac15 + \cdots + \frac1n \approx \ln n + 1$$
$$H_n - 1 = \sum_{i=2}^{n}\frac{1}{i} = \frac12 + \frac13 + \frac14 + \frac15 + \cdots + \frac1n \approx \ln n$$
Harmonic series come up again and again because of the idea that something has a 1/i chance of happening in the ith iteration.
What are we saving when we don't have a sentinel? When the ith element is the smallest of the first i elements, the while loop descends all the way to index 1 of the list; without a sentinel, we skip the one extra comparison against A[0]. So we save 1 comparison only when the ith element is the smallest seen thus far. The probability that the current ith element is the smallest in the prefix is 1/i, and the value when that occurs is 1. Over all n elements, the sentinel would on average have cost us
$$\sum_{i=2}^{n}\frac{1}{i} = H_n - 1$$
comparisons. So the average-case comparison count without the sentinel is the sentinel version minus this saving:
$$\frac{(n-1)(n+4)}{4} - (H_n - 1) \approx \frac{(n-1)(n+4)}{4} - \ln n.$$
Exchanges
Removing the sentinel from insertion sort adds no new exchanges, nor does it change when any exchange happens. Best, worst, and average case exchange counts are therefore unchanged.
6 Selection Sort
Summary: Find biggest element, put at bottom of list, recurse for sub array.
for i = n downto 2 do
    k ← 1
    for j = 2 to i do
        if A[j] > A[k] then
            k ← j
        end
    end
    swap(A[k], A[i])
end
Algorithm 7: Selection Sort
Best Case Comparisons
The comparison is always run, so the comparison value should be the same for each case.
$$\sum_{i=2}^{n}\sum_{j=2}^{i} 1 = \sum_{i=2}^{n}(i-1) = \frac{(n-1)\,n}{2}$$
Exchanges
Exchanges don’t occur inside the conditional, so exchanges number is simply n-1.
How many times do we exchange a number with itself? How many times does i=k?
On average, $\sum_i \frac{1}{i} = H_n$ times.
7 Merge Sort
Divide and Conquer. Recursively split array down to 1, then merge sorted sublists. First call of Mergesort is (A,1,n)
MERGESORT(A, p, r)
// p and r are indices into A defining the subarray
if p < r then
    q ← ⌊(p + r)/2⌋
    MERGESORT(A, p, q)
    MERGESORT(A, q + 1, r)
    MERGE(A, (p, q), (q + 1, r))
end
Algorithm 8: Merge Sort
Merge Comparison analysis
Equal length arrays
Merging two sorted lists of the same size n is linear. The worst-case number of comparisons during a merge is 2n − 1; the best case is n, which happens when one array is entirely smaller than the other.
You can't do better than 2n − 1 in the worst case, because the worst-case situation is two lists whose values interleave; there the granularity of difference is a single value, so every value must be checked. This gives very tight bounds and tells us essentially everything there is to know about merging.
Different length arrays
For arrays of sizes m and n with m ≤ n, the worst case is m + n − 1 comparisons.
When m ≪ n you can get close to log n comparisons per element of the smaller array (about m log n total) using binary search; but when m is close to n you are much closer to the m + n − 1 worst case.
Mergesort Comparison Analysis
Exact number of comparisons during a merge of A[p..q] with A[q+1..r]:
$$(q - p + 1) + \bigl(r - (q+1) + 1\bigr) - 1 = r - p$$
comparisons, for the subarray starting at index p and ending at index r. So for the full list, where p = 1 and r = n, MERGESORT starts with n − 1 comparisons at the top-level merge and has the following behavior. (Recursion-tree figure: level i contains 2^i merges of subarrays of size n/2^i, each costing n/2^i − 1 comparisons, down to merges of single elements, which cost 0.) Summing level by level:
$$\sum_{i=0}^{\lg n} 2^i\left(\frac{n}{2^i} - 1\right)$$
Because this gives lg n + 1 terms, we can make it a little easier by dropping the bottom row, which sums to 0 anyway (it contributes n · 0):
$$\sum_{i=0}^{\lg n - 1} 2^i\left(\frac{n}{2^i} - 1\right)
= \sum_{i=0}^{\lg n - 1} (n - 2^i)
= \sum_{i=0}^{\lg n - 1} n - \sum_{i=0}^{\lg n - 1} 2^i$$
$$= n\lg n - \bigl(2^{\lg n} - 1\bigr) \qquad\text{geometric series}$$
$$= n\lg n - (n - 1) = n\lg n - n + 1$$
Merge Sort Recurrence
When expressed as a recurrence relation, the mergesort algorithm given previously behaves like
$$S(n) = S\!\left(\frac{n}{2}\right) + S\!\left(\frac{n}{2}\right) + n - 1 = 2\,S\!\left(\frac{n}{2}\right) + n - 1,$$
or, being careful about odd sizes,
$$S(0) = S(1) = 0, \qquad S(n) = S\!\left(\left\lfloor\frac{n}{2}\right\rfloor\right) + S\!\left(\left\lceil\frac{n}{2}\right\rceil\right) + n - 1.$$
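To make the count concrete, here is a small Python sketch (my own naming) that merges with explicit comparison counting and checks the n lg n − n + 1 formula for a power-of-two worst case.

```python
import math

def merge_sort_comparisons(A):
    """Sort A, returning (sorted list, number of element comparisons)."""
    if len(A) <= 1:
        return list(A), 0
    mid = len(A) // 2
    left, cl = merge_sort_comparisons(A[:mid])
    right, cr = merge_sort_comparisons(A[mid:])
    merged, comps, i, j = [], 0, 0, 0
    while i < len(left) and j < len(right):
        comps += 1                      # one comparison per element emitted here
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]      # leftovers are copied without comparisons
    return merged, cl + cr + comps

def worst_case(vals):
    """Arrange sorted values so that every merge interleaves (maximum comparisons)."""
    if len(vals) <= 1:
        return list(vals)
    return worst_case(vals[0::2]) + worst_case(vals[1::2])

n = 16
_, c = merge_sort_comparisons(worst_case(list(range(1, n + 1))))
print(c, n * int(math.log2(n)) - n + 1)   # 49 49
```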
8 Integer arithmetic
Addition
Just like in elementary-school addition, the time to add two n-digit numbers is proportional to the number of digits: Θ(n). Since each digit must be used to compute the sum, this is the right order even for the best case.

      1 1
    1 2 3 4
  + 5 6 7 8
  ---------
    6 9 1 2

We assume each single-digit addition takes constant time (carries included, so we ignore that little trouble spot) and label this constant α.
Problem size
If we were attempting to determine whether a number is prime, we would divide it by consecutive candidates up to the square root of that number. In computer science we want to state costs in terms of the problem size, i.e. the NUMBER OF DIGITS, not the magnitude of the number. The trial-division primality test is about √N divisions where N is the number; with n binary digits, N ≈ 2^n, so √N = 2^{n/2} and the test is Θ(2^{n/2}): an exponential-time algorithm in the input size.
Multiplication
The elementary-school approach to multiplication runs in Θ(n²) time, because every digit on top is multiplied by every digit on the bottom.

        1 2 3 4
      × 5 6 7 8
      ---------
        9 8 7 2
      8 6 3 8
    7 4 0 4
  6 1 7 0
  -------------
  7 0 0 6 6 5 2

By memorizing the tables for a larger base, say base 100, two-digit decimal numbers can be treated as one-digit numbers in base 100, which makes the multiplication easier because there are fewer steps.
To generalize this for n-digit numbers, treat them as numbers in base 10^{n/2} and cut each in half; each half has n/2 digits, and we know the base. Then recurse with this method until we reach a base whose table we really have memorized, multiply the halves as one-digit numbers in that base, and combine the answers.
Write the two n-digit numbers as ab and cd, where a, b, c, d are the n/2-digit halves. Then
$$ab \times cd = ac\cdot 10^{n} + (ad + bc)\cdot 10^{n/2} + bd,$$
so the product is built from the four half-size products ac, ad, bc, bd (each an n-digit number) plus additions. When adding, bd is not shifted, ad and bc are shifted by n/2 digits, and ac is shifted by n digits. Since there is no overlap between ac and bd, they can be combined simply by concatenation. Computing ad + bc takes αn time, and adding that value into the center (overlapping) segment of the concatenation of ac and bd takes another αn. Total add time: 2αn.
So, without writing out the algorithm in full:
$$M(n) = 4\,M\!\left(\frac{n}{2}\right) + 2\alpha n$$
which gives us a nice recursion to work on. Let's assume the time to multiply in the base case is µ; the base case is when both numbers are one digit long, M(1) = µ. Which is fun; I'm having fun.
(Recursion-tree figure: level i has 4^i subproblems, each costing 2α·n/2^i; the 4^{lg n} leaves each cost µ.)
$$\text{atomic actions} = \sum_{i=0}^{\lg n - 1} 4^i\left(2\alpha\frac{n}{2^i}\right) + 4^{\lg n}\mu
= 2\alpha n \sum_{i=0}^{\lg n - 1} \frac{4^i}{2^i} + n^{\lg 4}\mu
= 2\alpha n \sum_{i=0}^{\lg n - 1} 2^i + n^{2}\mu$$
$$= 2\alpha n\,\bigl(2^{\lg n} - 1\bigr) + n^2\mu \qquad\text{geometric series}$$
$$= 2\alpha n\,(n - 1) + n^2\mu = 2\alpha\bigl(n^2 - n\bigr) + \mu n^2$$
We are still doing about n² digit multiplications and n² digit additions, so our application of recursion didn't improve the running time of the algorithm. There's a better way!
Better Multiplication
By the distributive law, (a + b)(c + d) = ac + ad + bc + bd. We can use this property to improve the fast multiplication algorithm: it lets us trade one recursive multiplication for one extra addition and one subtraction, as we shall see.
$$w = (a + b)(c + d) = ac + ad + bc + bd \qquad\text{2 half-length additions, 1 multiplication}$$
$$u = ac \qquad\text{1 multiplication}$$
$$v = bd \qquad\text{1 multiplication}$$
$$z = w - (u + v) = ad + bc \qquad\text{1 addition and 1 subtraction}$$
By the same logic as the earlier multiplication algorithm, ac and bd are shifted by n digits relative to each other and can be concatenated, and z is added into the middle, shifted by n/2 digits. So computing the product takes 3 multiplications, 3 additions, and 1 subtraction; treating the subtraction as an addition, 4 additions. As such, the recurrence for this algorithm is
$$T(n) = 3\,T\!\left(\frac{n}{2}\right) + 4\alpha n, \qquad T(1) = \mu.$$
$$\sum_{i=0}^{\lg n - 1} 3^i\,4\alpha\frac{n}{2^i} + 3^{\lg n}\mu
= 4\alpha n \sum_{i=0}^{\lg n - 1}\left(\frac{3}{2}\right)^i + n^{\lg 3}\mu
= 4\alpha n\cdot\frac{(3/2)^{\lg n} - 1}{3/2 - 1} + n^{\lg 3}\mu$$
$$= 8\alpha n\left[\left(\frac{3}{2}\right)^{\lg n} - 1\right] + n^{\lg 3}\mu
= 8\alpha n\left[n^{\lg(3/2)} - 1\right] + n^{\lg 3}\mu
= 8\alpha n\left[n^{\lg 3 - \lg 2} - 1\right] + n^{\lg 3}\mu
= 8\alpha n\left[\frac{n^{\lg 3}}{n} - 1\right] + n^{\lg 3}\mu$$
$$= 8\alpha\bigl(n^{\lg 3} - n\bigr) + \mu\,n^{\lg 3}
\approx n^{\lg 3}\,[8\alpha + \mu]
\approx n^{1.585}\,[8\alpha + \mu]$$
This is O(n^{lg 3}) ≈ O(n^{1.585}), a better result than the previous algorithm.
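This scheme is Karatsuba's algorithm. Here is a minimal Python sketch operating directly on Python integers, splitting by decimal digits; the function and variable names are mine.

```python
def karatsuba(x, y, base=10):
    """Multiply non-negative integers x, y using 3 recursive multiplications."""
    if x < base or y < base:
        return x * y                      # one-digit base case ("memorized table")
    n = max(len(str(x)), len(str(y)))
    shift = base ** (n // 2)
    a, b = divmod(x, shift)               # x = a*shift + b
    c, d = divmod(y, shift)               # y = c*shift + d
    u = karatsuba(a, c)                   # ac
    v = karatsuba(b, d)                   # bd
    w = karatsuba(a + b, c + d)           # (a+b)(c+d)
    z = w - u - v                         # ad + bc, recovered without multiplying
    return u * shift * shift + z * shift + v

print(karatsuba(1234, 5678), 1234 * 5678)   # 7006652 7006652
```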
9 Recursion Master Formula
$$T(n) = a\,T\!\left(\frac{n}{b}\right) + c\,n^d, \qquad T(1) = f$$
Kruskal claims this is a better formula than the master theorem in the book, easier to use and more applicable. Let’s
solve using tree method.
(Recursion-tree figure: the root does cn^d work; level i has a^i nodes, each doing c(n/b^i)^d work, so level i sums to a^i c(n/b^i)^d; the a^{log_b n} leaves each cost f.) You get log_b n levels, and the branching factor is a. Summing level by level:
$$\sum_{i=0}^{\log_b n - 1} a^i\,c\left(\frac{n}{b^i}\right)^d + f\,a^{\log_b n}
= c n^d \sum_{i=0}^{\log_b n - 1} \frac{a^i}{(b^d)^i} + f\,n^{\log_b a}
= c n^d \sum_{i=0}^{\log_b n - 1} \left(\frac{a}{b^d}\right)^i + f\,n^{\log_b a}$$
$$= c n^d\,\frac{(a/b^d)^{\log_b n} - 1}{a/b^d - 1} + f\,n^{\log_b a}
= c n^d\,\frac{n^{\log_b(a/b^d)} - 1}{a/b^d - 1} + f\,n^{\log_b a}
= c n^d\,\frac{n^{\log_b a - d} - 1}{a/b^d - 1} + f\,n^{\log_b a}$$
$$= \frac{c\,n^{\log_b a} - c\,n^d}{a/b^d - 1} + f\,n^{\log_b a}
= \left[\frac{c}{a/b^d - 1} + f\right] n^{\log_b a} - \frac{c\,n^d}{a/b^d - 1}$$
This formula cannot apply when the ratio of the geometric sum would be 1; the geometric-series formula is not applicable in that case. So the equation above doesn't apply when a = b^d, and we need a special case. When the ratio r = 1, a geometric series becomes a simple summation of 1s; thus if a = b^d the formula becomes
$$c\,n^d \log_b n + f\,n^{\log_b a}.$$
In the special case of merge sort, we drop the constant −1 from the non-recursive part of the term and apply the special case to the n term; then we repeat the process for the −1 term (with the n dropped), and add the two results.
$$a \ne b^d: \quad T(n) = \left[\frac{c}{a/b^d - 1} + f\right] n^{\log_b a} - \frac{c\,n^d}{a/b^d - 1}
\qquad\qquad
a = b^d: \quad T(n) = c\,n^d \log_b n + f\,n^{\log_b a}$$
The relative sizes of these constant factors will overwhelm in differing circumstances.
$$a > b^d: \quad T(n) \approx \left[\frac{c}{a/b^d - 1} + f\right] n^{\log_b a} = \Theta\!\left(n^{\log_b a}\right)$$
$$a < b^d: \quad T(n) \approx \frac{c\,n^d}{1 - a/b^d} = \Theta\!\left(n^{d}\right)$$
$$a = b^d: \quad T(n) \approx c\,n^d \log_b n \text{ (because } d = \log_b a\text{)} = \Theta\!\left(n^{d}\log_b n\right)$$
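A small helper makes the three cases easy to apply mechanically; this Python sketch (my own) returns the dominant asymptotic term for given constants.

```python
import math

def master_formula(a, b, c, d, f):
    """Dominant term of T(n) = a*T(n/b) + c*n^d, T(1) = f, per the three cases."""
    crit = a / (b ** d)                     # ratio of the geometric series
    if crit > 1:                            # a > b^d: the leaves dominate
        coef = c / (crit - 1) + f
        return f"{coef:.3g} * n^{math.log(a, b):.3g}"
    if crit < 1:                            # a < b^d: the root dominates
        return f"{c / (1 - crit):.3g} * n^{d}"
    return f"{c} * n^{d} * log_{b}(n)"      # a = b^d: every level contributes equally

print(master_formula(4, 2, 1, 1, 1))   # naive multiplication: ~ 2 * n^2
print(master_formula(3, 2, 1, 1, 1))   # Karatsuba: ~ 3 * n^1.58
print(master_formula(2, 2, 1, 1, 0))   # mergesort n-term: 1 * n^1 * log_2(n)
```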
Applying to known algorithms
Multiplication
$$T(n) = 4\,T\!\left(\frac{n}{2}\right) + 2\alpha n, \quad T(1) = \mu
\qquad a = 4,\; b = 2,\; c = 2\alpha,\; d = 1,\; f = \mu$$
$$\left[\frac{2\alpha}{2 - 1} + \mu\right] n^{\lg 4} - \frac{2\alpha n}{2 - 1}
= (2\alpha + \mu)\,n^2 - 2\alpha n
= 2\alpha\bigl(n^2 - n\bigr) + \mu n^2
= \Theta(n^2)$$
Better Multiplication
$$T(n) = 3\,T\!\left(\frac{n}{2}\right) + 4\alpha n, \quad T(1) = \mu
\qquad a = 3,\; b = 2,\; c = 4\alpha,\; d = 1,\; f = \mu$$
$$\left[\frac{4\alpha}{3/2 - 1} + \mu\right] n^{\lg 3} - \frac{4\alpha n}{3/2 - 1}
= (8\alpha + \mu)\,n^{\lg 3} - 8\alpha n
= \Theta\bigl(n^{\lg 3}\bigr) \approx \Theta\bigl(n^{1.585}\bigr)$$
Mergesort
When we get
$$T(n) = 2\,T\!\left(\frac{n}{2}\right) + n - 1, \quad T(1) = 0:$$
n term: a = 2, b = 2, c = 1, d = 1, so a = b^d:
$$c\,n^d\log_b n + f\,n^{\log_b a} = 1\cdot n\,\log_2 n + 0 = n\lg n$$
−1 term: a = 2, b = 2, c = −1, d = 0, so a ≠ b^d:
$$\left[\frac{c}{a/b^d - 1} + f\right] n^{\log_b a} - \frac{c\,n^d}{a/b^d - 1}
= \left[\frac{-1}{2 - 1} + 0\right] n^{\lg 2} - \frac{(-1)\cdot n^0}{2 - 1}
= -n + 1$$
Adding the two results gives T(n) = n lg n − n + 1, matching the exact analysis.
Implementation details
This gives us the remarkable ability to plug the constants of a recurrence into a closed-form running time and compare it against another algorithm directly. There is also the Schönhage–Strassen algorithm, which multiplies in Θ(n log n log log n).
In real life, the size of T(1) is the word size of your machine, because the atomic multiplication is implemented in hardware at that level; so our atomic base is 2^64 on most machines.
10 Heapsort
Created by J. W. J. Williams; improved by Robert Floyd. The algorithm has two phases: Create Heap, then Finish.
Definition 10.1 (Heap). A binary tree in which every value is larger than its children, or equivalently, larger than all of its descendants. In this class we require that all such binary trees are full binary trees.
(Figure: an example heap on the values 10, 20, ..., 100, with 100 at the root and every parent larger than its children.)
Create Heap
Turn a tree into a heap.
The traditional way to create a heap is to insert at the end of the array and sift up. Robert Floyd created a better
way to create the heap. Treat the tree as a recursive heap: each parent is the parent to 2 heaps, and sift it from the
bottom up. Create heap in left, create heap on right, then sift root down, and move up a level.
Heap Creation Analysis
In a binary tree, most nodes are near bottom, so doing bottom up technique most work is done near bottom, and
because you’re near the bottom you don’t have far to go down. So we take advantage of that fact: Most elements are
near the bottom and don’t have far to go.
There are n/2 leaves, each doing 0 comparisons.
There are n/4 parents of leaves, each doing 2 comparisons.
There are n/8 grandparents of leaves, each doing 4 comparisons (2 per level).
There are n/16 great-grandparents of leaves, each doing 6 comparisons (2 per level).
$$\frac{n}{2}\cdot 0 + \frac{n}{4}\cdot 2 + \frac{n}{8}\cdot 4 + \frac{n}{16}\cdot 6 + \frac{n}{32}\cdot 8 + \frac{n}{64}\cdot 10 + \cdots
= n\left(\frac{1}{2} + \frac{2}{4} + \frac{3}{8} + \frac{4}{16} + \frac{5}{32} + \cdots\right)$$
The parenthesized sum, $\sum_{k\ge 1} k/2^k$, can be split into shifted geometric series:
$$\tfrac12 + \tfrac14 + \tfrac18 + \tfrac1{16} + \tfrac1{32} + \cdots = 1$$
$$\tfrac14 + \tfrac18 + \tfrac1{16} + \tfrac1{32} + \cdots = \tfrac12$$
$$\tfrac18 + \tfrac1{16} + \tfrac1{32} + \cdots = \tfrac14$$
and so on. Adding the rows gives 1 + 1/2 + 1/4 + ··· = 2, so the total is 2n.
So Heap creation does Θ(2n) comparisons
Finish
Put root node at the bottom of the array, as it must be the largest element, so it will go at the end of the sorted
array.
Then take the bottom right hand leaf and move to a tmp space.
Then sift, reordering the tree and put the tmp leaf in its proper spot.
Repeat.
Sift comparisons: each level costs 2 comparisons (pick the larger child, then compare it against tmp), and there are lg n levels, so one sift costs 2 lg n. This must be done for each element in the tree, so the finish phase has an upper bound of 2n lg n comparisons.
But the heap shrinks: each iteration removes an element. So let's sum from 0 to n − 1 (because we remove an element first, before sifting):
$$\sum_{i=0}^{n-1} 2\lg(i+1) = 2\sum_{i=1}^{n}\lg i
= 2\,[\lg 1 + \lg 2 + \lg 3 + \cdots + \lg n]
= 2\lg(1\cdot 2\cdot 3\cdots n)
= 2\lg(n!)$$
By Stirling's formula, $n! \approx \left(\frac{n}{e}\right)^n\sqrt{2\pi n}$, so
$$2\lg(n!) \approx 2\left[n\lg\frac{n}{e} + \frac12\lg(2\pi n)\right]
= 2n\lg n - 2n\lg e + \lg n + \lg(2\pi)
= 2n\lg n + O(n)$$
This isn't much better, which makes sense: in a full binary tree half the elements are leaves, so while the heap shrinks, it doesn't shrink very fast. Asymptotically you are still doing Θ(2n lg n) comparisons, plus some linear term.
Implementation Details
Use array to implement tree structure.
(Figure: the example heap stored in an array of 10 slots, indices 1 through 10, with a separate tmp cell.)

A node at index i has:
Left child: 2i
Right child: 2i + 1
Parent: ⌊i/2⌋

CREATE: the first parent is at index ⌊n/2⌋. Start there and sift down, working back toward the root, during heap creation. Siblings can be reached by adding or subtracting 1. The result is a created heap.
FINISH: pop the bottom element into tmp first, then move the heap root into the freed bottom spot, and sift tmp down from the root.
Optimization
Why is this result still worse than merge sort, Θ(2n lg n) vs Θ(n lg n)? Because we compare tmp against both children at every level, which doubles the number of comparisons. Instead, sift the hole left by the root all the way down to the bottom in lg n comparisons (one per level, comparing the two children), then put tmp in the hole and sift it back up into position.
How far will tmp need to go to sift back up and re-form the heap? Not far, since half the nodes are in the bottom layer, a quarter in the second layer, and so on. It takes 1 comparison to confirm tmp belongs in the bottom layer, 2 to check the layer above, ...
$$\frac12\cdot 1 + \frac14\cdot 2 + \frac18\cdot 3 + \frac1{16}\cdot 4 + \cdots = 2$$
giving about 2n extra comparisons on average over the whole sort. We can optimize further by binary searching up the path, which gives heapsort roughly n lg n + n lg lg n = Θ(n lg n) performance.
Pseudocode
Function Heap_Sort(A, n) is
    // Create Heap
    for r = ⌊n/2⌋ downto 1 do
        Sift(r, n, A[r])
    end
    // Finish Sort
    for m = n downto 2 do
        s ← A[m]
        A[m] ← A[1]
        Sift(1, m − 1, s)
    end
end
Function Sift(r, n, s) is
    // r: root index, n: size index, s: sift value, p: parent index, c: child index
    p ← r
    while 2p ≤ n do
        if 2p < n then
            if A[2p] ≥ A[2p + 1] then
                c ← 2p
            else
                c ← 2p + 1
            end
        else
            c ← 2p
        end
        if A[c] > s then
            A[p] ← A[c]
            p ← c
        else
            break
        end
    end
    A[p] ← s
end
Algorithm 9: Heap Sort
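For reference, here is a direct Python translation of Algorithm 9, using 1-indexed positions inside a padded list; the helper names are mine.

```python
def sift(A, r, n, s):
    """Sift value s down from index r in the heap A[1..n] (A[0] unused)."""
    p = r
    while 2 * p <= n:
        c = 2 * p
        if c < n and A[c + 1] > A[c]:   # pick the larger of the two children
            c += 1
        if A[c] > s:
            A[p] = A[c]                 # pull the child up; the hole moves down
            p = c
        else:
            break
    A[p] = s

def heap_sort(A):
    A = [None] + list(A)                # shift to 1-indexed storage
    n = len(A) - 1
    for r in range(n // 2, 0, -1):      # create heap, bottom-up (Floyd)
        sift(A, r, n, A[r])
    for m in range(n, 1, -1):           # finish: move the max to the end, shrink heap
        s = A[m]
        A[m] = A[1]
        sift(A, 1, m - 1, s)
    return A[1:]

print(heap_sort([60, 30, 100, 50, 10, 70, 90, 20, 80, 40]))
```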
11 Finding Bounds to summation functions
Mathematical Preliminaries
Geometric Series
For |r| < 1, recall the infinite geometric series:
$$\sum_{i=0}^{\infty} r^i = \frac{1}{1 - r}$$
and for any r ≠ 1, the finite geometric series:
$$\sum_{i=0}^{n-1} r^i = \frac{1 - r^n}{1 - r} = \frac{r^n - 1}{r - 1}$$
How do we solve the following, or at least get a reasonable estimate? Note that $i\,r^{i-1} = (r^i)'$, and the sum of derivatives is the derivative of the sum:
$$\sum_{i=0}^{\infty} i\,r^i = \sum_{i=1}^{\infty} i\,r^i
= r\sum_{i=1}^{\infty} i\,r^{i-1}
= r\sum_{i=1}^{\infty} \bigl(r^i\bigr)'
= r\left(\sum_{i=1}^{\infty} r^i\right)'
= r\left(\frac{1}{1-r} - 1\right)'
= r\cdot\frac{1}{(1-r)^2}
= \frac{r}{(1-r)^2}$$
How can we check to confirm? We've already calculated the r = 1/2 case, where the infinite sum is 2:
$$\frac{1/2}{(1 - 1/2)^2} = 2$$
Now for the finite series:
$$\sum_{i=0}^{n-1} i\,r^i = \sum_{i=1}^{n-1} i\,r^i
= r\sum_{i=1}^{n-1} i\,r^{i-1}
= r\sum_{i=1}^{n-1} \bigl(r^i\bigr)'
= r\left(\sum_{i=1}^{n-1} r^i\right)'
= r\left(\frac{r^n - 1}{r - 1} - 1\right)'$$
$$= r\cdot\frac{n r^{n-1}(r-1) - (r^n - 1)}{(r-1)^2}
= r\cdot\frac{n r^n - n r^{n-1} - r^n + 1}{(r-1)^2}
= r\cdot\frac{(n-1)r^n - n r^{n-1} + 1}{(r-1)^2}
= \frac{(n-1)r^{n+1} - n r^n + r}{(r-1)^2}$$
Gauss’s Sum
$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$$
Without knowing the answer already, we can see an easy upper bound: if every term is bounded by the largest element, the sum can't be larger than n². We can also get a lower bound by flooring every term to the smallest value, 1; since there are n of them, the result must be at least n. So n ≤ sum ≤ n².
Next we can try to split the sum in half and floor each half to its smallest term:
$$\text{SUM} = \underbrace{1 + 2 + \cdots + \tfrac{n}{2}}_{\text{each} \ge 1} + \underbrace{\left(\tfrac{n}{2}+1\right) + \cdots + n}_{\text{each} \ge \tfrac{n}{2}+1}
\ge \frac{n}{2}\cdot 1 + \frac{n}{2}\left(\frac{n}{2} + 1\right)
= \frac{n}{2}\left[1 + \frac{n}{2} + 1\right]
= \frac{n}{2}\cdot\frac{n+4}{2}
= \frac{n^2}{4} + n$$
Harmonic Sum
$$H_n = \sum_{i=1}^{n}\frac{1}{i} = 1 + \frac12 + \frac13 + \frac14 + \frac15 + \cdots + \frac1n \approx \ln n$$
We can approximate it using the same grouping technique we applied to Gauss's sum, rounding each group of terms up to its largest member:
$$H_n \le 1 + \left[\frac12 + \frac12\right] + \left[\frac14 + \frac14 + \frac14 + \frac14\right] + \cdots
= 1 + 1 + 1 + \cdots + 1 \qquad\text{(one 1 per group)}$$
If k is the number of groups, the group sizes summed must equal n:
$$\sum_{i=0}^{k-1} 2^i = n \;\Rightarrow\; 2^k - 1 = n \;\Rightarrow\; 2^k = n + 1 \;\Rightarrow\; k = \lg(n+1)$$
$$H_n \le \lg(n+1)$$
Using continuous math to solve discrete math
Assume a function f defined from m to n. The discrete sum (a Riemann sum of unit-width rectangles) is bounded by integrals of the function:
$$\int_{m-1}^{n} f(x)\,dx \;\le\; \sum_{i=m}^{n} f(i) \;\le\; \int_{m}^{n+1} f(x)\,dx \qquad\text{provided } f(x) \text{ is increasing,}$$
$$\int_{m}^{n+1} f(x)\,dx \;\le\; \sum_{i=m}^{n} f(i) \;\le\; \int_{m-1}^{n} f(x)\,dx \qquad\text{provided } f(x) \text{ is decreasing.}$$
Gauss’s Sum
$$\int_{0}^{n} x\,dx \;\le\; \sum_{i=1}^{n} i \;\le\; \int_{1}^{n+1} x\,dx$$
$$\left.\frac{x^2}{2}\right|_{0}^{n} \;\le\; \sum_{i=1}^{n} i \;\le\; \left.\frac{x^2}{2}\right|_{1}^{n+1}$$
$$\frac{n^2}{2} \;\le\; \sum_{i=1}^{n} i \;\le\; \frac{(n+1)^2}{2} - \frac12 = \frac{n^2 + 2n}{2} = \frac{n(n+2)}{2}$$
Harmonic Sum
$$\int_{1}^{n+1} \frac{1}{x}\,dx \;\le\; \sum_{i=1}^{n}\frac{1}{i} \;\le\; \int_{0}^{n} \frac{1}{x}\,dx$$
$$\ln(n+1) - \ln 1 \;\le\; H_n \;\le\; \ln n - \ln 0$$
$$\ln(n+1) \;\le\; H_n \;\le\; \ln n + \infty$$
"Less than infinity" isn't helpful. Let's avoid it by pulling the i = 1 term out of the sum, so we never integrate 1/x from 0 to 1:
$$H_n \;\le\; 1 + \int_{1}^{n}\frac{1}{x}\,dx = 1 + \ln n - \ln 1 = \ln n + 1$$
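These bounds are easy to sanity-check numerically; a quick Python snippet (mine):

```python
import math

for n in (10, 100, 1000, 10**6):
    H = sum(1 / i for i in range(1, n + 1))
    # ln(n+1) <= H_n <= ln(n) + 1, per the integral bounds above
    print(n, round(math.log(n + 1), 4), round(H, 4), round(math.log(n) + 1, 4))
```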
12 Order Notation
Did you know that some algorithms can be faster than others? It’s true!
https://en.wikipedia.org/wiki/Time_complexity
Name                          | Running time T(n)       | Examples of running times
Constant                      | O(1)                    | 10
Iterated logarithmic          | O(log* n)               |
Log-logarithmic               | O(log log n)            |
Logarithmic                   | O(log n)                | log n, log(n^2)
Polylogarithmic               | poly(log n)             | (log n)^2
Fractional power              | O(n^c), 0 < c < 1       | n^(1/2), n^(2/3)
Linear                        | O(n)                    | n
n log-star n                  | O(n log* n)             |
Linearithmic                  | O(n log n)              | n log n, log n!
Quadratic                     | O(n^2)                  | n^2
Cubic                         | O(n^3)                  | n^3
Polynomial                    | 2^{O(log n)} = poly(n)  | n, n log n, n^10
Quasi-polynomial              | 2^{poly(log n)}         | n^{log log n}, n^{log n}
Exponential (linear exponent) | 2^{O(n)}                | 1.1^n, 10^n
Exponential                   | 2^{poly(n)}             | 2^n, 2^{n^2}
Factorial                     | O(n!)                   | n!
Double exponential            | 2^{2^{poly(n)}}         | 2^{2^n}
Example:
$$17n^3 + 24n^2 - 6n + 8 = 17n^3 + O(n^2) = 17n^3 + o(n^3) = \Theta(n^3)$$
Order notation can hide simple truths, like a large constant on a lower-order term that seriously affects the running time for small n. Little o means the function is strictly smaller than the bound: o(n³) could be O(n^{2.9998}), which could throw off all of our calculations.
Equivalence   Function Order Set
     =             Θ
     ≤             O
     ≥             Ω
     <             o
     >             ω

Informally, f(n) = Θ(g(n)) when
$$\lim_{n\to\infty}\frac{f(n)}{g(n)} = c > 0,
\qquad\text{e.g.}\quad
\lim_{n\to\infty}\frac{7n^2 + 3n - 8}{4n^2 + 20n + 4} = \frac{7}{4}.$$
$$\lim_{n\to\infty}\frac{f(n)}{g(n)} = c \ge 0 \;\Rightarrow\; f(n) \in O(g(n))
\qquad\qquad
\lim_{n\to\infty}\frac{g(n)}{f(n)} = c \ge 0 \;\Rightarrow\; f(n) \in \Omega(g(n))$$
$$\lim_{n\to\infty}\frac{f(n)}{g(n)} = 0 \;\Rightarrow\; f(n) \in o(g(n))
\qquad\qquad
\lim_{n\to\infty}\frac{g(n)}{f(n)} = 0 \;\Rightarrow\; f(n) \in \omega(g(n))$$
What about non-polynomial functions, like f(n) = n²·(10 + sin n)? The book has a non-limit-based definition, which gives us a nice way of addressing oscillating functions.
To a first approximation you can do plain algebra using Theta notation: $2^{\Theta(n^2)}\cdot 2^{\Theta(n^3)} = 2^{\Theta(n^2 + n^3)}$.
What's faster, $n^{\lg n}$ or $(\lg n)^n$? Exponentials grow much faster than polynomials; formally:
$$n^a = o(b^n) \text{ for } a \ge 0,\, b > 1
\qquad (\log n)^a = o(n^b) \text{ for } b > 0
\qquad (\log\log n)^a = o\bigl((\log n)^b\bigr)$$
$$\lim_{n\to\infty}\frac{n^{\lg n}}{(\lg n)^n}
= \lim_{n\to\infty}\frac{2^{\lg n\cdot\lg n}}{2^{n\lg(\lg n)}}
= \lim_{n\to\infty} 2^{\lg^2 n - n\lg\lg n}
= 0
\quad\Rightarrow\quad n^{\lg n} = o\bigl((\lg n)^n\bigr)$$
Polylogs grow slower than polynomials, and quasi-polynomials grow slower than exponentials with variable bases.
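A quick numeric comparison (mine) shows how fast the gap opens between n^{lg n} and (lg n)^n, by comparing their base-2 logarithms:

```python
import math

for n in (2**4, 2**6, 2**8, 2**10):
    lg = math.log2(n)
    # lg(n^{lg n}) = lg^2 n   vs   lg((lg n)^n) = n * lg(lg n)
    print(n, round(lg * lg, 1), round(n * math.log2(lg), 1))
```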
13 Quicksort
An efficient sorting algorithm developed by Tony Hoare in 1959 while trying to translate Russian. Recursively
partition the array, put small numbers to the left and large numbers to the right.
Function QUICKSORT(A, p, r) is
    if p < r then
        q ← PARTITION(A, p, r)
        QUICKSORT(A, p, q − 1)
        QUICKSORT(A, q + 1, r)
    end
end
Function PARTITION(A, p, r) is
    x ← A[r]
    i ← p − 1
    for j = p to r − 1 do
        if A[j] ≤ x then
            i ← i + 1
            Exchange A[i] ↔ A[j]
        end
    end
    i ← i + 1
    Exchange A[i] ↔ A[r]
    return i
end
Comparison Analysis
First note that PARTITION is always linear in comparisons: on n = r − p + 1 elements, the for loop performs exactly n − 1 comparisons, regardless of the outcome of the if condition.
Quicksort Worst case comparisons
The worst case for PARTITION occurs when the pivot doesn't move, so each recursive call to QUICKSORT is only one element smaller than the previous one; partitioning n elements costs n − 1 comparisons. The worst case occurs when the pivot is always at the far end of the array:
$$T(n) = T(n-1) + n - 1, \quad T(1) = 0$$
$$= (n-1) + (n-2) + (n-3) + \cdots + 3 + 2 + 1
= \sum_{i=1}^{n-1} i
= \frac{(n-1)\,n}{2}$$
Quicksort Best case comparisons
The best case occurs when PARTITION moves the pivot to the middle of the array, so that the recursive calls to QUICKSORT are on about (n−1)/2 elements each:
$$T(n) = 2\,T\!\left(\frac{n-1}{2}\right) + n - 1, \quad T(1) = 0
\;\approx\; 2\,T\!\left(\frac{n}{2}\right) + n - 1
= n\lg n - n + 1
= O(n\lg n)$$
Proof by Constructive Induction
Mathematical induction is great at proving an answer is true if you already know it. But if you don't know the answer for sure, you can make an educated guess about its form and use induction to derive the constants. We know the answer to the worst-case comparisons of quicksort is quadratic, but we don't know the coefficients.
For example, let's prove 1 + 2 + 3 + ··· + n is quadratic. Guess
$$\sum_{i=1}^{n} i = an^2 + bn + c.$$
Base case n = 1: $a + b + c = 1$.
Inductive hypothesis: $\sum_{i=1}^{n-1} i = a(n-1)^2 + b(n-1) + c$.
$$\sum_{i=1}^{n} i = n + \sum_{i=1}^{n-1} i
= n + a(n-1)^2 + b(n-1) + c
= n + a(n^2 - 2n + 1) + b(n-1) + c
= an^2 + (-2a + b + 1)n + (a - b + c)$$
If our guess was right, matching coefficients gives a system of simultaneous equations we can solve:
$$a = a, \qquad b = -2a + b + 1, \qquad c = a - b + c
\;\Rightarrow\; a = \tfrac12,\; a = b,\; c = 0$$
$$\sum_{i=1}^{n} i = \frac12 n^2 + \frac12 n = \frac{n(n+1)}{2}$$
Quicksort Worst Case from Recurrence
$$T(n) = T(n-1) + n - 1, \quad T(1) = 0. \qquad\text{Guess that the result is quadratic: } an^2 + bn + c$$
Base: n = 1, so $a + b + c = 0$.
Inductive hypothesis: assume $T(n-1) = a(n-1)^2 + b(n-1) + c$.
$$T(n) = T(n-1) + n - 1
= a(n-1)^2 + b(n-1) + c + n - 1
= a(n^2 - 2n + 1) + b(n-1) + c + n - 1
= an^2 + (-2a + b + 1)n + (a - b + c - 1)$$
Find the coefficients:
$$a + b + c = 0, \qquad -2a + b + 1 = b, \qquad a - b + c - 1 = c
\;\Rightarrow\; a = \tfrac12,\; b = -\tfrac12,\; c = 0$$
$$T(n) = \frac12 n^2 - \frac12 n = \frac{n(n-1)}{2}$$
Quicksort Best Case from Recurrence
$$T(n) = 2\,T(n/2) + n - 1, \quad T(1) = 0. \qquad\text{Guess an upper bound: } T(n) \le an\lg n$$
Base: n = 1, $a\cdot 1\cdot\lg 1 = 0$.
Inductive hypothesis: by strong induction, assume $T(k) \le ak\lg k$ for all k < n.
$$T(n) = 2\,T(n/2) + n - 1
\le 2\left[a\cdot\frac{n}{2}\cdot\lg\frac{n}{2}\right] + n - 1
= an[\lg n - \lg 2] + n - 1
= an\lg n + (1 - a)n - 1$$
To be at most an lg n (drop the −1, which only helps), we need $(1 - a) \le 0$, i.e. $a \ge 1$. Any constant a ≥ 1 satisfies the recurrence; the best-case value uses a = 1:
$$T(n) \le 1\cdot n\lg n = n\lg n$$
Average case analysis of quicksort from approximate recurrence.
The best case occurs when the pivot is in the middle, while the worst case occurs when the pivot is at the end. Let's approximate the average case by assuming the pivot always lands at the quarter mark:
$$T(n) = T\!\left(\frac{3n}{4}\right) + T\!\left(\frac{n}{4}\right) + n - 1, \quad T(1) = 0$$
Let's guess optimistically that the upper bound is T(n) ≤ an lg n.
Base case n = 1: $a\cdot 1\cdot\lg 1 = 0 \le 0$. ✓
Inductive hypothesis: by strong induction, assume $T(k) \le ak\lg k$ for k < n.
$$T(n) = T\!\left(\frac{3n}{4}\right) + T\!\left(\frac{n}{4}\right) + n - 1
\le a\frac{n}{4}\lg\frac{n}{4} + a\frac{3n}{4}\lg\frac{3n}{4} + n - 1$$
$$= a\frac{n}{4}(\lg n - \lg 4) + \frac{3an}{4}(\lg n - \lg 4 + \lg 3) + n - 1
= a\frac{n}{4}\lg n - \frac{an}{2} + \frac{3an}{4}\lg n - \frac{3an}{2} + \lg 3\cdot\frac{3an}{4} + n - 1$$
$$= an\lg n + \left(-2a + \frac{3a\lg 3}{4} + 1\right)n - 1$$
For the bound to hold we need
$$-2a + \frac{3a\lg 3}{4} + 1 \le 0
\;\Longleftrightarrow\;
\left(\frac{3\lg 3}{4} - 2\right)a \le -1
\;\Longleftrightarrow\;
a \ge \frac{1}{2 - \frac{3\lg 3}{4}} \approx 1.23$$
$$T(n) \le \left(\frac{1}{2 - \frac{3\lg 3}{4}}\right) n\lg n \approx 1.23\,n\lg n$$
Average case analysis of quicksort from exact recurrence.
When we choose a pivot element, the pivot ends up at some index q, which we don't know in advance. Since each rank is equally likely, the probability of choosing a particular index is 1/n; then you have to handle the group on the left and the group on the right, and sum over all n indices. Solving this will require everything we know:
$$T(n) = \sum_{q=1}^{n}\frac{1}{n}\bigl(T(q-1) + T(n-q)\bigr) + n - 1
= \frac{1}{n}\sum_{q=1}^{n} T(q-1) + \frac{1}{n}\sum_{q=1}^{n} T(n-q) + n - 1$$
The first is the sum of T from 0 to n−1 and the other is the same sum taken downward from n−1 to 0, so we get two equal sums of T:
$$T(n) = \frac{2}{n}\sum_{q=0}^{n-1} T(q) + n - 1$$
Now apply strong constructive induction; guess an lg n.
Base n = 1: $a\cdot 1\cdot\lg 1 = 0$. ✓  Inductive hypothesis: for 1 ≤ k < n, T(k) ≤ ak lg k.
$$T(n) = \frac{2}{n}\sum_{q=1}^{n-1} T(q) + n - 1
\le \frac{2}{n}\sum_{q=1}^{n-1} aq\lg q + n - 1
= \frac{2a}{n}\sum_{q=1}^{n-1} q\lg q + n - 1
\le \frac{2a}{n}\int_{1}^{n} x\lg x\,dx + n - 1$$
$$= \frac{2a}{n}\left[\frac{x^2\lg x}{2} - \frac{x^2\lg e}{4}\right]_{1}^{n} + n - 1
= \frac{2a}{n}\left[\frac{n^2\lg n}{2} - \frac{n^2\lg e}{4} + \frac{\lg e}{4}\right] + n - 1$$
$$= an\lg n - \frac{an\lg e}{2} + \frac{a\lg e}{2n} + n - 1
= an\lg n + \left(1 - \frac{a\lg e}{2}\right)n - 1 + \frac{a\lg e}{2n}$$
We need $1 - \frac{a\lg e}{2} \le 0$, i.e. $a \ge \frac{2}{\lg e} = 2\ln 2 \approx 1.39$. Choosing a = 2/lg e, the last term is a number always at most 1, which is dominated by the −1, and the linear term vanishes:
$$T(n) \le 1.39\,n\lg n = \frac{2}{\lg e}\,n\lg n = 2n\ln n$$
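The behavior is easy to observe experimentally. This Python sketch (mine) uses a simplified functional quicksort that performs the same n − 1 comparisons per partition and checks the measured average against the 1.39 n lg n bound derived above.

```python
import random, math

def quicksort_comparisons(A):
    """Number of element comparisons quicksort makes on list A (last-element pivot)."""
    if len(A) <= 1:
        return 0
    pivot = A[-1]
    left  = [x for x in A[:-1] if x <= pivot]   # the n-1 comparisons happen here
    right = [x for x in A[:-1] if x > pivot]
    return (len(A) - 1) + quicksort_comparisons(left) + quicksort_comparisons(right)

n, trials = 1000, 50
avg = sum(quicksort_comparisons(random.sample(range(n), n)) for _ in range(trials)) / trials
print(avg, 1.39 * n * math.log2(n))   # the measured average stays below the derived bound
```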
New Book method
What is the probability that the smallest and largest elements, $(x_1, x_n)$, are ever compared? It is $\frac{2}{n}$: if you pick the largest or the smallest as the pivot, it is compared against every other element, including the other extreme; but if you don't pick an extreme, the two will be partitioned away from each other and never compared. So on average the number of comparisons between the smallest and largest is also $\frac{2}{n}$.
What is the probability that $x_i$ and $x_j$ (i < j, in sorted order) are compared? If you choose a pivot smaller than $x_i$, the two end up together on the large side; if you choose a pivot larger than $x_j$, they end up together on the small side. The only way they are never compared is if some pivot strictly between them is chosen before either of them; out of the j − i + 1 candidates in their range, only the two endpoints lead to a comparison, so the probability is $\frac{2}{j - i + 1}$.
Two elements adjacent in sorted order are always compared: $\frac{2}{(i+1) - i + 1} = 1$. Otherwise how would you know their order?
$$T(n) = \sum_{i=1}^{n}\sum_{j=i+1}^{n}\frac{2}{j - i + 1}
= 2\sum_{i=1}^{n}\sum_{j=i+1}^{n}\frac{1}{j - i + 1}
= 2\sum_{i=1}^{n}\bigl(H_{n-i+1} - 1\bigr)
= 2\sum_{i=1}^{n}\bigl(H_i - 1\bigr)
= 2\sum_{i=1}^{n} H_i - 2\sum_{i=1}^{n} 1$$
The big important insight is that you can calculate, for any two elements, the probability that they are ever compared.
Is Quicksort In place?
In-place algorithms use a constant amount of extra space, not a function of the number of elements n. Because quicksort is recursive, it pushes more variables onto the stack (multiple copies of p, q, and r), so it uses Θ(1) extra memory for algorithm variables plus Θ(log n) stack space on average. There isn't really a formal definition of in place; informally it means the algorithm doesn't use a lot of extra space.
One way to avoid using much extra space is to check which side of a partition is smaller and do that side of the
partition first.
Randomized pivot selection
You can use a random pivot element as seen in the book.
Median-of-3 Pivot
You can make sure that you’ve got a good pivot by picking 3 elements randomly and using their median as the pivot.
This gives you a much better chance of pivoting near the middle.
Small Size Optimization
In practice, when quicksort gets down to 10 to 20 elements, many implementations switch to a more efficient algorithm for small inputs, such as insertion sort, to sort the small groups of elements.
14 Algorithm Strengths and Weaknesses
Algorithm  | Comparisons  | Moves      | Spatial Locality | In Place | Good Worst Case
Merge Sort | n lg n       | unknown    | yes              | no       | yes
Heap Sort  | n lg n       | n lg n     | no               | yes      | yes
Quick Sort | 1.39 n lg n  | ~ ½ n lg n | yes              | yes      | no
Memory Hierarchy
The memory hierarchy allows further, machine-specific optimizations: doing a lot of work in fast memory is preferable to waiting to fetch data from slower memory. Merge sort and quicksort both take advantage of spatial locality, since they sub-sort the array in contiguous pieces that fit into caches. Heap sort does not: its sifting mechanism jumps all over the tree, so it lacks spatial locality. Merge sort also has the problem that it isn't in place, so it uses a lot of extra memory.
15 Linear Sorting Algorithms
Re-evaluating Sort
All of our algorithms have thus far been strongly dependent upon comparisons. Let’s take a closer look at that.
Find biggest element in array
Run one pass of selection sort to find the biggest element: n − 1 comparisons.
Find second largest element in array
First idea: form max heap and pull two off the top. Create Heap + sift: n + lg n
Find k largest elements in array
Sift again for each: n + (k − 1) lg n
Natural approach: Hold a Tournament
Comparisons are like playing games in a tournament. You need to eliminate n − 1 teams to find the best, so you must run at least n − 1 comparisons; this gives us a lower bound. (In a double-elimination tournament, a team must lose twice to be eliminated; the superior team never loses, and all other teams lose at most twice.) If you lose to someone and they then lose to someone else, you are no longer a candidate for second best, so no extra comparison is needed for you. If you keep track of who lost to whom, the only candidates for second place are the teams that lost directly to the winner; there are at most lg n of them (the height of the tournament bracket), so finding the second best costs lg n − 1 more comparisons. Running the tournament costs n − 1 comparisons to find the best player, then another lg n − 1 to find the second best, giving about n + lg n comparisons.
Finding Best and Worst
Run the tournament to find the best; then take the n/2 teams that lost their first-round match (the worst must be among them) and run a reverse tournament on them. That takes n/2 − 1 further comparisons to find the worst team. Total: $(n - 1) + \left(\frac{n}{2} - 1\right) = n + \frac{n}{2} - 2 \approx \frac{3}{2}n$ comparisons.
16 Counting Sort
Say you have a list of n integers, with duplicates, in the range 0, ..., k − 1 and you want to sort them from A into B. Simply count the number of occurrences of each value, and dole them out in order.
for i = 0 to k − 1 do
    C[i] ← 0
end
for j = 1 to n do
    C[A[j]] ← C[A[j]] + 1
end
t ← 0
for i = 0 to k − 1 do
    for j = 1 to C[i] do
        t ← t + 1
        B[t] ← i
    end
end
Algorithm 10: Counting Sort
The running time is the number of numbers plus the range of the numbers: Θ(n + k). If you're sorting a lot of numbers in a small range, the n term dominates; if it's a large range with sparse numbers, the range dominates.
An alternative method is to form the partial sums of C. Where each value in C is the number of elements less than or
equal to the index. Then iterate backwards through the original array. When you encounter an element in A, lookup
that index in C, then assign an element into B, then decrement the value in C. This has the added value of treating
each element as a unique entity, instead of as simply an integer. So you could do this on comparable objects.
for i = 0 to k − 1 do
    C[i] ← 0
end
for j = 1 to n do
    C[A[j]] ← C[A[j]] + 1
end
for i = 1 to k − 1 do
    C[i] ← C[i] + C[i − 1]
end
for j = n downto 1 do
    B[C[A[j]]] ← A[j]
    C[A[j]] ← C[A[j]] − 1
end
Algorithm 11: Partial Sum Counting Sort
Running time Θ(n + k).
Now the C array holds, at each index, the number of elements less than or equal to that index.
Why do we go through A backwards? To maintain stability.
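Here is the partial-sum variant in Python (0-indexed, names mine), which keeps the sort stable:

```python
def counting_sort(A, k):
    """Stable counting sort of integers in range(0, k)."""
    C = [0] * k
    for x in A:
        C[x] += 1
    for i in range(1, k):
        C[i] += C[i - 1]          # C[i] = number of elements <= i
    B = [0] * len(A)
    for x in reversed(A):         # walk backwards to preserve stability
        C[x] -= 1                 # 0-indexed output slot
        B[C[x]] = x
    return B

print(counting_sort([4, 1, 3, 4, 0, 2, 1], 5))   # [0, 1, 1, 2, 3, 4, 4]
```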
17 Radix Sort
Sort on the least significant digit, then the second least significant digit, and so on up to the most significant digit. This only works if a stable sort is used for each intermediate pass over a digit. (Proved by example in lecture; proved by induction on the intermediate sorts.)
Parameters of the lecture example:
Size (number of values)  n  = 10
Radix (base)             r  = 9
Digits per value         d  = 3
Running Time: Θ(d(n + r))
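A short Python sketch of least-significant-digit radix sort (base and helper names are mine); each pass distributes the values stably into per-digit buckets, equivalent in effect to a counting-sort pass.

```python
def radix_sort(A, radix=10):
    """Sort non-negative integers by repeated stable per-digit passes."""
    if not A:
        return A
    place = 1
    while place <= max(A):
        buckets = [[] for _ in range(radix)]
        for x in A:                          # stable: equal digits keep their order
            buckets[(x // place) % radix].append(x)
        A = [x for b in buckets for x in b]
        place *= radix
    return A

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))  # [2, 24, 45, 66, 75, 90, 170, 802]
```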
Analysis of running time
First, let's decide which radix is best for generic input. Introduce a new parameter, the size of the value range s; then s, d, r are related by s = r^d. If we are given 6-digit numbers in base 10, s = 10^6. Taking logs we get d = log_r s, so the running time becomes Θ((log_r s)(n + r)). Taking out the r is trickier: we optimize by finding the minimum with respect to r, taking the derivative and setting it to zero.
$$\left[\frac{\ln s}{\ln r}\,(n + r)\right]'
= (\ln s)\,\frac{\ln r - \frac{1}{r}(n + r)}{(\ln r)^2}
= (\ln s)\,\frac{\ln r - \frac{n}{r} - 1}{(\ln r)^2} = 0$$
$$\ln r = \frac{n}{r} + 1 \approx \frac{n}{r}
\;\Rightarrow\; r \approx \frac{n}{\ln r}
\;\Rightarrow\; r \approx \frac{n}{\ln\!\left(\frac{n}{\ln n}\right)}
= \frac{n}{\ln n - \ln\ln n}
\approx \frac{n}{\ln n}$$
So, surprisingly, the best radix (and hence the running time) depends essentially on the input size and not on the range of values:
$$\Theta\!\left(\frac{\ln s}{\ln\frac{n}{\ln n}}\left(n + \frac{n}{\ln n}\right)\right)
= \Theta\!\left(\frac{n\ln s}{\ln n}\right)$$
But that's not a friendly number, so in real life pick the closest power of 2: it's nicest for the computer, and a digit then looks like a group of bits, so each pass sorts on the next group of bits. We want to use radix sort when its running time is less than quicksort's:
$$\Theta\!\left(\frac{n\log s}{\log n}\right) < \Theta(n\log n)
\;\Longleftrightarrow\; \Theta(\log s) < \Theta\bigl(\log^2 n\bigr)
\;\Longleftrightarrow\; \log s < \Theta(\log n)\cdot\log n = \log n^{\Theta(\log n)}
\;\Longleftrightarrow\; s < n^{\Theta(\log n)}$$
So radix sort is theoretically better when the range is relatively small. But for 1 million numbers, that still allows a very large range, about $1{,}000{,}000^{20}$. Radix sort is space-wise inefficient, but has spatial locality.
18 Bucket Sort
Assume you have a uniformly distributed array of real numbers between 0 and 1. Create n buckets.
Clear B
for i = 1 to n do
    Put A[i] into bucket B[⌊n · A[i]⌋]
end
Sort each bucket
Concatenate the buckets
On average there will be 1 element per bucket. If you can rely on the buckets in the worst case not being very large
you can use a quadratic sort within the bucket and still get a linear running time.
You can even apply this to non-uniform distributions. Assume a normal distribution: simply vary the bucket widths relative to the mean, with buckets closer to the mean covering a smaller range and buckets farther from the mean on either side covering a larger one.
Bucket sort is also space-wise inefficient, but has spatial locality.
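A minimal Python sketch of the uniform case (names mine):

```python
def bucket_sort(A):
    """Sort reals in [0, 1), assuming a roughly uniform distribution."""
    n = len(A)
    B = [[] for _ in range(n)]          # one bucket per element on average
    for x in A:
        B[int(n * x)].append(x)
    out = []
    for bucket in B:
        out.extend(sorted(bucket))      # tiny buckets, so any sort is fine here
    return out

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]))
```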
19 Selection
How do we select the kth smallest element from a group of numbers? How do we select the median? (With a heap, as in the previous section, the median would cost about n + (n/2) lg n.)
Function SELECT(A, k, p, r) is
    Find an approximate median, the qth smallest element
    Partition based on the qth smallest element
    if k < q − p + 1 then
        return SELECT(A, k, p, q − 1)
    else if k > q − p + 1 then
        return SELECT(A, k − (q − p + 1), q + 1, r)
    else
        // k = q − p + 1
        return q, A[q]
    end
end
Take the list and find some approximate median, then look on the left or right side depending on how k compares to
our approximate median.
Analyzing the recurrence
It sounds like selection should be a very important problem, but in real life it doesn’t come up very much.
Lower Bound
Let’s assume we know the true median, or that our approximate median is good enough.
$$T(n) = n - 1 + T\!\left(\left\lceil\frac{n-1}{2}\right\rceil\right)
\approx T\!\left(\frac{n}{2}\right) + n - 1
\approx 2n - \lg n - 1
\approx 2n$$
So if we got really lucky and got the true median we could find the kth element in linear time. As such we have a
reasonable assumption for our lower bound. You’ll never do better than 2n comparisons in a real case when you don’t
have the perfect median.
Constrained case analysis
Let's assume that we get a q at the 1/4 mark, and assume that our k is always on the larger side:
$$T(n) = T\!\left(\frac{3n}{4}\right) + n - 1
\le (n - 1) + \left(\frac34 n - 1\right) + \left(\left(\frac34\right)^2 n - 1\right) + \cdots
\le \frac{1}{1 - 3/4}\,n - c\log n
\approx 4n$$
For the general fractional partition (recursing on the (1 − r) side):
$$T(n) = T\bigl((1-r)n\bigr) + n - 1
\le (n - 1) + \bigl((1-r)n - 1\bigr) + \bigl((1-r)^2 n - 1\bigr) + \cdots
\le \frac{1}{1 - (1 - r)}\,n - c\log n
\approx \frac{1}{r}\,n$$
Absolute worst-case q:
$$T(n) = T(n-1) + n - 1
= (n-1) + (n-2) + (n-3) + \cdots
= \frac{(n-1)\,n}{2}
\approx \frac{n^2}{2}$$
Average case Analysis
Sum over all possible pivots, and assume pessimistically that you always end up on the bigger side. So we get the probabilistic analysis:
$$T(n) = n - 1 + \sum_{q=1}^{n}\frac{1}{n}\,T\bigl(\max(q - 1,\, n - q)\bigr)
\le n - 1 + \frac{2}{n}\sum_{q=n/2}^{n-1} T(q)$$
Constructive induction: guess the algorithm is linear, T(n) ≤ an.
$$T(n) \le n - 1 + \frac{2}{n}\sum_{q=n/2}^{n-1} aq
= n - 1 + \frac{2a}{n}\sum_{q=n/2}^{n-1} q
= n - 1 + \frac{2a}{n}\left[\sum_{q=1}^{n-1} q - \sum_{q=1}^{n/2 - 1} q\right]$$
$$= n - 1 + \frac{2a}{n}\left[\frac{n(n-1)}{2} - \frac{\frac{n}{2}\left(\frac{n}{2} - 1\right)}{2}\right]
= n - 1 + \frac{2a}{n}\left[\frac{n^2}{2} - \frac{n}{2} - \frac{n^2}{8} + \frac{n}{4}\right]$$
$$= an - a - \frac{an}{4} + \frac{a}{2} + n - 1
= \frac{3}{4}an - \frac{a}{2} + n - 1
= \left(\frac{3}{4}a + 1\right)n - \frac{a}{2} - 1$$
For the induction to work we need $\frac34 a + 1 \le a$, i.e. $a \ge 4$:
$$T(n) \le 4n$$
20 Finding Median explicitly during selection
Take the numbers and put them into a 2 dimensional grid with 5 rows and n/5 columns.
Computer memory is a 1-dimensional array, so representing a 2-dimensional array requires a mapping trick: A[i, j] is laid out differently in row-major order than in column-major order. For our grid with 5 rows, column-major order stores an entry at A[5(i − 1) + j], where i indexes the column and j the row.
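A tiny Python sketch of that column-major mapping (the names and the 5 × 3 example grid are illustrative):

ROWS = 5

def colmajor_index(i, j):
    # 1-based position of column i, row j in the flat array, with 5 rows per column.
    return ROWS * (i - 1) + j

flat = list(range(1, 16))                 # 15 items = a 5 x 3 grid stored column by column
print(flat[colmajor_index(2, 4) - 1])     # the entry in column 2, row 4, which is 9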
Without using very specific pseudocode, let’s look theoretically at how something like this algorithm might work at a
very high level.
Put the items into a 5 × n/5 grid
Bubble sort each column to find its median    // 10 comparisons per column of 5, times n/5 columns = 2n comparisons
Find the median of these medians recursively    // a smaller subset of elements, so it should take T(n/5) comparisons
Move columns with small medians left and columns with large medians right    // this puts the median of medians in the middle element
Partition using the median of medians    // partitioning takes n − 1 comparisons
Recursively select on the correct side, knowing what we know about the median of medians    // takes T(7n/10) comparisons
[Diagram: the 5 × n/5 grid with the median of medians M in the center cell.]
Worst Case analysis
About 25% of the elements are guaranteed to be smaller, and about 25% are guaranteed to be bigger, than the median of medians; the exact count is $\frac{3}{10}n$ on each side. This gives an upper bound of $\frac{7}{10}n$ on the size of the recursive call after partitioning, which is the remaining number of elements to recurse over.
$$T(n) \leq 2n + T\!\left(\frac{n}{5}\right) + T\!\left(\frac{7}{10}n\right) + n - 1 = T\!\left(\frac{n}{5}\right) + T\!\left(\frac{7}{10}n\right) + 3n - 1$$
Constructive induction: guess $T(n) \leq an$.
$$\leq a\frac{n}{5} + a\frac{7n}{10} + 3n - 1 = \left(\frac{9}{10}a + 3\right)n - 1$$
We need this to be at most $an$, i.e. $\frac{9}{10}a + 3 \leq a$, which gives $a \geq 30$. Therefore
$$T(n) \leq 30n$$
The median is doable in linear time, but 30 is a very large coefficient. Compare it to sorting with Quicksort at $n \lg n$: $n$ must be at least $2^{30}$ before our linear median selection is better.
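A compact Python sketch of the whole median-of-medians scheme described above, assuming distinct keys and a 1-based k; it sorts each group of 5 with Python's built-in sort instead of bubble sort, and builds new lists instead of partitioning in place, which are conveniences rather than the version in the notes.

def mom_select(a, k):
    # Select the kth smallest using the median of medians as the pivot.
    if len(a) <= 5:
        return sorted(a)[k - 1]
    columns = [a[i:i + 5] for i in range(0, len(a), 5)]          # the 5 x n/5 grid
    medians = [sorted(col)[len(col) // 2] for col in columns]    # median of each column
    pivot = mom_select(medians, (len(medians) + 1) // 2)         # median of medians, T(n/5)
    smaller = [x for x in a if x < pivot]                        # partition: n - 1 comparisons
    larger  = [x for x in a if x > pivot]
    if k <= len(smaller):
        return mom_select(smaller, k)                            # at most 7n/10 elements remain
    elif k > len(a) - len(larger):
        return mom_select(larger, k - (len(a) - len(larger)))
    else:
        return pivot

data = list(range(100, 0, -1))
print(mom_select(data, 50))                                      # the 50th smallest, 50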
Theory
Some researchers claim to have shown that the upper bound on median selection is about $3n$, so
$$2n < T(n) < 3n,$$
with refinements of the form $(2 + \varepsilon)n$ on the lower side and $(3 - \delta)n$ on the upper side; perhaps the truth is $T(n) \approx \frac{5}{2}n$.
21 Graph Algorithms
Graph description.
For an undirected graph, the adjacency-list representation is a bit awkward because it double counts shared edges: altering an edge at one vertex requires altering it at its partner too, which means traversing both lists. One awkward way to alleviate this is to keep extra pointers between the list elements of shared edges.
Memory Usage
Matrix, where n = number of vertices: memory usage = $\Theta(n^2)$.
List, where m = number of edges: memory usage = $\Theta(m + n)$.
How big can m be? m can be as large as about $n^2$, but will probably be smaller.
You could also use a bit matrix of size $\frac{n^2}{\text{word size}}$ words.
Edge Lookup
List: Θ(n) Matrix: Θ(1)
List all Edges
List: go down each index, then traverse each list; amortized over the whole pass you visit all n list heads and all edge entries, $\Theta(m + n)$.
Matrix: go across each row, looking at all the 0s even when you don't care about them, $\Theta(n^2)$.
Takeaways
Typically we want to use the adjacency-list representation, because it is linear in the average case.
Changing the number of nodes
Adding Nodes
Pad with dummy entries that you can fill later, up to the next power of 2. If you have 11 nodes, build a matrix with 16 columns and just read the first 11; when a new node is added, use the dummy space. When you reach 17 nodes you need to copy the old matrix into a new one of size 32.
Removing Nodes
Shrink the array in a way similar to adding a node.
Depth First Search
Function DFS(u) is
    Mark u as visited
    for all v such that (u, v) is an edge and v has not been visited do
        DFS(v)
    end
end
Adjacency Matrix: $O(n^2)$. Adjacency List: $O(n + m)$.
Applications
Identify Connected Components.
Breadth First Search
While depth first search uses a recursive call, or a stack (which are naturally similar in modern machines) BFS uses a
queue, which while similar in order analysis for most things is not naturally represented in a stack based machine.
Does have some nice applications, like shortest distance algorithms.
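A minimal Python sketch of BFS computing shortest distances (in number of edges) from a source vertex, assuming the graph is given as an adjacency list (a dict mapping each vertex to its neighbors); the example graph is illustrative.

from collections import deque

def bfs_distances(adj, source):
    # Expand outward one edge at a time, using a FIFO queue instead of DFS's stack.
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:              # not yet visited
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(bfs_distances(adj, 1))               # {1: 0, 2: 1, 3: 1, 4: 2, 5: 3}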
Identifying Connected Components
A graph is connected if there is a path from every vertex to every other vertex in the graph.
A connected component is a maximal connected subgraph.
Function Component(G) is
    for all x ∈ V[G] do
        visited[x] ← False
    end
    t ← 0
    for all x ∈ V[G] do
        if not visited[x] then
            t ← t + 1
            Compvisit(x)
        end
    end
end

Function Compvisit(x) is
    visited[x] ← True
    CompNum[x] ← t
    for all y ∈ adj[x] do
        if not visited[y] then
            Compvisit(y)
        end
    end
end
Algorithm 12: Connected Component
Adjacency Matrix: $O(n^2)$. Adjacency List: $O(n + m)$.
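A minimal Python sketch of Algorithm 12 for an undirected graph given as an adjacency list; the names mirror the pseudocode, and the example graph is illustrative.

def components(adj):
    # Label each vertex with its component number via depth-first search.
    visited = {x: False for x in adj}
    comp_num = {}

    def compvisit(x, t):
        visited[x] = True
        comp_num[x] = t
        for y in adj[x]:
            if not visited[y]:
                compvisit(y, t)

    t = 0
    for x in adj:
        if not visited[x]:
            t += 1                         # found a new component
            compvisit(x, t)
    return comp_num

adj = {1: [2], 2: [1], 3: [4, 5], 4: [3], 5: [3], 6: []}
print(components(adj))                     # {1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3}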
Bridges of Königsberg
Theorem 21.1 (Eulerian Cycle, 1736). An undirected graph G = (V, E) has an Eulerian cycle if and only if it is connected and every vertex has even degree.
22 Shortest Path Algorithms
Types
Single source, Single sink.
All Pairs.
Single source.
It’s very hard to solve the Single source, single sink problem without also solving the single source problem for every
node along the way. Same is true for All pairs.
Dijkstra’s Algorithm
for all x ∈ V[G] do    // Θ(n)
    D[x] ← ∞
    Π[x] ← NIL
end
Q ← V[G]    // Q will hold the vertices which haven't been checked off yet
D[a] ← 0    // a is the source
while Q ≠ ∅ do    // Θ(n) iterations
    x ← Pop-minimum(Q)
    for all y adjacent to x in Q do    // depends on implementation; with an adjacency matrix, checkable in Θ(n)
        if D[x] + W[x, y] < D[y] then    // W[x, y] is the weight of edge (x, y)
            D[y] ← D[x] + W[x, y]
            Π[y] ← x
        end
    end
end
This bears a similarity to breadth-first search. From the source you reach the closest vertices first, forming an expanding shell of visited nodes with known shortest distance. We have connections to vertices outside the shell, with known tentative distances, and since we always pick the smallest such distance, this BFS-like process guarantees that we have determined the shortest path to that vertex.
This only works when you have non-negative weight edges.
Correctness Proof
Base case: nothing is in the circle except the source vertex, with distance 0.
Inductive step: remove the vertex with the smallest tentative distance from consideration and update its neighbors if going through it is smaller.
Since you know the shortest path to everything in the circle, the smallest tentative distance leading out of the circle is guaranteed to be the shortest path to that vertex.
Implementation details
Create Q as an array that holds 1 for vertices that haven't been eliminated yet and 0 otherwise. The naive approach is to scan Q looking for 1s, look up the distances in D, and track the smallest value. Implemented with an adjacency matrix, the whole thing is done in $\Theta(n^2)$.
The alternative is to use a priority queue / heap for Q, removing the smallest value. This necessitates some extra heap functionality to grab an arbitrary node in the heap and change its key; that lets you pop from the heap in logarithmic time. There is an additional requirement to visit the $|adj(x)|$ neighboring vertices and update them with this heap manipulation function, at an additional $\Theta(\lg n)$ each. Since Dijkstra does this over every edge, the running time is $\Theta(m \cdot \lg n)$.
So if your graph is dense it’s better to use the Adjacency Matrix algorithm, which is quadratic in the number of
vertices. If your graph is sparse it’s better to use the m log n algorithm with a priority queue.
Of course in reality the quadratic, Adj. Matrix algorithm is usually better at the low level because of spatial locality.
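A minimal Python sketch of Dijkstra's algorithm using the standard-library heapq module. Instead of a decrease-key operation it pushes duplicate heap entries and skips stale ones when popped, which is a common workaround rather than the heap variant described above; the example graph is illustrative.

import heapq

def dijkstra(adj, source):
    # adj maps each vertex to a list of (neighbor, weight) pairs; weights must be non-negative.
    dist = {v: float('inf') for v in adj}
    pred = {v: None for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)         # pop the smallest tentative distance
        if d > dist[x]:
            continue                       # stale entry; x was already finalized
        for y, w in adj[x]:                # relax every edge (x, y)
            if dist[x] + w < dist[y]:
                dist[y] = dist[x] + w
                pred[y] = x
                heapq.heappush(heap, (dist[y], y))
    return dist, pred

adj = {'a': [('b', 4), ('c', 1)], 'b': [('d', 1)], 'c': [('b', 2), ('d', 5)], 'd': []}
print(dijkstra(adj, 'a')[0])               # {'a': 0, 'b': 3, 'c': 1, 'd': 4}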
23 NP-Completeness
The goal is to separate problems that are easy to solve from those that are hard to solve. Loosely speaking, easy means polynomial time and hard means exponential time.
These are not perfect definitions, but in spite of that they are still okay.
Nature is kind to us: natural problems that are polynomial time tend to have low degree. This is not a theorem, just an empirical statement.
We don't really know what a computer is. Real-world computers have a memory hierarchy and variable running times, and there is concurrency as well. Still, all models are equivalent up to polynomial time. One possible exception is quantum computers, but that has not really been demonstrated in practice.
Polynomials are closed under addition, multiplication, and composition.
Decision Problem Theory
Decision problems are yes/no questions.
For example: does a graph have an Eulerian cycle? But not: what is the Eulerian cycle?
Definition 23.1 (P). Decision problems solvable in polynomial time.
Definition 23.2 (NP). Decision problems verifiable in polynomial time with a certificate.
Problems in the class NP have the dichotomy that if the answer is yes you can demonstrate it in polynomial time, but
many of them are not able to be disproved in polynomial time. When the answer is yes it’s easy to prove, otherwise
it’s very hard.
The certificate is usually the solution to the problem. An example would be the coloring problem from the homework.
Definition 23.3 (SAT). Satisfiability problem: given a Boolean formula, is there a way of setting the variables to true and false such that the formula is true?
If it is satisfiable, that can be verified in polynomial time: the certificate is the assignment. If it isn't satisfiable, the only obvious way to show that is to try every possible assignment, which takes exponential time.
The Class of NP problems
Is it the case that whatever can be verified in polynomial time can also be solved in polynomial time? Does P = NP? Or more directly, are the classes P and NP-complete distinct? No one knows.
NP-Complete: The class of problems that are all equivalent to one another under reduction. Formula-SAT, circuit-SAT
and coloring are all the same problem, really. They can be reduced to one another.
Most things in real life are in the class NP. Empirical statement: for problems occurring in nature, once a problem is in the class NP, it is either in P or NP-complete. There is a theorem that says that if P ≠ NP then there is an infinite class of problems strictly between them.
Reduction
Having proved that a problem is NP complete, how do we prove other problems are also NP complete?
Theorem 23.1. Circuit SAT is NP-complete.
Proof.
- Show the problem is in NP.
- Show it is hard for NP.
- That is, show that Circuit-SAT is at least as hard as some other problem which we already know is NP-complete.
Formula-SAT $\leq_p$ Circuit-SAT and Circuit-SAT $\leq_p$ Formula-SAT.
Converting from circuit to formula is harder than the other way, but the two are still verifiable in the same class of running time.
Traveling Salesman Problem
Eulerian cycles require that you traverse every edge exactly once, while Hamiltonian cycles require that you visit every vertex exactly once. This is the crux of the traveling salesman problem; Hamiltonian cycles are NP-complete.
The Traveling Salesman Problem is the problem of finding the shortest path that hits all the nodes in a graph. Because it isn't a yes/no question, the technically correct way of formulating it is to ask whether there is a path with length less than some target length.
This is a generalization of the Hamiltonian cycle problem, which is NP-complete; therefore the traveling salesman problem is NP-complete.
Theory
The Eulerian cycle problem had generalizable characteristics that allowed mathematicians to prove theorems about it, but the Hamiltonian cycle problem doesn't have any such characteristics, and it wasn't until the advent of NP-completeness theory that this was understood.
Problems between P and NP-Complete
Factoring
Discrete Logarithm
Graph Isomorphism
Their exact location isn’t known.
Exam Materials
Must prove a problem is in class NP by stating what the cert is and showing how to use the cert to verify.
No Reductions.
Show that you can use the complexity class of the decision version of a question to derive the class of its optimization (search) version. If you can determine the yes/no answer of whether a formula is satisfiable, you can determine the certificate for that formula in polynomial time.
Formula SAT
Assume Decision version of Formula SAT is in P. We want to find the assignment.
Given a formula A with variables $x_1, x_2, \ldots, x_n$:
if not SAT-decidability(A) then
    return False
end
for i = 1 to n do
    B ← A
    Assign True to x_i in A    // every time you see x_i in the formula, substitute True
    Simplify to remove True
    if not SAT-decidability(A) then
        A ← B
        Assign False to x_i in A and simplify
    end
end
return the assignment X = (x_1, ..., x_n)
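A minimal Python sketch of this self-reduction on CNF formulas. The brute-force sat_decide is only a stand-in for the assumed polynomial-time decider, and instead of explicitly simplifying the formula it simply re-queries the decider with more variables fixed, which has the same effect; all names are illustrative.

from itertools import product

def sat_decide(clauses, n, fixed):
    # Brute-force stand-in for the assumed polynomial-time SAT decider.
    # clauses: list of clauses, each a list of non-zero ints (+i means x_i, -i means not x_i).
    free = [i for i in range(1, n + 1) if i not in fixed]
    for bits in product([False, True], repeat=len(free)):
        val = {**fixed, **dict(zip(free, bits))}
        if all(any(val[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses):
            return True
    return False

def find_assignment(clauses, n):
    # Turn yes/no answers into an actual satisfying assignment, one variable at a time.
    if not sat_decide(clauses, n, {}):
        return False
    fixed = {}
    for i in range(1, n + 1):
        fixed[i] = True                      # tentatively set x_i = True
        if not sat_decide(clauses, n, fixed):
            fixed[i] = False                 # otherwise x_i = False must still be satisfiable
    return fixed

# (x1 or x2) and (not x1 or x3) and (not x3)
print(find_assignment([[1, 2], [-1, 3], [-3]], 3))    # {1: False, 2: True, 3: False}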
Graph Coloring Problem
Given an integer k, is a graph k-colorable?
That's the decision problem. You can use it to solve the real question of what the smallest number k is such that the graph is k-colorable, and then to find an actual coloring.
c ← 1
while not Colorable(G, c) do
    c ← c + 1
end
for i = 1 to n do
    for d = 1 to c do
        Force vertex i to have color d, building G′ as described below
        if Colorable(G′, c) then
            Keep color d for vertex i and exit the inner loop
        end
    end
end
The idea is to assign colors to the graph's nodes one at a time and then ask the routine whether the graph is still c-colorable. But assigning colors isn't allowed in our routine, so we need to use a trick: make a complete graph of c vertices, one per color, then connect our goal vertex in the original graph G to every one of those vertices except the one for the color we want. This forces any valid coloring to give that vertex the desired color.
Size of $G'$: $|G| + \binom{c}{2} + (c - 1)n$.
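A small Python sketch of the gadget, with a brute-force is_colorable standing in for the assumed polynomial-time decision routine; representing edges as a set of frozensets and the example triangle graph are my own choices.

from itertools import product

def is_colorable(vertices, edges, c):
    # Brute-force stand-in for the assumed polynomial-time routine Colorable(G, c).
    vs = list(vertices)
    for colors in product(range(c), repeat=len(vs)):
        col = dict(zip(vs, colors))
        if all(col[u] != col[v] for u, v in (tuple(e) for e in edges)):
            return True
    return False

def force_color(vertices, edges, v, d, c):
    # Add a complete graph on c color vertices and connect v to all of them except color d.
    ks = [('color', j) for j in range(c)]
    new_edges = set(edges)
    new_edges |= {frozenset((ks[a], ks[b])) for a in range(c) for b in range(a + 1, c)}
    new_edges |= {frozenset((v, ks[j])) for j in range(c) if j != d}
    return list(vertices) + ks, new_edges

V = [1, 2, 3]
E = {frozenset((1, 2)), frozenset((2, 3)), frozenset((1, 3))}    # a triangle needs 3 colors
V2, E2 = force_color(V, E, 1, 0, 3)
print(is_colorable(V, E, 3), is_colorable(V2, E2, 3))            # True True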