
IP University - ADA - Unit 3 Notes

This post gives a quick review of the third unit of the syllabus for the Algorithm Design and Analysis (ADA) subject at IP University. It is a continuation of the second post on the subject.

Greedy Algorithms

A greedy algorithm, as the name suggests, always makes the choice that seems to be the best at that moment. This means that it makes a locally-optimal choice in the hope that this choice will lead to a globally-optimal solution.

How do you decide which choice is optimal?

Assume that you have an objective function that needs to be optimized (either maximized or minimized) at a given point. A greedy algorithm makes a greedy choice at each step to ensure that the objective function is optimized. A greedy algorithm has only one shot at computing the optimal solution: it never goes back and reverses a decision.

Some advantages and disadvantages:

  • it is easy to come up with a greedy algorithm (or even multiple greedy algorithms) for a problem.
  • analyzing the running time of a greedy algorithm is generally much easier.
  • you have to work much harder to prove correctness.
  • most candidate greedy algorithms are not correct.

Where to use Greedy algorithms?

A problem must comprise these two components for a greedy algorithm to work:

  1. It has optimal substructure. The optimal solution for the problem contains optimal solutions to the sub-problems.
  2. It has the greedy-choice property (which is hard to prove!). If you make the choice that seems best at the moment and solve the remaining sub-problems later, you still reach an optimal solution. You never have to reconsider your earlier choices.

Matroids

Matroids are combinatorial structures that generalize the notion of linear independence in matrices. There are many equivalent definitions of matroids; we will use one that focuses on independent sets. A matroid \(M\) is defined on a finite ground set \(E\) (or \(E(M)\) if we want to emphasize the matroid \(M\)) together with a collection of subsets of \(E\) that are said to be independent. The family of independent sets is denoted by \(I\) or \(I(M)\), and we typically refer to a matroid \(M\) by listing its ground set and its family of independent sets: \(M = (E, I)\). For \(M\) to be a matroid, \(I\) must satisfy two main axioms:

  • if \(X \subseteq Y\) and \(Y \in I\), then \(X \in I\) (the hereditary property),
  • if \(X \in I\), \(Y \in I\), and \(|Y| > |X|\), then \(\exists\, e \in Y \setminus X\) such that \(X \cup \{e\} \in I\) (the exchange property).
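Matroids and greedy algorithms are closely linked: scanning elements in decreasing weight order and keeping each one that preserves independence yields a maximum-weight independent set (assuming non-negative weights). Below is a minimal C++ sketch of this generic scheme; the independence oracle isIndependent and all names are illustrative assumptions, not code from the original post.

#include <algorithm>
#include <functional>
#include <vector>
using namespace std;

// Generic matroid greedy: scan ground-set elements 0..m-1 in decreasing
// weight order and keep each element that leaves the set independent.
// isIndependent is a placeholder oracle for the matroid at hand.
vector<int> matroidGreedy(int m, const vector<double>& weight,
                          function<bool(const vector<int>&)> isIndependent) {
    vector<int> order(m);
    for (int e = 0; e < m; e++) order[e] = e;
    sort(order.begin(), order.end(),
         [&](int a, int b) { return weight[a] > weight[b]; });
    vector<int> S; // current independent set
    for (int e : order) {
        S.push_back(e);
        if (!isIndependent(S)) S.pop_back(); // adding e breaks independence
    }
    return S;
}

For the graphic matroid (independent sets are the acyclic edge subsets of a graph), this scheme specializes to Kruskal’s algorithm, discussed later in these notes.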

Activity Selection Problem

Problem: Given a set \(A = \{A_1, A_2, \cdots, A_n\}\) of \(n\) activities with start and finish times \((s_i,f_i)\), \(1 ≤ i ≤ n\), select a maximum-size set \(S\) of “non-overlapping” activities.

Approach:

  • Sort the activities by finish time (let \(A_1, A_2, \cdots, A_n\) denote the sorted sequence).
  • Pick the first activity \(A_1\).
  • Remove all activities with start time before the finish time of \(A_1\).
  • Recursively solve the problem on the remaining activities.
#include <stdio.h>

// Prints a maximum-size set of non-overlapping activities.
// Assumes the activities are already sorted by finish time.
void printMaxActivities(int s[], int f[], int n) {
    // s[] is the array of start times
    // f[] is the array of finish times
    // n is the number of activities
    int i, j;
    printf("Following activities are selected \n");
    i = 0;             // the first activity is always selected
    printf("%d ", i);
    for (j = 1; j < n; j++) {
        // select activity j if it starts at or after the
        // finish time of the last selected activity
        if (s[j] >= f[i]) {
            printf("%d ", j);
            i = j;
        }
    }
}
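A minimal driver for the snippet above, with made-up data (already sorted by finish time, as the function requires):

int main(void) {
    // Start and finish times, sorted by finish time
    int s[] = {1, 3, 0, 5, 8, 5};
    int f[] = {2, 4, 6, 7, 9, 9};
    int n = sizeof(s) / sizeof(s[0]);
    printMaxActivities(s, f, n); // prints: 0 1 3 4
    return 0;
}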

The time complexity of this algorithm is \(O(n \log n)\) for the sorting, and \(O(n)\) for the selection.

Fractional Knapsack Problem

Problem: Given weights and values of \(n\) items, we need to put these items in a knapsack of capacity \(W\) to get the maximum total value in the knapsack. We are allowed to take fractional parts of items.

Approach: The basic idea of the greedy approach is to calculate the ratio value/weight for each item and sort the items in decreasing order of this ratio. Then take items in this order, adding each one whole while it fits; when the next item no longer fits as a whole, add as large a fraction of it as the remaining capacity allows. This always yields an optimal solution to the fractional problem.

#include <algorithm>
#include <iostream>
using namespace std;

// Assumed item layout (the original snippet leaves it implicit).
struct Item {
    int value;
    int weight;
};

// Sort by decreasing value/weight ratio.
bool cmp(struct Item a, struct Item b) {
    double r1 = (double) a.value / a.weight;
    double r2 = (double) b.value / b.weight;
    return r1 > r2;
}

double fractionalKnapsack(int W, struct Item arr[], int n) {
    sort(arr, arr + n, cmp);
    int curWeight = 0;          // weight packed so far
    double finalvalue = 0.0;    // value packed so far
    for (int i = 0; i < n; i++) {
        if (curWeight + arr[i].weight <= W) {
            // the whole item fits
            curWeight += arr[i].weight;
            finalvalue += arr[i].value;
        } else {
            // take the fraction of the item that still fits
            int remain = W - curWeight;
            finalvalue += arr[i].value *
                    ((double) remain / arr[i].weight);
            break;
        }
    }
    return finalvalue;
}
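A minimal driver for the snippet above, using assumed sample values (not from the original post):

int main() {
    // With W = 50 the greedy takes the first two items whole
    // (60 + 100) and two-thirds of the third (80), totalling 240.
    Item arr[] = {{60, 10}, {100, 20}, {120, 30}};
    int n = sizeof(arr) / sizeof(arr[0]);
    cout << fractionalKnapsack(50, arr, n) << endl; // prints 240
    return 0;
}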

The time complexity of this algorithm is \(O(n \log n)\) for the sorting, and \(O(n)\) for the selection.

Huffman Encoding

Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters, where the lengths of the assigned codes are based on the frequencies of the corresponding characters. The most frequent character gets the smallest code and the least frequent character gets the largest code.

The variable-length codes assigned to input characters are prefix codes, meaning the codes (bit sequences) are assigned in such a way that the code assigned to one character is never a prefix of the code assigned to any other character. For example, \(\{0, 10, 110, 111\}\) is a prefix code, while \(\{0, 01\}\) is not, since 0 is a prefix of 01. This is how Huffman coding makes sure that there is no ambiguity when decoding the generated bit stream.

Building the Huffman Tree

The input is an array of unique characters along with their frequencies of occurrence, and the output is the Huffman tree.

  1. Create a leaf node for each unique character and build a min-heap of all leaf nodes. (The min-heap is used as a priority queue; the value of the frequency field is used to compare two nodes. Initially, the least frequent character is at the root.)
  2. Extract the two nodes with the minimum frequency from the min-heap.
  3. Create a new internal node with frequency equal to the sum of the two nodes' frequencies. Make the first extracted node its left child and the other extracted node its right child. Add this node to the min-heap.
  4. Repeat (2) and (3) until the heap contains only one node. The remaining node is the root node and the tree is complete.

Printing codes from Huffman Tree:

Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving to the left child, write 0 to the array. While moving to the right child, write 1 to the array. Print the array when a leaf node is encountered.
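A minimal C++ sketch of this traversal; the node layout (HuffNode with ch, left, and right fields) is an assumption for illustration:

#include <iostream>
#include <string>
using namespace std;

// Assumed node layout: leaves carry a character, internal nodes do not.
struct HuffNode {
    char ch;
    HuffNode *left;
    HuffNode *right;
};

// Append '0' when going left and '1' when going right;
// print the accumulated code on reaching a leaf.
void printCodes(const HuffNode* node, string code) {
    if (node == nullptr) return;
    if (node->left == nullptr && node->right == nullptr) {
        cout << node->ch << ": " << code << "\n";
        return;
    }
    printCodes(node->left, code + "0");
    printCodes(node->right, code + "1");
}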

// C is the set of n characters and related information
Huffman(C):
    n = C.size
    Q = min_priority_queue()       // ordered by node frequency
    for i = 1 to n
        leaf = node(C[i])          // one leaf node per character
        Q.push(leaf)
    end for
    while Q.size() is not equal to 1
        Z = new node()
        x = Q.pop()                // least frequent node
        Z.left = x
        y = Q.pop()                // next least frequent node
        Z.right = y
        Z.frequency = x.frequency + y.frequency
        Q.push(Z)
    end while
    return Q.pop()                 // the root of the Huffman tree

The time complexity is \(O(n \log n)\), where \(n\) is the number of unique characters. If there are \(n\) leaves, pop() is called \(2(n-1)\) times, and pop() takes \(O(\log n)\) time for a priority queue, so the overall complexity is \(O(n \log n)\).

Job Scheduling Problem

Problem: Given an array of jobs where every job has a deadline and an associated profit that is earned if the job is finished before its deadline, and every job takes a single unit of time, maximize the total profit if only one job can be scheduled at a time.

Approach: The basic idea of the greedy approach is to sort the jobs in order of decreasing profit. Then for each job, starting with the most profitable:

  • if a free slot exists for it before the deadline, assign the job the free slot closest to the deadline (see the sketch below for the assumed Job struct).
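The snippet below relies on a Job record and a comparison function that the original leaves implicit; here is one plausible sketch of them, with field names chosen to match the code that follows:

#include <algorithm>
#include <iostream>
using namespace std;

// Assumed job record: an id for printing, a 1-based deadline slot,
// and the profit earned if the job finishes by its deadline.
struct Job {
    char id;
    int dead;
    int profit;
};

// Order jobs by decreasing profit for the greedy scan.
bool comparison(Job a, Job b) {
    return a.profit > b.profit;
}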
void printJobScheduling(Job arr[], int n) {
    sort(arr, arr + n, comparison); // decreasing profit
    int result[n]; // result[j] holds the index of the job in slot j
    bool slot[n]; // to check availability of a slot
    for (int i = 0; i < n; i++)
        slot[i] = false; // initially all slots empty
    for (int i = 0; i < n; i++) {
        for (int j = min(n, arr[i].dead) - 1; j >= 0; j--) {
            // starting at deadline and moving back
            if (slot[j] == false) {
                // if empty slot found
                result[j] = i;
                slot[j] = true;
                break;
            }
        }
    }
    // printing result
    for (int i = 0; i < n; i++)
        if (slot[i])
            cout << arr[result[i]].id << " ";
}

The time complexity of this algorithm is \(O(n^2)\).

Minimum Spanning Tree

Given an undirected and connected graph \(G=(V,E)\), a spanning tree of the graph \(G\) is a tree that spans \(G\) (that is, it includes every vertex of \(G\)) and is a subgraph of \(G\) (every edge in the tree belongs to \(G\)).

The cost of a spanning tree is the sum of the weights of all the edges in the tree. A minimum spanning tree is a spanning tree whose cost is minimum among all the spanning trees. There can be many minimum spanning trees.


If the spanning tree is formally defined as \(S=(V',E')\), then \(|V'|=|V|\) and \(|E'|=|V'|-1\). Also, the number of spanning trees equals \(^{|E|}C_{|V|-1}\) minus the number of \((|V|-1)\)-edge subsets that contain a cycle.

There are two famous algorithms for finding the Minimum Spanning Tree:

Kruskal’s Algorithm

Kruskal’s algorithm builds the spanning tree by adding edges one by one into a growing spanning forest. It follows the greedy approach: in each iteration it finds the least-weight remaining edge and adds it to the growing forest if doing so creates no cycle.

Algorithm Steps:
  • Sort the graph edges with respect to their weights.
  • Add edges to the MST starting from the edge with the smallest weight, up to the edge with the largest weight.
  • Only add edges which don’t form a cycle, i.e., edges which connect otherwise disconnected components.

To check whether two vertices are already connected, use the disjoint-set data structure, sketched below.
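A minimal union-find sketch with path compression and union by rank, which is one standard way to implement the MAKE-SET, FIND-SET, and UNION operations used in the pseudocode below:

#include <numeric>
#include <utility>
#include <vector>
using namespace std;

// Disjoint-set (union-find) with path compression and union by rank.
struct DSU {
    vector<int> parent, rank_;
    DSU(int n) : parent(n), rank_(n, 0) {
        iota(parent.begin(), parent.end(), 0); // each vertex is its own set
    }
    int find(int x) {             // FIND-SET with path compression
        return parent[x] == x ? x : parent[x] = find(parent[x]);
    }
    bool unite(int x, int y) {    // UNION by rank; false if already joined
        x = find(x); y = find(y);
        if (x == y) return false;
        if (rank_[x] < rank_[y]) swap(x, y);
        parent[y] = x;
        if (rank_[x] == rank_[y]) rank_[x]++;
        return true;
    }
};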

KRUSKAL(G):
// A is the solution set
A = ∅
foreach v ∈ G.V:
    MAKE-SET(v)
foreach (u, v) in G.E ordered by weight(u, v), increasing:
    if FIND-SET(u) ≠ FIND-SET(v):
        A = A ∪ {(u, v)}
        UNION(u, v)
return A

Kruskal’s algorithm can be shown to run in \(O(E \log E)\) time, or equivalently, \(O(E \log V)\) time.

Prim’s Algorithm

Prim’s algorithm also uses the greedy approach to find the minimum spanning tree. In Prim’s algorithm we grow the spanning tree from a starting vertex. Unlike Kruskal’s, which adds an edge in each step, Prim’s adds a vertex to the growing spanning tree.

Algorithm Steps:
  • Maintain two disjoint sets of vertices: one containing the vertices already in the growing spanning tree, and one containing the rest.
  • Select the cheapest vertex that is connected to the growing spanning tree and is not yet in it, and add it, using a priority queue.
  • Insert the vertices that are adjacent to the growing spanning tree into the priority queue.
  • Avoid cycles: mark the nodes which have already been selected and insert only unmarked nodes.
PRIMS(G):
// A is the solution set
A = ∅
U = { 1 }                      // start from an arbitrary vertex
while (U ≠ G.V)
    let (u, v) be the cheapest edge such that u ∈ U and v ∉ U
    A = A ∪ {(u, v)}
    U = U ∪ {v}
return A

Using a simple binary heap data structure, Prim’s algorithm can now be shown to run in time \(O((|E|+|V|)\log|V|)\). Using a more sophisticated Fibonacci heap, this can be brought down to \(O(|E|+|V|\log|V|)\).
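A minimal C++ sketch of the binary-heap version using std::priority_queue; since the standard queue has no decrease-key, stale entries are skipped instead (lazy deletion). The adjacency-list layout is an assumption:

#include <functional>
#include <queue>
#include <vector>
using namespace std;

// adj[u] holds (weight, neighbour) pairs; the graph is assumed
// connected and 0-indexed. Returns the total MST weight.
long long primMST(const vector<vector<pair<int, int>>>& adj) {
    int n = adj.size();
    vector<bool> inTree(n, false);
    // min-heap of (edge weight, vertex) candidates
    priority_queue<pair<int, int>, vector<pair<int, int>>, greater<>> pq;
    pq.push({0, 0}); // grow the tree from vertex 0
    long long total = 0;
    while (!pq.empty()) {
        auto [w, u] = pq.top();
        pq.pop();
        if (inTree[u]) continue; // stale entry: u was already added
        inTree[u] = true;
        total += w;
        for (auto [wt, v] : adj[u])
            if (!inTree[v]) pq.push({wt, v});
    }
    return total;
}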

Comparison

Prim’s algorithm is significantly faster on dense graphs with many more edges than vertices. Kruskal’s performs better in typical situations (sparse graphs) because it uses simpler data structures.

Single Source Shortest Path

In the single-source shortest path problem, we have to find the shortest paths from a source vertex v to all other vertices in the graph.

The most important algorithms for solving this problem are:

  1. Dijkstra’s algorithm solves the single-source shortest path problem when all edge weights are non-negative.
  2. The Bellman–Ford algorithm solves the single-source problem even if edge weights are negative, as long as there are no negative-weight cycles.

Dijkstra’s algorithm

Video explanation

function Dijkstra(Graph, source):
    // This implementation takes in a graph, represented as
    // lists of vertices and edges, and fills two arrays
    // (dist and prev) with information about the shortest path
    // from the source to each vertex
    dist[source] ← 0

    create vertex set Q
    for each vertex v in Graph:
        if v ≠ source
            dist[v] ← INFINITY                 // Unknown distance from source to v
        prev[v] ← UNDEFINED                    // Predecessor of v
        Q.add_with_priority(v, dist[v])

    while Q is not empty:                      // The main loop
        u ← Q.extract_min()                    // Remove and return best vertex
        for each neighbor v of u:              // only v that are still in Q
            alt ← dist[u] + length(u, v)
            if alt < dist[v]
                dist[v] ← alt
                prev[v] ← u
                Q.decrease_priority(v, alt)
    return dist[], prev[]

Proof of correctness - link

Dijkstra’s original algorithm does not use a min-priority queue and runs in \(O(|V|^2)\). An implementation based on a min-priority queue backed by a Fibonacci heap runs in \(O(|E|+|V|\log |V|)\).
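A minimal C++ sketch of the priority-queue variant; std::priority_queue has no decrease_priority, so stale entries are skipped instead (lazy deletion), and the adjacency-list layout is an assumption:

#include <functional>
#include <queue>
#include <vector>
using namespace std;

const long long INF = 1e18;

// adj[u] holds (weight, neighbour) pairs; weights must be non-negative.
// Returns dist[]; unreachable vertices keep dist = INF.
vector<long long> dijkstra(const vector<vector<pair<int, int>>>& adj, int source) {
    int n = adj.size();
    vector<long long> dist(n, INF);
    dist[source] = 0;
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>, greater<>> pq;
    pq.push({0, source});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;       // stale entry: skip
        for (auto [w, v] : adj[u])
            if (dist[u] + w < dist[v]) { // relax edge (u, v)
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
    }
    return dist;
}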

Bellman Ford algorithm

Like Dijkstra’s Algorithm, Bellman–Ford proceeds by relaxation, in which approximations to the correct distance are replaced by better ones until eventually reaching the solution. In both algorithms, the approximate distance to each vertex is always an overestimate of the true distance, and is replaced by the minimum of its old value and the length of a newly found path.

However, Dijkstra’s algorithm uses a priority queue to greedily select the closest vertex that has not yet been processed, and performs this relaxation process on all of its outgoing edges; by contrast, the Bellman–Ford algorithm simply relaxes all the edges, and does this \(|V|-1\) times, where \(|V|\) is the number of vertices in the graph. In each of these repetitions, the number of vertices with correctly calculated distances grows, from which it follows that eventually all vertices will have their correct distances. This method allows the Bellman–Ford algorithm to be applied to a wider class of inputs than Dijkstra’s.

Video explanation

function BellmanFord(Graph, source)
   // This implementation takes in a graph, represented as
   // lists of vertices and edges, and fills two arrays
   // (dist and prev) with information about the shortest path
   // from the source to each vertex
   for each vertex v in vertices:
       dist[v] := INFINITY
       prev[v] := UNDEFINED
   dist[source] := 0

   // relax edges repeatedly
   for i from 1 to size(vertices)-1:
       for each edge (u, v) with weight w in edges:
           if dist[u] + w < dist[v]:
               dist[v] := dist[u] + w
               prev[v] := u

   // check for negative-weight cycles
   for each edge (u, v) with weight w in edges:
       if dist[u] + w < dist[v]:
           error "Graph contains negative-weight cycle"

   return dist[], prev[]

Proof of correctness - link

The Bellman–Ford algorithm runs in \(O(|V| \cdot |E|)\).
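A minimal C++ sketch of the pseudocode above over an edge list; the Edge layout is an assumption:

#include <vector>
using namespace std;

struct Edge { int u, v, w; };
const long long INF = 1e18;

// Relax every edge |V|-1 times, then make one more pass: a further
// improvement means a negative-weight cycle is reachable from source.
// Returns false in that case, true otherwise.
bool bellmanFord(int n, const vector<Edge>& edges, int source,
                 vector<long long>& dist) {
    dist.assign(n, INF);
    dist[source] = 0;
    for (int i = 0; i < n - 1; i++)
        for (const Edge& e : edges)
            if (dist[e.u] != INF && dist[e.u] + e.w < dist[e.v])
                dist[e.v] = dist[e.u] + e.w;
    for (const Edge& e : edges)
        if (dist[e.u] != INF && dist[e.u] + e.w < dist[e.v])
            return false; // negative-weight cycle detected
    return true;
}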

This post is licensed under CC BY 4.0 by the author.