## Introduction to Kruskal’s Algorithm

Kruskal’s Algorithm Problem: Find minimum spanning tree from given weighted, undirected graph G = <V, E, W>

- Joseph Kruskal has suggested a greedy approach to find MST from given weighted, undirected graph.
- The algorithm first sorts all the edges in nondecreasing order of their weight.
- Edge with minimum weight is selected and its feasibility is tested.
- If inclusion of the edge to a partial solution does not form the cycle, then the edge is feasible and added to the partial solution.
- If is not feasible then skip it and check for the next edge. The process is repeated until all edges are scanned.
- Let, list unvisited edges U
_{E}= {e_{1}, e_{2}, e_{3}, . . . , e_{m}}, be the edges sorted by increasing order of their weight.

1. Select minimum weight edge e_{min }from input graph G, which is not already added.

2. If the addition of edge e_{min }to a partial solution does not form a cycle, add it. Otherwise look for next minimum weight edge until we get the feasible edge. Remove checked edges from E.

3. Go to step 1 and repeat the procedure until all edges in E are scanned.

## Algorithm for Kruskal’s Algorithm

Algorithm for Kruskal’s approach is shown below:

**Algorithm **KUSKAL_MST(G)
// Description : Find minimum spanning tree of graph G of n vertices
// Input : Weighted undirected graph G
// Output : Minimum spanning tree of G
MST ← Φ
**for **each v ∈ V **do**
MAKE-SET(v)
**end**
**for each **(u, v) ∈ E ordered by weight in increasing order **do**
**if **FIND-SET(u) ≠ FIND-SET(v) **then**
MST ← MST ∪ (u, v)
UNION(u, v)
**end**
**end**
**Function **MAKE-SET(x)
**if **x is not already present **then**
x.parent ← x
x.rank ← 0
**end**
Function FIND(x)
**if **x.parent ≠ x **then**
x.parent ← FIND(x.parent)
**end**
**return **x.parent
**Function **UNION(x, y)
xRoot ← FIND(x)
yRoot ← FIND(y)
**if **xRoot == yRoot **then**
**return**
**end**
**if **xRoot.rank < yRoot.rank **then**
xRoot.parent ← yRoot
**else if **yRoot.rank < xRoot.rank **then**
yRoot.parent ← xRoot
**else**
xRoot.parent ← yRoot
yRoot.rank ← yRoot.rank + 1
**end**

## Time complexity of Kruskal’s Algorithm

- Using Disjoint-set data structures, with |E| edges and |V| vertices in the graph, Kruskal’s algorithm runs in O(|E| log
_{2}|E|) or O(|E| log_{2}|V|) time, both are identical. For the dense graph, |E| is at most |V^{2}|. So,

log_{2}|E| = log_{2}|V^{2}| = 2log_{2}|V| ≈ log_{2}|V| - Sort edges using comparison based algorithm, which could sort it in O(|E|log
_{2}|E|) time. Selection and feasibility checking of the edge is done in O(|E|) time. Thus total running time is given as

O(|E| + |E| log |E|) = O(|E| log |E|).

## Examples

**Example: Find the cost of Minimal Spanning Tree of the given graph by using Kruskal’s Algorithm**

**Solution:**

- Kruskal’s algorithm sorts the edges according to decreasing order of their weight. Sorted edges are listed in following table:

Edge | <1, 2> | <3, 5> | <1, 3> | <2, 3> | <3, 4> | <4, 5> | <2, 4> |

Cost | 1 | 2 | 3 | 3 | 4 | 5 | 6 |

- One by one edge is added to a partial solution if it is not forming the cycle.
- Initially, set of unvisited edges U
_{E}= {<1, 2>, <3, 5>, <1, 3>, <2, 3>, <3, 4>, <4, 5>, <2, 4>}

Partial solution | Updated U_{E} |

Step 1: Minimum cost edge is <1, 2>, so add it to MST and remove from U_{E} | U_{E} = { <3, 5>, <1, 3>,<2, 3>, <3, 4>, <4, 5>, <2, 4> } |

Step 2: Minimum cost edge is <3, 5>, so add it to MST and remove from U_{E} | U_{E} = { <1, 3>, <2, 3>, <3, 4>, <4, 5>, <2, 4> } |

Step 3: Minimum cost edge is <1, 3>, so add it to MST and remove from U_{E} | U_{E} = { <2, 3>, <3, 4>,<4, 5>, <2, 4> } |

Step 4 : Minimum cost edge is <2, 3>, but its inclusion creates cycle so remove it from U_{E} So, U_{E} = { <3, 4>, <4, 5>, <2, 4> } Next minimum cost edge is <3, 4> and its inclusion does not form cycle, so add it to MST and remove from U _{E} | U_{E} = { <4, 5>, <2, 4> } |

Step 5 : Minimum cost edge is <4, 5>, but its inclusion creates cycle so remove it from U_{E} So, U_{E} = { <2, 4> } Minimum cost edge is <2, 4>, but its inclusion creates cycle so remove it from U _{E} So, U_{E} = { } | U_{E} is empty so the tree generated in step 4 is the minimum spanning tree of given graph. Let w(u, v) represent the weight of edge(u, v) Cost of solution: w(1, 2) + w(1, 3) + w(3, 4) + w(3, 5) = 1 + 3 + 4 + 2 = 10 |

**Example: Find out minimum cost spanning tree using KRUSKAL algorithm**

Edge | Cost | Edge | Cost |

(V_{1},V_{7}) | 1 | (V_{4},V_{5}) | 7 |

(V_{3},V_{4}) | 3 | (V_{1},V_{2}) | 20 |

(V_{2},V_{7}) | 4 | (V_{1},V_{6}) | 23 |

(V_{3},V_{7}) | 9 | (V_{5},V_{7}) | 25 |

(V_{2},V_{3}) | 15 | (V_{5},V_{6}) | 28 |

(V_{4},V_{7}) | 16 | (V_{6},V_{7}) | 36 |

**Solution:**

Graph representation of given data is shown in following Figure.

- Kruskal’s algorithm sorts the edges according to decreasing order of their weight. Sorted edges are listed in the following table:

Edge | <V_{1}, V_{7}> | <V_{3}, V_{4}> | <V_{2}, V_{7}> | <V_{4}, V_{5}> | <V_{3}, V_{7}> | <V_{2}, V_{3}> |

Cost | 1 | 3 | 4 | 7 | 9 | 15 |

Edge | <V_{4}, V_{7}> | <V_{1}, V_{2}> | <V_{1}, V_{6}> | <V_{5}, V_{7}> | <V_{5}, V_{6}> | <V_{6}, V_{7}> |

Cost | 16 | 20 | 23 | 25 | 28 | 36 |

- One by one edge is added to a partial solution if it is not forming the cycle.
- Initially, set of unvisited edges U
_{E}= {<V_{1}, V_{7}>, <V_{3}, V_{4}>, <V_{2}, V_{7}>, <V_{4}, V_{5}>, <V_{3}, V_{7}>, <V_{2}, V_{3}>, <V_{4}, V_{7}>, <V_{1}, V_{2}>, <V_{1}, V_{6}>, <V_{5}, V_{7}>, <V_{5}, V_{6}>, <V_{6}, V_{7}>}

Partial solution | Updated U_{E} |

Step 1: Minimum cost edge is <V_{1}, V_{7}>, so add it to MST and remove from U_{E} | U_{E} = {<V_{3}, V_{4}>, <V_{2}, V_{7}>, <V_{4}, V_{5}>, <V_{3}, V_{7}>, <V_{2}, V_{3}>, <V_{4}, V_{7}>, <V_{1}, V_{2}>, <V_{1}, V_{6}>, <V_{5}, V_{7}>, <V_{5}, V_{6}>, <V_{6}, V_{7}>} |

Step 2: Next minimum cost edge is <V_{3}, V_{4}>, so add it to MST and remove from U_{E} | U_{E} = {<V_{2}, V_{7}>, <V_{4}, V_{5}>, <V_{3}, V_{7}>, <V_{2}, V_{3}>, <V_{4}, V_{7}>, <V_{1}, V_{2}>, <V_{1}, V_{6}>, <V_{5}, V_{7}>, <V_{5}, V_{6}>,<V _{6}, V_{7}>} |

Step 3: Next minimum cost edge is <V_{2}, V_{7}>, so add it to MST and remove from U_{E} | U_{E} = {<V_{4}, V_{5}>, <V_{3}, V_{7}>, <V_{2}, V_{3}>, <V_{4}, V_{7}>, <V_{1}, V_{2}>, <V_{1}, V_{6}>, <V_{5}, V_{7}>, <V_{5}, V_{6}>, <V_{6}, V_{7}>} |

Step 4: Next minimum cost edge is <V_{4}, V_{5}>, so add it to MST and remove from U_{E} | U_{E} = {<V_{3}, V_{7}>, <V_{2}, V_{3}>, <V_{4}, V_{7}>, <V_{1}, V_{2}>, <V_{1}, V_{6}>, <V_{5}, V_{7}>, <V_{5}, V_{6}>, <V_{6}, V_{7}>} |

Step 5: Next minimum cost edge is <V_{3}, V_{7}>, so add it to MST and remove from U_{E} | U_{E} = {<V_{2}, V_{3}>, <V_{4}, V_{7}>, <V_{1}, V_{2}>, <V_{1}, V_{6}>, <V_{5}, V_{7}>, <V_{5}, V_{6}>, <V_{6}, V_{7}>} |

Step 6: Next minimum cost edge is <V_{2}, V_{3}>, but its inclusion creates cycle so remove it from U_{E}. | U_{E} = {<V_{4}, V_{7}>, <V_{1}, V_{2}>, <V_{1}, V_{6}>, <V_{5}, V_{7}>, <V_{5}, V_{6}>, <V_{6}, V_{7}>} |

Step 7: Similarly, next minimum cost edges are <V_{4}, V_{7}> and <V_{1}, V_{2}> form cycle so remove them from U_{E}. | U_{E} = {<V_{1}, V_{6}>, <V_{5}, V_{7}>, <V_{5}, V_{6}>, <V_{6}, V_{7}>} |

Next minimum cost edge is <VStep 8: _{1}, V_{6}>, so add it to MST and remove from U_{E} | U_{E} = {<V_{5}, V_{7}>, <V_{5}, V_{6}>, <V_{6}, V_{7}>} |

Step 9: Next minimum cost edge is <V_{5}, V_{7}>, but its inclusion creates cycle so remove it from U_{E}. So, U_{E} = {<V_{5}, V_{6}>, <V_{6}, V_{7}>} Similarly, next minimum cost edges <V _{5}, V_{6}> and <V_{6}, V_{7}> form cycle so remove them from U_{E}. So, U_{E} = { } | U_{E} is empty so the tree generated in step 4 is the minimum spanning tree of given graph. Let w(u, v) represent the weight of edge(u, v) Cost of solution: w(V _{1}, V_{6}) + w(V_{1}, V_{7}) + w(V_{2}, V_{7}) + w(V_{7}, V_{3}) + w(V_{3}, V_{4}) + w(V_{4}, V_{5}) = 23 + 1 + 4 + 9 + 3 + 7 = 47 |

## Some Popular Problems Solved by Greddy Algorithm

Greedy algorithms are used to find an optimal or near-optimal solution to many real-life problems. A few of them are listed below :

- Binary Knapsack Problem
- Fractional Knapsack Problem
- Job Scheduling Problem
- Activity Selection Problem
- Huffman Coding
- Optimal Storage on Tapes
- Optimal Merge Pattern
- Prim’s Algorithm
- Kruskal’s Algorithm
- Dijkstra’s Algorithm

Additional Reading: Wiki