Optimal Merge Pattern
Optimal Merge Pattern Problem: “Merge n sorted sequences of different lengths into one sequence while minimizing reads”. Any two sequences can be merged at a time. At each step, the two shortest sequences are merged.
Consider three sorted lists L1, L2 and L3 of size 30, 20 and 10 respectively. Two way merge compares elements of two sorted lists and put them in new sorted list. If we first merge list L1 and L2, it does 30 + 20 = 50 comparisons and creates a new array L’ of size 50. L’ and L3 can be merged with 60 + 10 = 70 comparisons that forms a sorted list L of size 60. Thus total number of comparisons required to merge lists L1, L2 and L3 would be 50 + 70 = 120.
Alternatively, first merging L2 and L3 does 20 + 10 = 30 comparisons, which creates sorted list L’ of size 30. Now merge L1 and L’, which does 30 + 30 = 60 comparisons to form a final sorted list L of size 60. Total comparisons required to create L is thus 30 + 60 = 90.
In both the cases, final output is identical but first approach does 120 comparisons whereas second does only 90. Goal of optimal merge pattern is to find the merging sequence which results into minimum number of comparisons.
Let S = {s1, s2, …, sn} be the set of sequences to be merged. Greedy approach selects minimum length sequences si and sj from S. The new set S’ is defined as, S’ = (S – {si, sj}) ∪ {si + sj}. This procedure is repeated until only one sequence is left.
The working principle of optimal merge patterns is similar to Huffman coding tree construction. This can easily be done with min-heap data structure. In min heap, parent is always smaller than its child’s. Thus root of min-heap always contains minimum element. Using min heap we can retrieve minimum element in constant time. And insertion in min heap takes O(logn) time.
Algorithm for Optimal Merge Pattern
Algorithm for optimal merge patterns is shown below :
Algorithm OPTIMAL_MERGE_PATTERNS(S)
// S is set of sequences
Create min heap H from S
while H.length > 1 do
min1 ← minDel(H)
// minDel function returns minimum element from H and delete it from H
min2 ← minDel(H)
NewNode.Data ← min1 + min2
NewNoode.LeftChild ← min1
NewNode.RightChild ← min2
Insert(NewNode, H) // Insert node NewNode in heap H
end
Complexity Analysis
In every iteration, two delete minimum and one insert operation is performed. Construction of heap takes O(logn) time. The total running time of this algorithm is O(nlogn).
T(n) = O(n – 1) * max(O(findmin), O(insert))
Case 1 : If list is not sorted :
O(findmin) = O(n)
O(insert) = O(1)
So, T(n) = (n – 1) * n = O(n2)
Case 2 : If list is sorted :
Case 2.1 : List is represented as an array
O(findmin) = O(1)
O(insert) = O(n)
So, T(n) = (n – 1) * n = O(n2)
Case 2.2 : List is represented as min-heap
O(findmin) = O(1)
O(insert) = O(logn)
So, T(n) = (n – 1) * logn = O(nlogn)
Example
Example: Consider the sequence {3, 5, 9, 11, 16, 18, 20}. Find optimal merge patter for this data
Solution:
At each step, merge the two smallest sequences
Step 1:
Given sequence is,
Step 2: Merge the two smallest sequences and sort in ascending order
Step 3: Merge the two smallest sequences and sort in ascending order
Step 4: Merge the two smallest sequences and sort in ascending order
Step 5: Merge the two smallest sequences and sort in ascending order
Step 6: Merge the two smallest sequences and sort in ascending order
Step 7: Merge the two smallest sequences and sort in ascending order
Total time = 8 + 17 + 27 + 35 + 47 + 82 = 216
Some Popular Problems Solved by Greddy Algorithm
Greedy algorithms are used to find an optimal or near-optimal solution to many real-life problems. A few of them are listed below :
- Binary Knapsack Problem
- Fractional Knapsack Problem
- Job Scheduling Problem
- Activity Selection Problem
- Huffman Coding
- Optimal Storage on Tapes
- Optimal Merge Pattern
- Prim’s Algorithm
- Kruskal’s Algorithm
- Dijkstra’s Algorithm
Additional Reading; Read More