Heapsort is an in-place comparison based sorting algorithm. It may be thought of as a better form of selection sort. Heap sort, like selection sort, splits the list into two sub-lists: sorted and unsorted. It brings the largest element at the end of the sorted sub-list in each iteration. The sorted sub-list steadily grows while the unsorted list shrinks. Heapsort is not stable algorithm.

In computer programming, Heap Sort is a common and efficient sorting method. To learn how to construct the heap sort algorithm, you must first understand two types of data structures: arrays and trees. Set of numbers are initially represented in array such as <33, 55, 11, 22, 66, 44>, after applying sorting, it is stored in array with order <11, 22, 33, 44, 55, 66>.

Heap sort works by visualizing the array items as a specific type of full binary tree known as a heap.

Relation Between Array Index and Heap

Heap sorting is done in two phases. It constructs the max-heap from the input sequence in the first stage, and then moves the biggest element to its right place and reconstructs the heap in the second step. Heap can be represented as a full binary tree that satisfies some condition. Nodes are added from left to right in the heap data structure at any level L. As a result, if the node has a right child, it must also have a left child.

For array starting with index 1, LEFT, RIGHT and PARENT node of any node i is given by :

PARENT(i) = floor(i/2)

LEFT(i) = 2i

RIGHT(i) = 2i + 1

Note :   However, the above functions are not valid for array starting with index 0

Heap representation of an array
Heap representation of an array

Heap Data Structure

  • It is a complete binary tree data structure
  • Except for the last level, all levels are entirely filled
  • Heap believes that there should be some sort of hierarchy of values maintained between parents and their offspring

There are two types of heap tree. Types and properties are listed below :

Max heap:

  • In a max heap, the value of the parent element is greater than its child node.
  • A[PARENT[i]] ≥ A[i]
  • The largest element is at the root
  • Max heap is used in heap sort

Min heap:

  • In min-heap, the value of the parent element is less than its child node
  • A[PARENT[i]] ≤ A[i]
  • The smallest element is at the root
  • Min heap is used in a priority queue
Max heap
Max heap
Mean Heap
Mean Heap

Example of Heap Sort

Example 1: Sort given data using heap sort: <17, 32, 45, 76, 40, 13, 55, 30, 7, 11>

Solution:

Heap sort works in two stages : 

  • Build Max heap
  • Sorting

STAGE 1: Build max heap

Here, number of elements n = 9, so i = ceil(n/2) – 1 = ceil(10/2) – i = 4

Step by step process of constructing max heap is shown below :

Step 1: Construct complete binary tree.

Start the heapify process from node i = 4 and decrement i in each iteration.

After each swap, check the max heap property of swapped nodes.

Heap Sort

Step 2 :   For node having index i=4, check if it satisfies the max heap property. Here node 4 has value 76, which is greater than both of its child, which satisfies the max heap property.                

Decrement i: i = i – 1 = 3

Heap Sort

Step 3 :   Check if the 3rd node satisfies the max heap property.

Node 3 has value 45, which is less than its right child, so swap them.

Decrement i: i = i – 1 = 2

Heap Sort

Step 4 :   Check if the 2nd node satisfies the max heap property.

Node 2 has value 32, which is less than both of its child. Node 2 will be swapped with the child node with the highest value. So swap node value 32 and 76.

Decrement i: i = i – 1 = 1

Step 5: Check if the 1st node satisfies the max heap property. Node 1 has value 17, which is less than both of its child. Node 1 will be swapped with the child node with the highest value. So swap node value 17 and 76.

Step 6: Now, node 2 violates the max heap property, so swap it with node 5, as in both children, node 4 has a higher value

Step 7 : All node satisfies the max heap property, so this is the desired max-heap.

STAGE 2: Sorting

After constructing max heap, we shall run the heap sort algorithm on this heap. In each iteration, the value of the 1st node will be swapped with the last node in the unsorted node and heap will be adjusted again. This process will be repeated n – 1 time. Each pass will put the largest element at the end of the unsorted list.

Step 1 : Swap A[1] with A[10], and heapify the tree again.

Heap Sort

Step 2 : Swap A[1] with A[9], and heapify the tree again.

Heap Sort

Step 3 : Swap A[1] with A[8], and heapify the tree again

Heap Sort

Step 4 : Swap A[1] with A[7], and heapify the tree again.

Step 5 : Swap A[1] with A[6], and heapify the tree again.

Step 6 : Swap A[1] with A[5], and heapify the tree again.

Step 7 : Swap A[1] with A[4], and heapify the tree again.

Step 8 : Swap A[1] with A[3], and heapify the tree again

Step 9 : Swap A[1] with A[2], and heapify the tree again.

Step 10: Read elements in BFS order. They are sorted.

Simulation of Heap Sort

Following simulation explains the entire working process of heap sort.

Simulation of heap sort
Simulation of heap sort

Image Source: https://en.wikipedia.org/wiki/Heapsort [Link]

Algorithm of Heap Sort

Algorithm for heap sort is described below :

Algorithm HEAP_SORT(A)
// A is an array of size n

BUILD_MAX_HEAP(A)
for i ← n downto 2 do
	Swap(A[1], A[i])
	Heap_Size[A] ← Heap_Size[A] – 1
	MAX_HEAPIFY(A, 1)
end

Building Max Heap:

  • This is in process of setting the parent-child hierarchy throughout the heap.
  • This is the initial function that is called in order to construct the heap.
  • It begins with the tree’s last parent since it is the first instance where the order may be disturbed.
  • As a result, it iterates from i=n/2-1 to 0 and calls HEAPIFY for each parent node.

BUILD_MAX_HEAP routine works as follow:

BUILD_MAX_HEAP(A)
// Routine to build initial Max Heap

Heap_Size ← length(A)
for i ← ceil(n/2) downto 1 do
	MAX_HEAPIFY(A, i)
end

Heapify:

  • This operation determines the value order between the parent and its child.
  • If we’re working with the max heap, it will look for the index with the highest value among the node and its children.
  • If the index with the highest value is not the parent, it will swap the parent with the child with the highest value.

MAX_HEAPIFY works as follow:

MAX_HEAPIFY(A, i)
L ← LEFT(A)
R ← RIGHT(A)
if L ≤ Heap_Size(A) and A[L] > A[i] then
	Largest ← L
else
	Largest ← i
end

if R ≤ Heap_Size(A) && A[R] > A[Largest] do
	Largest ← R
end

if Largest ≠ i do
	swap(A[i], A[Largest])
	MAX_HEAPIFY(A, Largest)
end

Complexity Analysis

  • HEAPIFY routine builds a binary tree, so the running time of it is O(log2n). BUILD_MAX_HEAP routine runs in linear time with complexity O(n). MAX_HEPIFY routine is called n times from HEAP_SORT, so overall running time of heap sort algorithm is O(nlog2n).
  • There is no mechanism to detect whether the list is sorted or not. For any sequence of data, heap sort dose the same work. So time complexity of heap sort is the same in all three cases. Hence heap sort is not adaptive.
  • Heapsort is the in-place algorithm, the space complexity of the algorithm is O(1).

The time complexity of heap sort for all three cases is mentioned in the following table :

Best CaseAverage caseWorst case
O(nlog2n)O(nlog2n)O(nlog2n)

Worst case of heap sort is better than the average case of selection sort and bubble sort. How ever best case of insertion sort and bubble sort beats the best case of heap sort.

Discussion and Comparison

Quicksort, another extremely efficient general purpose in-place comparison-based sort algorithm, is the primary competitor of heapsort.

Advantages:

  • Simple
  • Non recursive
  • Low auxiliary storage requirement
  • Consistently high performance: its best and worst cases are within a tiny constant factor of each other, as well as the theoretical lower bound for comparison sorts. 

Disadvantages:

  • Poor locality of reference
  • Inherently serial in nature

As a result, it is widely used in embedded systems, real-time computing, and systems that deal with maliciously selected inputs, such as the Linux kernel.

Quicksort is often 2–3 times quicker than heapsort when properly implemented. Quicksort has a significant benefit in terms of locality of reference: partitioning is a linear scan with high spatial locality, and recursive subdivision has good temporal locality. Quicksort may be done in essentially branch-free code with additional effort, and several CPUs can be utilized to sort sub partitions in concurrently.

Following are the few popular sorting algorithms used in computer science: