Counting Sort
Counting sort is quite old sorting method. It was introduced by Harold Seward in 1954. It is non comparison based, non in-place sorting algorithm. It sorts data based on the frequency of the elements. it is an integer sorting algorithm.
Assumption: Input to the algorithm is non negative integers in the range of 1 to k.
Counting sort counts the number of occurrences of integer elements in an unsorted list to sort the supplied input pattern. Count sort makes use of three arrays.
- One of these is the n-dimensional input array A[1….n].
- The second array is an output array of size n. T
- he third array is a temporary storage array of size k, which is used to store the count of each element in the input array. In this case, k is the largest integer number in the input array.
The objective behind counting sort is to determine the number of elements that are fewer than x for each input element x. This information is used to place element x in the right location in the output array. If five elements are less than x, then x must be assigned to the sixth place in the output array.
There is a tie if two items have the same value. This case is addressed in a unique manner. The counting sort algorithm is a reliable one. In the event of a tie between two identical integers, the number that appears first in the input array appears first in the output array.
Example
Sort array A = {2, 5, 4, 2, 3, 4, 2, 0} using counting sort.
Solution:
Initially, we have,
Step 1 : Count occurrences of each integer in array A and store in C. Element 0 course ones, element 1 does not occur at all, element 2 occurs thrice and so on.
Step 2 : Find the running sum of the count of occurrence in array C
Step 3 : For j = length(A) = 8.
B[C[A[j]]] = A[j]
⇒ B[C[A[8]]] = A[8]
⇒ B[C[0]] = 0
⇒ B[1] = 0
C[A[j]] = C[A[j]] – 1
⇒ C[A[8]] = C[A[8]] – 1
⇒ C[0] = C[0] – 1
⇒ C[0] = 1 – 1 = 0
Step 4 : For j = j – 1 = 8 – 1 = 7.
B[C[A[j]]] = A[j]
⇒ B[C[A[7]]] = A[7]
⇒ B[C[2]] = 2
⇒ B[4] = 2
C[A[j]] = C[A[j]] – 1
⇒ C[A[7]] = C[A[7]] – 1
⇒ C[2] = C[2] – 1
⇒ C[2] = 4 – 1 = 3
Step 5 : For j = j – 1 = 7 – 1 = 6.
B[C[A[j]]] = A[j]
⇒ B[C[A[6]]] = A[6]
⇒ B[C[4]] = 4
⇒ B[7] = 4
C[A[j]] = C[A[j]] – 1
⇒ C[A[6]] = C[A[6]] – 1
⇒ C[4] = C[4] – 1
⇒ C[4] = 7 – 1 = 6
Step 6 : For j = j – 1 = 6 – 1 = 5.
B[C[A[j]]] = A[j]
⇒ B[C[A[5]]] = A[5]
⇒ B[C[3]] = 3
⇒ B[5] = 3
C[A[j]] = C[A[j]] – 1
⇒ C[A[5]] = C[A[5]] – 1
⇒ C[3] = C[3] – 1
⇒ C[3] = 5 – 1 = 4
Step 7 : For j = j – 1 = 5 – 1 = 4.
B[C[A[j]]] = A[j]
⇒ B[C[A[4]]] = A[4]
⇒ B[C[2]] = 2
⇒ B[3] = 2
C[A[j]] = C[A[j]] – 1
⇒ C[A[4]] = C[A[4]] – 1
⇒ C[2] = C[2] – 1
⇒ C[2] = 3 – 1 = 2
Step 8 : For j = j – 1 = 4 – 1 = 3.
B[C[A[j]]] = A[j]
⇒ B[C[A[3]]] = A[3]
⇒ B[C[4]] = 4
⇒ B[6] = 4
C[A[j]] = C[A[j]] – 1
⇒ C[A[3]] = C[A[3]] – 1
⇒ C[4] = C[4] – 1
⇒ C[4] = 6 – 1 = 5
Step 9 : For j = j – 1 = 3 – 1 = 2.
B[C[A[j]]] = A[j]
⇒ B[C[A[2]]] = A[2]
⇒ B[C[5]] = 5
⇒ B[8] = 5
C[A[j]] = C[A[j]] – 1
⇒ C[A[2]] = C[A[2]] – 1
⇒ C[5] = C[5] – 1
⇒ C[5] = 8 – 1 = 7
Step 10 : For j = j – 1 = 2 – 1 = 1.
B[C[A[j]]] = A[j]
⇒ B[C[A[1]]] = A[1]
⇒ B[C[2]] = 2
⇒ B[2] = 2
C[A[j]] = C[A[j]] – 1
⇒ C[A[1]] = C[A[1]] – 1
⇒ C[1] = C[1] – 1
⇒ C[1] = 1 – 1 = 0
Array B contains the final sorted sequence.
Algorithm of Counting Sort
Algorithm COUNTING_SORT (A, B, k)
// A and B are input and output array respectively, each of size n
// k is the largest element in input array A
// C is a temporary array, which holds the count of each distinct element
for i ← 0 to k do
C[i] ← 0 // Initialize elements of C to zero
end
for j ← 1 to n do
C[A[j]] ← C[A[j]] + 1 // Count occurrences of each integer
end
for i ← 1 to k do
C[i] ← C[i] + C[i-1] // Finds cumulative sum. C[i] contains number of elements <= i
end
for j ← n to 1 do
B[C[A[j]]] ← A[j]
C[A[j]] ← C[A[j]] – 1
end
Complexity Analysis
The method is simple to understand since it utilizes just basic for loops with no recursion or function calls.
Execution time of counting sort is determined by four for loops.
- The first for loop sets the members of array C to zero.
- Second for loop examines the input element; if the value of an input element is j, C[j] is increased by one.
- The third for loop computes the cumulative total of the input element’s occurrences, i.e. how many input items are less than or equal to i.
- The last for loop sorts the array by relocating each member A[j] to its proper location in output array B.
The array C initialization and the third for loop, which performs a prefix/cumulative/running sum of the array C, each iterate at most k+1 times and hence consume O(k) time. The other two for loops take O(n) time. Thus, the total time taken by the whole algorithm is O(n + k).
For the same reason, the time complexity in average n worst case remains the same and that is O(n). It is most efficient when there is a situation that becomes k<<n. In that case, it becomes linear.
Counting sort is not an in-place algorithm. It needs two extra arrays B and C of size n and k respectively. As we saw in pseudo-code both loop executes once. So its space complexity is O(n + k).
Complexity of counting sort in summarized in following table:
Best case | Average case | Worst case |
O(n + k) | O(n + k) | O(n + k) |
Simulation of Counting Sort
Counting sort first counts the occurrence of all unique numbers. Sorting is then done based on number of elements. Following simulation shows how sorting is done in its simplistic way.
Image Source: https://c.tenor.com/zswbYsLbYqEAAAAd/counting-sort.gif [link]
Discussion and Comparison
Its running time is proportional to the number of items and the difference between the maximum and lowest key values, therefore it is only appropriate for direct use in cases where the difference in keys is not considerably higher than the number of items.
It is frequently used as a subroutine in radix sort, a sorting algorithm that can handle bigger keys more efficiently.
Counting sort is not in-place algorithm. It requires output array of size n, and count array of size k, which leads to additional requirement of O(n + k) auxiliary memory.
Counting sort preserves the relative order of identical keys, so it is stable algorithm
Counting sort is specifically useful in following scenarios:
- Linear complexity is needed
- Smaller integers with multiple count
- Can be used as a subroutine in Radix sort
Some Popular Sorting Algorithms
Following are the few popular sorting algorithms used in computer science: