Radix Sort

Another linear time technique is radix sort. It organizes data based on digit position, beginning with the least significant digit and progressing to the most significant digit. Keys are sorted on the i-th digit in each iteration i.

The name comes from the way algorithm works. It sorts the elements by sorting them on their radix. Radix for decimal number is from 0 to 9 (10 digits), radix for alphabets is a to z (26 alphabets).

The radix sort algorithm is a non-comparative sorting algorithm. It avoids comparison by generating and categorizing elements based on their radix. Radix sort can also be applied to strings. It differs from comparison based algorithms such as selection sort, bubble sort, insertion sort etc. in that way.

Radix sort must use a stable sort algorithm to sort keys on position i which means that if integers 356 and 256 are sorted on the second digit, their order must not change. Otherwise, the end output might be a random sequence. After sorting, stable sorting algorithms always retain the relative position of two identical integers.

The number of iterations is determined by the width of the list’s biggest integer.

Assumption: All items in the input array have the same width, and if they don’t, the width of the remaining elements is set to the width of the largest number in the list by padding zeros from the most significant locations. For instance, if the list contains {34, 123, 6}, it is interpreted as {034, 123, 006}

Limitations of Radix Sort

  • If the width of all numbers is not same, special care must be taken to check and pad numbers with zeros
  • It requires more memory.
  • Difficult to write generalized radix sort which can handle all kind of data.

Example of Radix Sort

Problem: Sort array A = {345, 678, 234, 56, 896, 123, 432, 6} using radix sort.

Solution:

Original DataSorting on LSB1Sorting on LSB2Sorting on LSB3
345432006006
678123123056
234234432123
056345234234
896056345345
123896056432
432006678678
006678896896

Example: Sort following sequence of data using radix sort: <165,958, 635, 694, 480, 637, 598>

Solution:

The largest element in list is 958, and its width is 3. So radix sort algorithm needs to perform three scans on each digit from LSD to MSD.

Input, output of each pass, and final result is shown in following figure

Simulation of radix sort
Simulation of radix sort

Algorithm of Radix Sort

Radix sort uses the counting sort method in order to sort bigger, multi-digit integers without possibly decreasing performance by expanding the range of keys the algorithm must sort through.

Algorithm RADIX_SORT(A, d)
// A is an array to be sorted
// k is the number of digits in the largest element in A

for i ← 1 to k do
  Apply a stable sort (such as counting sort) on digit i of array A
end

Complexity Analysis

  • Radix sort iterates the loop k times, in each iteration n numbers are scanned. So the complexity of the algorithm would be O(k.n). If k is small, it can be considered as constant and algorithm runs in linear time i.e. O(n).
  • But in general, if k is large, it cannot be considered constant. If the value of element increases with the number of elements n, it takes log2n bits to represent the number less than n. So k = log2n, and algorithm runs in O(k.n) = O(n log2n) time.

LSD vs. MSD Sorting Approach

Radix sorting can begin with either the most significant digit (MSD) or the least significant digit (LSD). For instance, with 6249, one may begin with 6 (MSD) or 9 (LSD).

LSD radix sort generally employs the following sorting order: small keys are sorted first, followed by longer keys, and then keys of the same length are sorted lexicographically. This corresponds to the standard order of integer representations, such as the sequence [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. LSD types are typically stable.

MSD radix sort is best suited for sorting strings or integer representations with fixed lengths. [b, ba, c, d, e, f, g, ba] would be sorted as [b, ba, c, d, e, f, g]. When using lexicographic ordering to sort variable-length integers in base 10, numbers from 1 to 10 would be printed as [1, 10, 2, 3, 4, 5, 6, 7, 8, 9], as if the shorter keys were left-justified and padded on the right with blank characters to make the shorter keys as long as the longest key. MSD sorts are not always reliable since the original ordering of duplicate keys must always be preserved.

Following are the few popular sorting algorithms used in computer science:


Additional Reading: Wiki

Leave a Reply

Your email address will not be published. Required fields are marked *