Radix Sort
Another linear time technique is radix sort. It organizes data based on digit position, beginning with the least significant digit and progressing to the most significant digit. Keys are sorted on the i-th digit in each iteration i.
The name comes from the way algorithm works. It sorts the elements by sorting them on their radix. Radix for decimal number is from 0 to 9 (10 digits), radix for alphabets is a to z (26 alphabets).
The radix sort algorithm is a non-comparative sorting algorithm. It avoids comparison by generating and categorizing elements based on their radix. Radix sort can also be applied to strings. It differs from comparison based algorithms such as selection sort, bubble sort, insertion sort etc. in that way.
Radix sort must use a stable sort algorithm to sort keys on position i which means that if integers 356 and 256 are sorted on the second digit, their order must not change. Otherwise, the end output might be a random sequence. After sorting, stable sorting algorithms always retain the relative position of two identical integers.
The number of iterations is determined by the width of the list’s biggest integer.
Assumption: All items in the input array have the same width, and if they don’t, the width of the remaining elements is set to the width of the largest number in the list by padding zeros from the most significant locations. For instance, if the list contains {34, 123, 6}, it is interpreted as {034, 123, 006}
Limitations of Radix Sort
- If the width of all numbers is not same, special care must be taken to check and pad numbers with zeros
- It requires more memory.
- Difficult to write generalized radix sort which can handle all kind of data.
Example of Radix Sort
Problem: Sort array A = {345, 678, 234, 56, 896, 123, 432, 6} using radix sort.
Solution:
Original Data | Sorting on LSB1 | Sorting on LSB2 | Sorting on LSB3 |
345 | 432 | 006 | 006 |
678 | 123 | 123 | 056 |
234 | 234 | 432 | 123 |
056 | 345 | 234 | 234 |
896 | 056 | 345 | 345 |
123 | 896 | 056 | 432 |
432 | 006 | 678 | 678 |
006 | 678 | 896 | 896 |
Example: Sort following sequence of data using radix sort: <165,
958, 635, 694, 480, 637, 598>
Solution:
The largest element in list is 958, and its width is 3. So radix sort algorithm needs to perform three scans on each digit from LSD to MSD.
Input, output of each pass, and final result is shown in following figure
Algorithm of Radix Sort
Radix sort uses the counting sort method in order to sort bigger, multi-digit integers without possibly decreasing performance by expanding the range of keys the algorithm must sort through.
Algorithm RADIX_SORT(A, d)
// A is an array to be sorted
// k is the number of digits in the largest element in A
for i ← 1 to k do
Apply a stable sort (such as counting sort) on digit i of array A
end
Complexity Analysis
- Radix sort iterates the loop k times, in each iteration n numbers are scanned. So the complexity of the algorithm would be O(k.n). If k is small, it can be considered as constant and algorithm runs in linear time i.e. O(n).
- But in general, if k is large, it cannot be considered constant. If the value of element increases with the number of elements n, it takes log2n bits to represent the number less than n. So k = log2n, and algorithm runs in O(k.n) = O(n log2n) time.
LSD vs. MSD Sorting Approach
Radix sorting can begin with either the most significant digit (MSD) or the least significant digit (LSD). For instance, with 6249, one may begin with 6 (MSD) or 9 (LSD).
LSD radix sort generally employs the following sorting order: small keys are sorted first, followed by longer keys, and then keys of the same length are sorted lexicographically. This corresponds to the standard order of integer representations, such as the sequence [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. LSD types are typically stable.
MSD radix sort is best suited for sorting strings or integer representations with fixed lengths. [b, ba, c, d, e, f, g, ba] would be sorted as [b, ba, c, d, e, f, g]. When using lexicographic ordering to sort variable-length integers in base 10, numbers from 1 to 10 would be printed as [1, 10, 2, 3, 4, 5, 6, 7, 8, 9], as if the shorter keys were left-justified and padded on the right with blank characters to make the shorter keys as long as the longest key. MSD sorts are not always reliable since the original ordering of duplicate keys must always be preserved.
Some Popular Sorting Algorithms
Following are the few popular sorting algorithms used in computer science:
- Bubble sort
- Selection sort
- Insertion sort
- Heap sort
- Merge sort
- Quick sort
- Counting sort
- Radix sort
- Bucket sort
Additional Reading: Wiki