Bucket Sort
Bucket sort is also known as bin sort. It is a sorting algorithm that divides an array’s items into a number of buckets. The buckets are then sorted one at a time, either using a separate sorting method or by recursively applying the bucket sorting algorithm.
Assumption : Input data is generated by some random process and uniformly distributed in the range of [0, 1).
Bucket sort sorts the data by partitioning them into n buckets. Each number is stored in an appropriate bucket. Numbers in a bucket are sorted using some linear time algorithm. Inputs are uniformly distributed so it is less likely to fall many numbers in a single bucket.
Bucket sort operates in two stages:
- Scatter: Distribute all the elements in their respective buckets. Sort all the buckets
- Gather: Visit all buckets in order and put all the data in original array
Example of Bucket Sort
Example: Trace bucket sort with following data {0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68}.
Solution:
Ten bins/buckets are created and elements of array A is inserted in the appropriate bin number integer of [n*A[i]]. Following diagram shows the Input data pattern and sorted bins. Insert element in a linked list using insertion sort.
Read values from each bucket in same order, so sorted data would be B {0.12, 0.17, 0.21, 0.23, 0.28, 0.39, 0.68, 0.72, 0.78, 0.94}.
Example: Sort the array A {0.23, 0.65, 0.11, 0.37, 0.45, 0.63, 0.18, 0.87, 0.92, 0.25} using bucket sort.
Solution:
Ten buckets are created and elements of array A is inserted in the appropriate bucket number integer of [n*A[i]]. Following diagram shows the Input data pattern and sorted buckets. Insert element in a linked list using insertion sort.
Read values from each bucket in the same order, so sorted data would be B{0.11, 0.18, 0.23, 0.25, 0.37, 0.45, 0.63, 0.65, 0.87, 0.92}
Algorithm of Bucket Sort
Algorithm BUCKET_SORT(A)
// A is an array of size n
Create n buckets represented by B
for i ← 1 to n do
Insert A[i] into bucket B[n*A[i]]
end
for i ← 1 to n do
sort each bucket i using insertion sort
end
Concate all buckets into sorted order
Complexity Analysis
- This technique is most useful when the input is evenly spread throughout a range.
- Adjacent keys (elements with almost similar values) are placed in one bucket, which makes distribution uneven. Some buckets will contain more elements, while others will have fewer elements.
- The worst case occurs when all elements create a tight cluster and fall into the same bucket. In this case, the running time of the algorithm would be O(n2)
- If the size of each bucket is one, the performance of bucket sort reduces to counting sort.
- In general, the time complexity of bucket sort is O(n + m), where n is the number of elements in the array and m is the range of input values.
Advantages of Bucket Sort
- Because of the uniform distribution, it is asymptotically fast.
- It reduces the number of comparisons.
- When compared to bubble sort, it is faster.
Limitations of Bucket Sort
- It is not suitable for sorting arbitrary strings.
- It is only good for sorting data uniformly distributed over the range [0, 1].
- It is not an in-place sorting because the buckets require additional space to be sorted.
- It is ineffective if we have a huge array since it increases the cost.
- Prior knowledge of the greatest element is required.
Some Popular Sorting Algorithms
Following are the few popular sorting algorithms used in computer science:
- Bubble sort
- Selection sort
- Insertion sort
- Heap sort
- Merge sort
- Quick sort
- Counting sort
- Radix sort
- Bucket sort
Additional Reading: Visualization