Sorting algorithms are a crucial part of programming, and choosing the right one for your data is essential for optimal performance. However, even simple algorithms like Bubble Sort can be improved to handle larger datasets more efficiently. In this post, we’ll explore a few ways to optimize the classic Bubble Sort algorithm, using a PlayBASIC example to demonstrate the improvements.
Understanding Bubble Sort
Bubble Sort is one of the most commonly taught sorting algorithms in programming. It’s simple to understand but can be slow for large datasets. The concept is straightforward: you iterate through the data, comparing adjacent elements, and swap them if they are in the wrong order. The process repeats until no swaps are necessary, meaning the array is sorted.
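The loop described above can be sketched in a few lines. This version is written in Python rather than PlayBASIC so it can be run anywhere, and the name `bubble_sort` is an illustrative choice, not part of the listing later in this post.

```python
def bubble_sort(values):
    """Return a sorted copy of `values` using the classic bubble sort."""
    data = list(values)          # work on a copy, leave the input untouched
    swapped = True
    while swapped:               # repeat until a full pass makes no swaps
        swapped = False
        for i in range(len(data) - 1):
            if data[i] > data[i + 1]:
                # adjacent pair is out of order, so swap it
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
    return data


print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```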
The key flaw of Bubble Sort is that it’s an "n-squared", or O(n²), algorithm: in the worst case the number of comparisons grows with the square of the number of elements, so doubling the array size roughly quadruples the work. Despite this, there are still a few optimizations we can apply to make it faster in certain situations.
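To make the quadratic growth concrete, here is a small Python sketch (not PlayBASIC) that counts comparisons for reverse-ordered input, which is the worst case for this algorithm. The function name `count_comparisons` is hypothetical and exists only for this illustration.

```python
def count_comparisons(values):
    """Run a classic bubble sort on a copy and return the comparison count."""
    data = list(values)
    comparisons = 0
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(data) - 1):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
    return comparisons


for n in (100, 200, 400):
    worst_case = list(range(n, 0, -1))   # n, n-1, ..., 1
    print(n, count_comparisons(worst_case))
```

For reverse-ordered input this version makes n passes of n - 1 comparisons each, so the count is n(n - 1): 9900 for 100 items, 39800 for 200, and 159600 for 400.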
Optimizing Bubble Sort
While Bubble Sort will never be the fastest sorting algorithm, there are ways to make it more efficient for specific datasets. Below are a couple of key improvements that can help speed up the process.
1. Reduce the Set Size After Each Pass
One improvement involves reducing the size of the array that’s being processed after each pass. As each pass moves the largest remaining element to the end of the array, you don’t need to check it again in subsequent passes. By decreasing the range of elements to check after each pass, you can reduce unnecessary comparisons and speed up the sorting process.
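A minimal Python sketch of this idea follows. It is not the PlayBASIC routine from this post; `bubble_sort_shrinking` and the returned comparison count are assumptions added so the saving can be observed.

```python
def bubble_sort_shrinking(values):
    """Bubble sort that skips the already-sorted tail.

    Returns (sorted_copy, comparisons).
    """
    data = list(values)
    comparisons = 0
    end = len(data) - 1          # index of the last element still unsorted
    swapped = True
    while swapped and end > 0:
        swapped = False
        for i in range(end):     # compare pairs (i, i + 1) up to `end`
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        end -= 1                 # the largest remaining value is now in place
    return data, comparisons


print(bubble_sort_shrinking([5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5], 10)
```

For five reverse-ordered items the passes make 4 + 3 + 2 + 1 = 10 comparisons, compared with 5 × 4 = 20 when every pass covers the full array.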
2. Bi-directional Bubble Sort
Instead of only iterating left to right, the Bi-directional Bubble Sort (also known as Cocktail Shaker Sort) goes through the array in both directions. The first pass moves the largest element to the end of the array (just like the classic version), but the next pass moves the smallest element to the beginning of the array. By alternating directions, this approach can reduce the number of passes needed to sort the data.
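The Python sketch below (again, not PlayBASIC) shows one way to alternate directions and counts passes so the effect is visible. The names `cocktail_shaker_sort` and `one_way_passes` are illustrative, and each sweep in either direction is counted as one pass.

```python
def cocktail_shaker_sort(values):
    """Bi-directional bubble sort. Returns (sorted_copy, passes)."""
    data = list(values)
    first, last = 0, len(data) - 1
    passes = 0
    swapped = True
    while swapped and first < last:
        # Forward sweep: pushes the largest remaining value to `last`
        swapped = False
        passes += 1
        for i in range(first, last):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        last -= 1
        if not swapped:
            break
        # Backward sweep: pulls the smallest remaining value to `first`
        swapped = False
        passes += 1
        for i in range(last, first, -1):
            if data[i - 1] > data[i]:
                data[i - 1], data[i] = data[i], data[i - 1]
                swapped = True
        first += 1
    return data, passes


def one_way_passes(values):
    """Left-to-right bubble sort with a shrinking range, for comparison."""
    data = list(values)
    end = len(data) - 1
    passes = 0
    swapped = True
    while swapped and end > 0:
        swapped = False
        passes += 1
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        end -= 1
    return data, passes


# A small value stuck at the far end is the case that favours two directions.
sample = list(range(2, 11)) + [1]        # 2, 3, ..., 10, 1
print(one_way_passes(sample)[1])         # 9
print(cocktail_shaker_sort(sample)[1])   # 3
```

In the one-directional version the `1` moves only one position left per pass, while the first backward sweep carries it all the way to the front.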
Example Code in PlayBASIC
Here’s an example implementation of these optimizations in PlayBASIC, which demonstrates the classic Bubble Sort alongside the faster variants:
loadfont "Courier New", 1, 24
MaxItems = 500
DIM Table(MaxItems)
DIM Stats#(10, 5)
DO
    Cls
    inc frames
    ; Reuse one seed per frame so every sort receives identical data
    Seed = Timer()
    Test = 1
    ; Classic version: every pass covers the whole array
    SeedTable(Seed, MaxItems)
    StartInterval(0)
    ClassicBubbleSort(MaxItems)
    tt1 += EndInterval(0)
    Test = Results("Classic Bubble Sort:", Test, MaxItems, tt1, Frames)
    ; Faster version: the range shrinks after each pass
    SeedTable(Seed, MaxItems)
    StartInterval(0)
    ClassicBubbleSortFaster(MaxItems)
    tt2 += EndInterval(0)
    Test = Results("Classic Bubble Sort Faster:", Test, MaxItems, tt2, Frames)
    ; Bi-directional version: passes alternate direction
    SeedTable(Seed, MaxItems)
    StartInterval(0)
    BiDirectionalBubbleSort(MaxItems)
    tt3 += EndInterval(0)
    Test = Results("BiDirectional Bubble Sort:", Test, MaxItems, tt3, Frames)
    Sync
    REPEAT
    UNTIL enterkey() = 0
LOOP
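FUNCTION ShowTable(items)
    t$ = ""
    n = 0
    FOR lp = 0 to items
        t$ = t$ + str$(table(lp)) + ", "
        inc n
        ; Flush the row once it holds 11 values
        IF n > 10
            ; Drop the trailing ", " separator before printing
            t$ = Left$(t$, Len(t$) - 2)
            print t$
            t$ = ""
            n = 0
        ENDIF
    NEXT lp
    ; Print whatever is left in a partial final row
    IF t$ <> "" THEN print Left$(t$, Len(t$) - 2)
ENDFUNCTION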
FUNCTION SeedTable(Seed, Items)
    Randomize seed
    FOR lp = 0 to Items
        Table(lp) = Rnd(32000)
    NEXT lp
ENDFUNCTION
FUNCTION ValidateTable(Items)
    ; Returns 0 when the table is in ascending order, 1 otherwise
    result = 0
    FOR lp = 0 to items - 1
        IF Table(lp) > Table(lp + 1)
            result = 1
            exit
        ENDIF
    NEXT lp
ENDFUNCTION Result
FUNCTION Results(Name$, index, Items, Time, Frames)
    ; Time is already the running total in milliseconds (the caller adds to it
    ; every frame), so store it as seconds instead of accumulating it again
    Stats#(index, 1) = Time / 1000.0
    print "Sort Type:" + name$
    print "Total Time:" + str$(Stats#(index, 1))
    print "Average Time:" + str$(Stats#(index, 1) / frames)
    IF ValidateTable(Items) = 0
        print "Array Sorted"
    ELSE
        print "NOT SORTED - ERROR"
    ENDIF
    print ""
    inc index
ENDFUNCTION index
FUNCTION ClassicBubbleSort(Items)
    REPEAT
        ; Done is set to 1 whenever a swap occurs, which forces another pass
        Done = 0
        FOR lp = 0 to items - 1
            IF Table(lp) > Table(lp + 1)
                done = 1
                t = Table(lp)
                Table(lp) = Table(lp + 1)
                Table(lp + 1) = t
            ENDIF
        NEXT lp
    UNTIL done = 0
ENDFUNCTION
FUNCTION ClassicBubbleSortFaster(Items)
    REPEAT
        Done = 0
        ; Shrink the range: the previous pass left its largest value in place
        dec items
        FOR lp = 0 to items
            IF Table(lp) > Table(lp + 1)
                done = 1
                t = Table(lp)
                Table(lp) = Table(lp + 1)
                Table(lp + 1) = t
            ENDIF
        NEXT lp
    UNTIL done = 0
ENDFUNCTION
FUNCTION BiDirectionalBubbleSort(Items)
    First = 0
    Last = Items
    REPEAT
        Done = 0
        dec Last
        ; Forward pass: moves the largest remaining value to the end
        FOR lp = First to Last
            V = Table(lp + 1)
            IF Table(lp) > V
                done = 1
                Table(lp + 1) = Table(lp)
                Table(lp) = v
            ENDIF
        NEXT lp
        ; Only run the backward pass if the forward pass swapped something
        IF Done = 1
            Done = 0
            inc First
            ; Backward pass: moves the smallest remaining value to the start
            FOR lp = Last to First step -1
                V = Table(lp - 1)
                IF V > Table(lp)
                    Done = 1
                    Table(lp - 1) = Table(lp)
                    Table(lp) = v
                ENDIF
            NEXT lp
        ENDIF
    UNTIL Done = 0
ENDFUNCTION
Explanation of the Code
- Table Initialization: We start by defining an array (`Table`) and filling it with random numbers using the `SeedTable` function.
- Sorting Functions: Three sorting functions are defined:
  - `ClassicBubbleSort`: The traditional Bubble Sort, which compares adjacent elements and swaps them when they are out of order.
  - `ClassicBubbleSortFaster`: An optimized version of the classic algorithm that reduces the set size after each pass.
  - `BiDirectionalBubbleSort`: Alternates the direction of each pass, which can reduce the number of passes needed.
- Performance Tracking: The sorting times are tracked using `StartInterval` and `EndInterval`, allowing us to compare the performance of each sorting method.
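The same measure-with-identical-input idea can be sketched outside PlayBASIC. The Python below is an illustration only: `time_sort` and `bubble_sort_in_place` are hypothetical names, the seed is arbitrary, and `time.perf_counter` stands in for `StartInterval` and `EndInterval`.

```python
import random
import time


def bubble_sort_in_place(data):
    """Classic bubble sort; sorts the list in place."""
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(data) - 1):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True


def time_sort(sort_fn, seed, count):
    """Fill a list from `seed`, sort it with `sort_fn`, and return
    (elapsed_seconds, sorted_list)."""
    rng = random.Random(seed)    # same seed gives every sort the same input
    data = [rng.randrange(32000) for _ in range(count)]
    start = time.perf_counter()
    sort_fn(data)
    elapsed = time.perf_counter() - start
    return elapsed, data


seed = 12345  # arbitrary fixed seed chosen for this example
for name, fn in [("bubble", bubble_sort_in_place), ("built-in", list.sort)]:
    elapsed, result = time_sort(fn, seed, 500)
    print(f"{name}: {elapsed * 1000:.3f} ms")
```

Reseeding before each run matters: without it, the second sort would receive data the first one had already ordered, and the comparison would be meaningless.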
Results and Performance
After running the sorting methods, we display the results, including the total time taken and the average time per frame. We also validate that the array is correctly sorted at the end of each method.
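The validation step is a single scan for any element that is larger than its successor. As a Python sketch (the name `first_unsorted_index` is an assumption for this example, and unlike `ValidateTable` it reports where the problem is rather than a 0 or 1 flag):

```python
def first_unsorted_index(values):
    """Return the index of the first element greater than its successor,
    or -1 when the sequence is in ascending order."""
    for i in range(len(values) - 1):
        if values[i] > values[i + 1]:
            return i
    return -1


print(first_unsorted_index([1, 2, 2, 5]))  # -1
print(first_unsorted_index([1, 3, 2, 5]))  # 1
```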
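Exact timings depend on the size and initial order of the data. Shrinking the range roughly halves the number of comparisons on random input, and the bi-directional version helps most when small values start near the end of the array. All three variants are still O(n²), so expect a constant-factor gain rather than a change in how the cost grows.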
Final Thoughts
While Bubble Sort is not the most efficient sorting algorithm, these optimizations provide a good demonstration of how you can improve its performance in certain scenarios. Reducing the size of the set and implementing bi-directional sorting can make the classic Bubble Sort more practical for moderate-sized datasets.
However, if you’re dealing with larger datasets, it’s often better to use more advanced sorting algorithms like Merge Sort or Quick Sort, which typically run in O(n log n) time.
As always, the key takeaway is that sorting is situational, and selecting the right algorithm for your data is essential. These optimizations are not a silver bullet but can provide useful improvements in the right circumstances.
Have Fun with Sorting!
Sorting is a fundamental concept in computer science, and experimenting with different algorithms and optimizations can help you understand how they work. Feel free to try out these optimizations in your own projects and see how they perform with your data!