XSort is an ongoing project of mine which implements a number of sorting algorithms in the D programming language. The Timsort module was implemented (with some minor revisions) in Phobos to replace the old stable sort. I’m writing this post to detail what future plans I have for the project. Items in bold are high priority. Items in italics are unofficial, meaning I’m considering them and I may or may not follow through on those plans.
- Enhance unstable sort in Phobos
- Implement Three-Pivot Quick Sort (see details below)
- Implement stable partition3 / topN functions (see details below)
- Implement smoothsort
- Implement concurrent comb sort
- Migrate “stable sort” module to merge sort
- Migrate “unstable sort” and “stable quick sort” modules to single “quick sort” module
- Write detailed documentation in modules which can be generated using DDoc
- Group all modules under an “xsort” package
- Add attributes (e.g. @stable @unstable)
Three-Pivot Quick Sort
I’m considering this as a possible candidate and contender for the unstable sort in Phobos. A person in the D forums suggested a two-pivot quick sort which I originally thought would increase the number of comparisons. I realized I was incorrect because, when utilizing multiple pivots, it would result in smaller partitions which would balance out the number of comparisons on an average case. However, I’m considering a three-pivot quick sort instead. When utilizing two pivots, each element will be compared against one or both pivots, which may result in unbalanced cases. With three pivots, each element will always be compared against two pivots resulting in more consistent behavior.
Originally, I was planning on using a median-of-five to choose the three pivots. But then I realized that utilizing three pivots makes practical new possibilities, such as sorting multiple elements, or even a median-of-medians. But more importantly, I don’t want to add unnecessary overhead and previous experience simply using a median-of-five have found it to produce satisfactory results. If evaluating more than five elements, the pivots would be chosen in this style:
1 2 3 4 5
1 2 3 4 5 6 7 8 9
1 2 3 4 5 6 7 8 9 10 11 12 13
the first and last elements are considered to be min/max elements
Stable partition3 / topN Functions
These functions in Phobos, partition3 and topN, are currently lacking stable implementations, so I’d like to provide implementations for these as well. I’ve already implemented a stable partition3 in my stable quick sort module (link), so no problem there. However, implementing a stable topN is a much greater challenge, especially if your goal includes less than O(n) space complexity.
Currently, the unstable topN implementation works by recursively partitioning the list in-place until the nth element is discovered. This means the original order of elements is lost, including elements equal to the nth element may end up in the wrong partition.
Perhaps what constitutes a stable topN is not well defined. For me, the most important quality of a stable topN is that the original order of all elements is retained, including elements equal to the nth element. Plus, there may be two desired outcomes:
- The nth element is placed in the nth position
- The nth element remains in its original position relative to all other elements
The first outcome matches the definition of topN, whereas the second definition means to partition the list into two using the nth element as the pivot. In either case, the challenge is in finding the nth element without losing the original order of elements, which is further complicated by cases where other elements are equal to the nth element. If you can find the nth element without any reordering, than partitioning the list into two is a piece of cake.