7
\$\begingroup\$

Intro

This is the very first time I actually expose myself to C++20 (concepts in this case). I decided to rewrite the Batcher's sort.

(This post has a continuation: Batcher's sort in C++20 - Take II.)


Code

batcherssort.h

#ifndef IO_GITHUB_CODERODDE_UTIL_BATCHERSSORT_H
#define IO_GITHUB_CODERODDE_UTIL_BATCHERSSORT_H

#include <cmath>
#include <concepts>
#include <utility>

#ifdef _OPENMP
#include <omp.h>
#endif

template <std::totally_ordered T>
void batchersSort(T* begin, T* end) {
    auto n = end - begin;

    if (n < 2) {
        return;
    }

    auto t = (size_t) std::ceil(std::log2(n));
    auto s = 1 << (t - 1);

    for (size_t p = s; p > 0; p /= 2) {
        size_t r = 0;
        size_t d = p;

        for (size_t q = s; q >= p; q /= 2) {

            #ifdef _OPENMP
            #pragma omp parallel for schedule(static)
            #endif 
            for (std::ptrdiff_t k = 0; k < n - static_cast<std::ptrdiff_t>(d); ++k) {
                if ((static_cast<std::size_t>(k) & p) == r) {
                    if (begin[k] > begin[k + d]) {
                        std::swap(begin[k], begin[k + d]);
                    }
                }
            }

            d = q - p;
            r = p;
        }
    }
}

#endif // IO_GITHUB_CODERODDE_UTIL_BATCHERSSORT_H

main.cpp

#include "batcherssort.h"
#include <algorithm>
#include <chrono>
#include <iostream>
#include <limits>
#include <random>

static const size_t N = 100000;

static size_t millis() {
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               std::chrono::system_clock::now().time_since_epoch())
        .count();
}

static void fill(int* arr1, int* arr2) {
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<int> dist(std::numeric_limits<int>::min(),
                                            std::numeric_limits<int>::max());

    for (size_t i = 0; i < N; ++i) {
        int val = dist(gen);
        arr1[i] = val;
        arr2[i] = val;
    }
}

int main()
{
    int* arr1 = new int[N];
    int* arr2 = new int[N];

    fill(arr1, arr2);

    auto ta = millis();
    std::sort(arr1, arr1 + N);
    auto tb = millis();

    std::cout << "std::sort: " << (tb - ta) << " ms.\n";

    ta = millis();
    batchersSort(arr2, arr2 + N);
    tb = millis();

    std::cout << "batchersSort: " << (tb - ta) << " ms.\n";
    std::cout << "Algorithms agree: "
              << std::boolalpha
              << std::equal(arr1, arr1 + N, arr2)
              << "\n";
}

Typical output

std::sort: 11 ms.
batchersSort: 25 ms.
Algorithms agree: true

Critique request

As always, I am eager to hear some constructive commentary.

\$\endgroup\$
2
  • \$\begingroup\$ On what platform did you run your benchmark? I get results more like 1 ms/12 ms on a MacBook Air M4, code compiled with Xcode/clang, optimization level set to "Fastest [-Os]". \$\endgroup\$ Commented Jan 24 at 9:02
  • \$\begingroup\$ x64 Windows 11. Visual Studio 2026. \$\endgroup\$ Commented Jan 24 at 9:18

3 Answers 3

5
\$\begingroup\$

The C++ std::sort function has an optional third argument to provide a custom comparator. You can do the same with your sort function to make it more flexible:

template <class T, typename Comparator = std::less<T> >
void batchersSort(T* begin, T* end, const Comparator comp = Comparator()) {
    // ...

                    if (comp(begin[k+d], begin[k])) {
                        std::swap(begin[k], begin[k + d]);
                    }
    // ...
}

Now you can do things like

std::array a{ 1, 2, 3, 4, 5, 6 };
batchersSort(a.begin(), a.end(), std::greater{});

to sort the elements in descending order.

\$\endgroup\$
2
  • \$\begingroup\$ You shouldn’t use both std::totally_ordered and a comparator. The whole point of the comparator is that you can sort things that are aren’t ordered, totally or otherwise. \$\endgroup\$ Commented Jan 23 at 21:00
  • \$\begingroup\$ @indi: Yes, that makes sense. My C++ is a bit rusty, is it better now? \$\endgroup\$ Commented Jan 23 at 21:08
6
\$\begingroup\$

There are several headers that define std::size_t:

Defined in header <cstddef>
Defined in header <cstdio>
Defined in header <cstdlib>
Defined in header <cstring>
Defined in header <ctime>
Defined in header <cwchar>

However, we haven't included any of these.

Don't rely on any of these additionally defining global-namespace size_t - that's an option to the implementation, not a guarantee to the programmer.

We also need to include <cstddef> for std::ptrdiff_t.

Conversely, we include <omp.h> but don't use any of its definitions.


n, t and s remain constant throughout their scope - I recommend adding const to their declaration/initialisation lines.

I'd probably prefer d to be a std::ptrdiff_t from the start, rather than casting in the loop control expression, where it makes the logic harder to read:

        auto d = static_cast<std::ptrdiff_t>(p);
            for (std::ptrdiff_t k = 0;  k < n - d;  ++k) {

C++20 provides a better way to get the required power of 2, without using floating-point arithmetic:

#if __cpp_lib_int_pow2 >= 202002L
    // #include <bit>
    auto const s = std::bit_ceil(static_cast<std::size_t>(n/2));
#else
    auto const t = (size_t) std::ceil(std::log2(n));
    auto const s = 1uz << (t - 1);
#endif

(note the use of std::size_t constant for s in the fallback case - int is a bad choice there)

\$\endgroup\$
3
  • \$\begingroup\$ Unless I am mistaken, s from the original code is the largest power of 2 less than n, whereas your std::bit_ceil version calculates the smallest power of 2 greater than or equal to n. So a small correction might be necessary. \$\endgroup\$ Commented Jan 23 at 15:55
  • 1
    \$\begingroup\$ I may have misunderstood the original, in which case there are other suitable functions (such as std::bit_floor()). I think that adds weight to my argument that the floating-point logic is hard to comprehend! \$\endgroup\$ Commented Jan 23 at 16:05
  • \$\begingroup\$ I fully agree. – std::bit_floor(static_cast<std::size_t>(n-1)) should produce the same values as the original code. \$\endgroup\$ Commented Jan 23 at 16:30
6
\$\begingroup\$

Flexibility

The biggest opportunity I see, aside from adding the option for a customer comparator, is enabling your sort to work with any iterators over a container of std::totally_ordered values.

By using pointers, you've limited the usefulness of the function to essentially just the dynamically allocated arrays you're using. What if you wanted to sort a std::vector<int>?

Dynamically allocating memory

A bit pedantic, but in your test you've used new to allocate arrays, but there's no accompanying delete [] for each.

    int* arr1 = new int[N];
    int* arr2 = new int[N];

    ...

    delete [] arr1;
    delete [] arr2;

Or use smart pointers which would automatically free their memory on going out of scope.

    auto arr1 = std::make_unique<int[]>(N);
    auto arr2 = std::make_unique<int[]>(N);
\$\endgroup\$
1
  • 1
    \$\begingroup\$ std::vector might not be the best example to use for the flexibility observation, since it's specified to have contiguous storage accessible via its data() member function (excluding std::vector<bool> of course, which is a maverick). \$\endgroup\$ Commented Jan 26 at 8:04

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.