Skip to main content
added 608 characters in body
Source Link
G. Sliepen
  • 69.5k
  • 3
  • 75
  • 180
  • Create a class Histogram that hides how the histogram is actually stored internally, and provides a uniform way to access that data.
  • Always create a std::vector out of the returned histogram, since the line that declared probabilityVec would work just as well if probabilities was a std::array, at least if you let it autodeduce the value type:
    auto probabilityVec = std::vector(std::ranges::begin(probabilities), std::ranges::end(probabilities));
    
    Or even simpler:
    auto probabilityVec = probabilities | std::ranges::to<std::vector>();
    
    To ensure you get an index when applying this on a std::array, you could make your own adapter:
    auto probabilityVec = probabilities | add_index_if_missing | std::ranges::to<std::vector>();
    
    Where add_index_if_missing is a range adapter you have to write that applies std::views::enumerate if necessary.
  • Realizing the above, note that instead of iterating over the histogram using an integer index and [], you can use actual iterators. This means you'll have to write code like this:
    for (auto i = std::begin(probabilities); i != std::end(probabilities); ++i) {
        …
        for (auto j = std::begin(probabilities); j != i; ++j) {
            w_background += j->second;
            m_background += j->first * j->second;
        }
        …
    }
    
    
  • Create a class Histogram that hides how the histogram is actually stored internally, and provides a uniform way to access that data.
  • Always create a std::vector out of the returned histogram, since the line that declared probabilityVec would work just as well if probabilities was a std::array.
  • Realizing the above, note that instead of iterating over the histogram using an integer index and [], you can use actual iterators. This means you'll have to write code like this:
    for (auto i = std::begin(probabilities); i != std::end(probabilities); ++i) {
        …
        for (auto j = std::begin(probabilities); j != i; ++j) {
            w_background += j->second;
            m_background += j->first * j->second;
        }
        …
    }
    
    
  • Create a class Histogram that hides how the histogram is actually stored internally, and provides a uniform way to access that data.
  • Always create a std::vector out of the returned histogram, since the line that declared probabilityVec would work just as well if probabilities was a std::array, at least if you let it autodeduce the value type:
    auto probabilityVec = std::vector(std::ranges::begin(probabilities), std::ranges::end(probabilities));
    
    Or even simpler:
    auto probabilityVec = probabilities | std::ranges::to<std::vector>();
    
    To ensure you get an index when applying this on a std::array, you could make your own adapter:
    auto probabilityVec = probabilities | add_index_if_missing | std::ranges::to<std::vector>();
    
    Where add_index_if_missing is a range adapter you have to write that applies std::views::enumerate if necessary.
  • Realizing the above, note that instead of iterating over the histogram using an integer index and [], you can use actual iterators. This means you'll have to write code like this:
    for (auto i = std::begin(probabilities); i != std::end(probabilities); ++i) {
        …
        for (auto j = std::begin(probabilities); j != i; ++j) {
            w_background += j->second;
            m_background += j->first * j->second;
        }
        …
    }
    
    
Source Link
G. Sliepen
  • 69.5k
  • 3
  • 75
  • 180

Avoid code duplication

There's duplicated code in otsu_threshold() because normalize_histogram() returns completely different types depending on ElementT. There are several ways to avoid this:

  • Create a class Histogram that hides how the histogram is actually stored internally, and provides a uniform way to access that data.
  • Always create a std::vector out of the returned histogram, since the line that declared probabilityVec would work just as well if probabilities was a std::array.
  • Realizing the above, note that instead of iterating over the histogram using an integer index and [], you can use actual iterators. This means you'll have to write code like this:
    for (auto i = std::begin(probabilities); i != std::end(probabilities); ++i) {
        …
        for (auto j = std::begin(probabilities); j != i; ++j) {
            w_background += j->second;
            m_background += j->first * j->second;
        }
        …
    }
    
    

What about non-integral element types?

Your function only deals with images with integer pixel values. However, what if ElementT is float? If you think that floating point images would result in huge histograms, please realize that std::uint32_t has just as many possible values as a float does. I think your code will handle floats just fine if you just remove the constraint.

But this also brings me to the following:

Consider that the histogram might be huge

Your algorithm is \$O(N^2)\$ where \$N\$ is the size of the histogram. For 8-bit images, that's totally fine. For 16-bit images, it will execute the code in the inner loops \$2^32\$ times, regardless of the image size.

You can save some time by realizing that you don't need to recompute all of w_background/w_foreground/etc each time, as only one element is added/removed from the sums each iteration of the outer loop.

For 32-bit images, your code might actually be more or less efficient, depending on the number of distinct pixel values. Still, you must consider that the size of the histogram might be equal to the total number of pixels in the image, if each pixel has a unique value.

There several possible approaches you can take for dealing with huge histograms:

  • Just limit the number of histogram bins. This might result in a slightly optimal threshold value.
  • Use a heuristic or bisection to find the right value in less amount of time.