Skip to main content
added 966 characters in body
Source Link
Cris Luengo
  • 7k
  • 1
  • 15
  • 37

With imgthis memory being private, you'll have to add pixel access functions. You already have getGrayscale() and setGrayscale(). You could consider, for example, defining unsigned int& operator()(int i, int j, int c). This returns a reference to a value in the image, so you can assign to it, image(4, 8, 0) = 5 would be valid. Of course you could also give it a regular function name, unsigned int& getValue(int i, int j, int c) for example; you'll still be able to assign to it: image.value(4, 8, 0) = 5.

Be very careful not to define a std::vector<std::vector<uint>> though. It is hard to initialize, and very inefficient. Just have a std::vector<uint>, and write your pixel getting/setting functions to compute the index. Outside of your class code you will never know or think about how it's implemented, you will always just use normal indexing. If you make the number of channels a parameter as well, you'll be able to use this class also for gray-scale images, which is what a lot of your code works with anyway.

Though for some advanced algorithms, it is actually really useful to do pointer arithmetic to access a pixel's neighbors. For those algorithms you will want to implement a function that returns a pointer to the first pixel (or to a pixel at a given coordinate), and you will want to have a function that says how many values to skip to get to the next pixel on the left/right (3 in your case) and bottom/top (3 * width in your case).

With img being private, you'll have to add pixel access functions. You already have getGrayscale() and setGrayscale(). You could consider, for example, defining unsigned int& operator()(int i, int j, int c). This returns a reference to a value in the image, so you can assign to it, image(4, 8, 0) = 5 would be valid. Of course you could also give it a regular function name, unsigned int& getValue(int i, int j, int c) for example; you'll still be able to assign to it: image.value(4, 8, 0) = 5.

With this memory being private, you'll have to add pixel access functions. You already have getGrayscale() and setGrayscale(). You could consider, for example, defining unsigned int& operator()(int i, int j, int c). This returns a reference to a value in the image, so you can assign to it, image(4, 8, 0) = 5 would be valid. Of course you could also give it a regular function name, unsigned int& getValue(int i, int j, int c) for example; you'll still be able to assign to it: image.value(4, 8, 0) = 5.

Be very careful not to define a std::vector<std::vector<uint>> though. It is hard to initialize, and very inefficient. Just have a std::vector<uint>, and write your pixel getting/setting functions to compute the index. Outside of your class code you will never know or think about how it's implemented, you will always just use normal indexing. If you make the number of channels a parameter as well, you'll be able to use this class also for gray-scale images, which is what a lot of your code works with anyway.

Though for some advanced algorithms, it is actually really useful to do pointer arithmetic to access a pixel's neighbors. For those algorithms you will want to implement a function that returns a pointer to the first pixel (or to a pixel at a given coordinate), and you will want to have a function that says how many values to skip to get to the next pixel on the left/right (3 in your case) and bottom/top (3 * width in your case).

Source Link
Cris Luengo
  • 7k
  • 1
  • 15
  • 37

You already have two answers discussing #define, using namespace std, compile-time image size, and a bunch of other things. I will avoid repeating those points and discuss some other things:

  1. Documentation: Make sure each function is documented. You'll thank me next year when you don't remember what you meant by thinEdges(). :)

  2. Formatting: Some functions don't have space around them, they are declared on the very next line after the previous function. Empty lines around functions make the file more readable. I found it hard to find where you defined conv2d(), for example.

  3. Images are most often stored as 8-bit or 16-bit unsigned integers, or as floating-point values. The former is how they come off the sensor, and how they're shown to screen. The floating-point representation is useful for increased precision in computations. unsigned int is a 32-bit unsigned integer, and not terribly useful. If you used (signed) int, you could store the result of a Sobel or Laplace filter (which produce negative values). But I would have chosen float because it's so much easier (e.g. no overflow, division by 0 errors, ), and it takes up the same amount of space! You could also parameterize the type by turning the class into a template, but that's a bit more advanced.

  4. height and width are public variables. Someone using the class could change those values, making the object inconsistent and causing out-of-bounds memory access later on. Make these variables private, and add a function to read their values.

  5. What should be class methods and what should be free functions? I am not a big fan of image processing functions being methods to the image class. I think things are clearer when these are free functions that take an image as input. This would remove one reason for the inconsistency in your function calls. For example, out = img.canny() is not as pretty to me as out = canny(img). But when multiple images are involved, the problem becomes more obvious. Your combineImgs(img1, img2) uses *this as the output, whereas other functions use *this as the input. As a free function, it's much more natural to define a function that takes two images as input: out = combineImgs(img1, img2).

    Class methods should be functions that work directly on the object, for example accessors, operators, and functions that modify it (e.g. change the image size).

  6. combineImgs() adds up two images. Why not make this into an operator? You can declare RGBImage operator+(RGBImage const&, RGBImage const&), then you can use out = img1 + img2. This is (IMO) more expressive than combineImgs(), since there are so many different ways to combine images.

  7. Actually, on second reading, it looks like canny() does not use *this at all, it takes an input image as other? Is *this used? If not, why is this a class method?

  8. canny() takes sobelx and sobely kernels as input. This would make the function much harder to use: these kernels are fixed and never change, so you are asking the user to always put in the same values here. On the other hand, the original Canny doesn't use Sobel filters, it uses Gaussian derivatives, so you'd expect a sigma parameter for the Gaussian filter to be a parameter to canny().

  9. The Canny filter has several stages: derivatives (you've got the conv2d() call here), non-maximum suppression, hysteresis threshold, and thinning (I don't see you using the thinning, this is something you could add). All three of these steps are useful outside the Canny filter. If you make these into separate functions, then the canny() function will be shorter and easier to read, and you'll have added functionality to your library! For example, this is how I implemented Canny in DIPlib, notice how the largest part of that function is the logic to compute thresholds from the inputs provided by the user.

  10. Every time you do a triple loop (over i/x, j/y and d/channel) and apply the same operation to the value inside, you could simply use a single loop. unsigned int img[RGB_H][RGB_W][RGB_D] declares a contiguous memory block, you can simply loop over each integer using a single loop.

  11. R_COEFF and the other two values are only used in one place. There is one function that converts RGB to grayscale. Declare these constants in that function, so that (1) it's easy to find their definition when you're reading the function, and (2) it's not as hard to find a name (being a global variable it must be unique within the file, inside the function there's a much smaller scope so name clashes are less likely). This becomes increasingly important as your project grows in size.

  12. There's a bug in your code: if(i < 0 || j < 0 || i > RGB_H || i > RGB_W) doesn't do the check properly. If i == RGB_W, then you'll be writing out of bounds. It should be if(i < 0 || j < 0 || i >= RGB_H || i >= RGB_W). But instead of using the hard-coded constants, you should be using the height and width variables here. At some point you'll improve your code to have dynamic image sizes, and it'll be a lot more work then to change the code to use your size variables instead of the constants.

  13. Another bug: getGrayscale() returns an int instead of an unsigned int.


Others have suggested the use of std::vector<> to store the image data. If you do that, it must be a private member, in the same way that your image sizes must be private members. You'll have a class invariance that is width * height * 3 == img.size(), and you need to be able to guarantee that this is always true. Making these variables private is the only way you can do that.

With img being private, you'll have to add pixel access functions. You already have getGrayscale() and setGrayscale(). You could consider, for example, defining unsigned int& operator()(int i, int j, int c). This returns a reference to a value in the image, so you can assign to it, image(4, 8, 0) = 5 would be valid. Of course you could also give it a regular function name, unsigned int& getValue(int i, int j, int c) for example; you'll still be able to assign to it: image.value(4, 8, 0) = 5.