4
\$\begingroup\$

This simple class allows us to hide a byte[] inside the Scan0 of a Bitmap's BitmapData and later retrieve the byte[] hidden within. Is there any cases where padding would be an issue?

using System.Diagnostics;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public record class IMG(byte[] Data)
{
    const ImageLockMode Read = ImageLockMode.ReadOnly;
    
    const ImageLockMode Write = ImageLockMode.WriteOnly;
    
    const PixelFormat Format = PixelFormat.Format32bppArgb;
     
    // Start Size Approximation

    private readonly int Size = Data.Length >> 2;

    public int Width => GetSize(Math.Sqrt(Size));

    public int Height => GetSize((double)Size / Width);
            
    static int GetSize(double x) => (int)Math.Ceiling(x);

    // End Size Approximation
    
    public Bitmap Build()
    {
        var rect = new Rectangle(0, 0, Width, Height);                                
        var result = new Bitmap(Width, Height, Format);                
        var bitData = result.LockBits(rect, Write, Format);
        Marshal.Copy(Data, 0, bitData.Scan0, Data.Length);
        result.UnlockBits(bitData);
        return result;
    }

    [DebuggerStepThrough, DebuggerHidden]
    public static IMG? Load(Bitmap bitmap)
    {
        ArgumentNullException.ThrowIfNull(bitmap);
        if (bitmap.PixelFormat != Format)
            throw new NotSupportedException(nameof(bitmap));
        var rect = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
        var bitData = bitmap.LockBits(rect, Read, Format);
        var size = bitData.Height * bitData.Stride;
        var result = new byte[size];
        Marshal.Copy(bitData.Scan0, result, 0, size);
        bitmap.UnlockBits(bitData);
        return new(result);                
    }
}
\$\endgroup\$
3
  • 2
    \$\begingroup\$ I cannot answer your question, but I was bothered by the following: You have two functions named GetValue doing something completely different. The name GetValue does not give any clue on what these functions are doing. You must read their implementations in order to understand and also be sure to understand which overload will be called. I would keep only the first one and name it IntCeiling and then get the width with public int Width => IntCeiling(Math.Sqrt(Size));. This adds a lot of clarity. \$\endgroup\$ Commented Oct 19 at 13:58
  • 2
    \$\begingroup\$ Minor: you should be using try..finally between LockBits and UnlockBits to ensure that they are unlocked no matter if an exception occurs in the interceding code. \$\endgroup\$ Commented Oct 19 at 18:55
  • 2
    \$\begingroup\$ Also, don't do any extra math that you don't have to do. Once you've calculated width and height, don't recalculate it: Rectangle rect = new(0, 0, Width, Height); Bitmap result = new(rect.Width, rect.Height, Format); (note using the width and height of the rectangle instead) \$\endgroup\$ Commented Oct 19 at 19:14

2 Answers 2

5
\$\begingroup\$

Is there any cases where padding would be an issue?

Padding is used to make rows of a bitmap a multiple of 4 bytes in size. Since you use a 4 byte per pixel format, all rows are automatically a multiple of 4 bytes in size already, there is no need for padding.

A downside of using a format with an alpha channel is that some image encoders or decoders may discard the RGB information of fully transparent pixels, turning it into for example all zeroes. So for example if you want to round-trip the bitmap through a PNG file, that may not quite work.

If the concern is the bytes of the bitmap past the end of data (as came up in the comments), I would put the length before the data. There may be more than 3 bytes of extraneous data past the "end" to fill out the rectangle (eg take length = 1009, then you will have Width = 32, Height = 32, and 32 * 32 is 1024, so 15 extra bytes in the bitmap that aren't used to store your data). Of course, adjust the calculations of the bitmap size to take into account that more than data.Length bytes are needed.


I don't quite agree with making this a record with a Build method. Essentially the compiler-generated constructor plus Build form a procedure together, with some computed-properties being used as functions.

\$\endgroup\$
4
  • \$\begingroup\$ Updated code, still uses primary constructor for data IMG(byte[] data) and returns data as public ReadOnlyMemory<byte> Data => data. and made the class a non record class \$\endgroup\$ Commented Oct 19 at 15:16
  • \$\begingroup\$ I quess my question about padding is that the input byte array that's being "hidden" copied to the scan0. When the byte array is copied from the scan0 if it would contain extra bytes on the end if the input wasn't length %4 == 0. The pdf that I used as input works as expected. \$\endgroup\$ Commented Oct 19 at 16:24
  • 1
    \$\begingroup\$ You could insert one byte before the data containing length % 4 and use it to calculate the correct length when extracting the data from the bitmap. \$\endgroup\$ Commented Oct 19 at 16:36
  • \$\begingroup\$ @OlivierJacot-Descombes thank you for your suggestion. I ended up using 4 bytes (int32) at the beginning to represent the original size and pad the image data to be multiple of 4 so that it conforms to overall size % 4 == 0 \$\endgroup\$ Commented Oct 20 at 14:03
5
\$\begingroup\$

IMG is not a PascalCase class name as it should be in .NET. Why not HiddenDataImage or something similar?

Likewise, Data should be data (camelCase).

That statement was incorrect. Record types use PascalCase [MSDN]:

When writing positional records, use pascal casing for parameters as they're the public properties of the record.

int Size = Data.Length >> 2; should not be a bit-shift operation. There is no bit manipulation in this class. This should probably be int Size = Data.Length / (BitsPerPixel/8);, where BitsPerPixel is 32. Let the compiler optimize that into a bit shift operation.

Any performance arguments are not valid to me in this code base. If you have the time to hide a byte array in a GDI+ Bitmap, performance is certainly not your primary problem. Benchmarks on my PC could not find a difference between the bitshift and the division. The JITed assembly code can be seen on Sharplap.

The throw new NotSupportedException(nameof(bitmap)) is not helpful. That exception should explain that only 32 bit ARGB images are supported.

Note that the byte[] from Load() is not identical to the stored one. Your bitmap will be a square. Thus, when you originally store 17 bytes, you get back 25 bytes. And those extra bytes may contain garbage.

Is there any cases where padding would be an issue?

Padding in a Bitmap is called Stride. As Microsoft says:

The stride is the width of a single row of pixels (a scan line), rounded up to a four-byte boundary.

So, in your case, as long as you stick with 32 bit ARGB, stride and width are identical. But you have the square padding issue. A solution might be to store the array size in the first 32 bits of the image, adding that as an overhead.

\$\endgroup\$
15
  • \$\begingroup\$ @user555045: sorry, I must have misread that as Width. \$\endgroup\$ Commented Oct 19 at 18:36
  • 1
    \$\begingroup\$ I do like the idea of storing the array size in the first 32 bits. When you have a record and primary constructor you will receive an error message for naming rule violation if Data is named data(CamelCase) \$\endgroup\$ Commented Oct 20 at 7:32
  • 1
    \$\begingroup\$ Fair enough, I’d argue that dismissing loop-scale impact while benchmarking per-instruction is like measuring rainfall with a teaspoon. Microseconds matter when they compound. But hey, Just don’t trip over the nanoseconds trying to prove a point that’s been settled since the Pentium era. Some of us optimize; others rediscover multiplication. I’m glad you’ll benchmark it data beats debate. \$\endgroup\$ Commented Oct 20 at 13:51
  • 1
    \$\begingroup\$ @ShortInt: Any decent compiler will compile x / 4 the same as x >> 2... for unsigned x or if it can prove x is always non-negative. If not, it won't use a div instruction (\@ThomasWeller), it'll use an arithmetic right shift and a correction to add 1 if x was negative (to round toward 0 instead of toward -inf like 2's complement arithmetic shifts do.) godbolt.org/z/nbbd3EPxK shows x86-64 GCC and AArch64 Clang. >>2 is always a single fast instruction, so makes sense if you can't simply use unsigned integer types, especially if it might not get hoisted out of a loop. \$\endgroup\$ Commented Oct 21 at 2:13
  • 1
    \$\begingroup\$ @ThomasWeller: Ah, I see, missed that in skimming. sharplab.io/… shows release-mode JITted asm from C#; they avoid cmov so it's 5 uops vs. 4, and with a longer critical path length (4c with mov-elimination or 5 without like on Ice Lake, vs. 3 cycles for GCC/Clang's test/lea + cmov + shift.) Anyway, even GCC/Clang's good strategy is worse than a simple sar for both speed and I-cache footprint. \$\endgroup\$ Commented Oct 21 at 6:13

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.