How to convert a Uint8Array into a boolean (bit) array without using strings?

Question

I'm using javascript, and I'm little rusty on my bit arithmetic.

Ultimately, my goal is to convert a UInt8Array into 11-bit numbers for use with a bip39 wordlist for converting a libsodium private box key to a mnemonic (I'm building a small p2p-ish chat app).

So, my thought process is:

Uint8Array is returned from libsodium.crypto_box_keypair()
Convert Uint8Array into a 256bit (boolean) array
divide the 256bit array into 11bit buckets (2d array: ~24 x 11bits)
convert each 11bit array to a base 10 number (between 0 and 2047)

Steps 2, 3, and 4 can be combined into the same loop.

The goal of all this is to efficiently convert from a Uint8Array to an array of 11bit numbers (efficient for the computer -- this hasn't been efficient for myself).

I have this at the moment, but it isn't quite right, and feels somewhat hacky (just from steps 2 and 3, where I try to create the 11bit buckets)

// inspired from: https://github.com/pvorb/node-md5/issues/25
export function toUint11Array(input: Uint8Array): boolean[][] {
  let result: boolean[][] = [];
  let currentChunk: boolean[] = [];

  input.forEach(byte => {
    for (var j = 7; j >= 0; j--) {
      var b = ((byte >> j) & 0x1) > 0;

      if (currentChunk.length === 11) {
        result.push(currentChunk);
        currentChunk = [];
      }

      currentChunk.push(b);
    }
  });

  return result;
}

Currently, for 2048, I get 2 11 bit arrays (expected), but the content / order is unexpected.

  [
    false, false, false, false,
    false, false, false, false,
    false, false, false
  ],
  [ false, true, false, false,
    false, false, false, false, 
    false, false, false
  ]

2048 is 0b100_000_000_000

where the 12th digit from the right is the 1 (added underscores for easier reading)

so maybe it looks like I have an endianness problem and and off by one issue? because the true in my dual array is the 13th position from the left.

though, when I test with 4096 (13 bits (0b1_000_000_000_000)), I get this:

  [
    false, false, false, false,
    false, false, false, false,
    false, false, false
  ],
  [
    true, false, false, false,
    false, false, false, false,
    false, false, false
  ],
  [
    false, false, false, false,
    false, false, false, false,
    false, false, false
  ]

Here, the true is 12th from the left, and 22nd from the right.

Update

per @bergi, who asked about endianness.

I don't know what endianness this is. :-\

Update 2

Thanks to @harold for coming up with the answer. I have some tests that I think confirm correctness.

  const numbers = {
    ['32']: new Uint8Array([32]),
    ['64']: new Uint8Array([64]),
    ['2048']: new Uint8Array([8, 0]),
    ['4096']: new Uint8Array([16, 0]),
    ['7331']: new Uint8Array([28, 163])
  }

  test ('toUint11Array | converts | 32 (8 bits)', function(assert) {
    const result = toUint11Array(numbers['32']);
    const expected = [32];

    assert.deepEqual(result, expected);
  });

  test ('toUint11Array | converts | 2048 (12 bits)', function(assert) {
    const result = toUint11Array(numbers['2048']);
    const expected = [8, 0];

    assert.deepEqual(result, expected);
  });

  test ('toUint11Array | converts | 4096 (13 bits)', function(assert) {
    const result = toUint11Array(numbers['4096']);
    const expected = [16, 0];

    assert.deepEqual(result, expected);
  });



 test ('toUint11Array | converts | 7331 (13 bits)', function(assert) {
    const result = toUint11Array(numbers['7331']);
    const expected = [3, 1187];

    assert.deepEqual(result, expected);
  });

the first 3 pass, but the last does not.

when converting a Uint8Array(28, 163), I get [796, 28]

I'm not 100% sure that I converted 7331 into appropriate bytes correctly, but I did: 7331 = 0b1_1100_1010_0011 split: [1_1100, 1010_0011] -> [28, 163].

I suppose for the output, it should be: [11, 100_1010_0011] which is [3, 1187] which also doesn't match the output.

"Steps 2, 3, and 4 can be combined into the same loop" - I would recommend to use generator functions here, they should simplify the code a lot. — Bergi
– Bergi, Commented May 10, 2018 at 21:37
@Bergi yes, it's a 32 Byte Uint8Array Not sure about endianness. I'd like to stick with what is being used by default (I'll post a screenshot from the console). Generator Funciton? how so? I haven't actually written one before? (Just used them with redux sagas, and ember-concurrency) — NullVoxPopuli
– NullVoxPopuli, Commented May 10, 2018 at 22:25
There is no "default endianness", and your screenshot just shows a bunch of numbers. — Bergi
– Bergi, Commented May 11, 2018 at 11:44

user555045 · Accepted Answer · 2018-05-11 05:31:58Z

2

I'm not as rusty on my bit arithmetic, so I propose a method without temporary boolean arrays:

read bytes into a buffer until there are at least 11 bits
extract 11 bits from the buffer
repeat

Actually a funny thing happens at the end since 11 does not divide 256, I assume padding with zeroes is OK.

So maybe something like this in JavaScript (but I am a little rusty on my JS)

function toUint11Array(input) {
    var buffer = 0, numbits = 0;
    var output = [];

    for (var i = 0; i < input.length; i++) {
        // prepend bits to buffer
        buffer |= input[i] << numbits;
        numbits += 8;
        // if there are enough bits, extract 11bit chunk
        if (numbits >= 11) {
            output.push(buffer & 0x7FF);
            // drop chunk from buffer
            buffer = buffer >> 11;
            numbits -= 11;
        }
    }
    // also output leftover bits
    if (numbits != 0)
        output.push(buffer);

    return output;
}

answered May 11, 2018 at 5:31

user555045

65.8k6 gold badges95 silver badges181 bronze badges

Sign up to request clarification or add additional context in comments.

9 Comments

NullVoxPopuli Over a year ago

cool, thanks for this! I need to write a couple tests, and write the inverse of this to make sure conversion is working in both directions.

NullVoxPopuli Over a year ago

actually, Assuming I did math right, I think this works. (I haven't made the inverse yet, but I just made some Uint8 to Uint11 tests. I'll update my question with results.

NullVoxPopuli Over a year ago

I found a scenario where toUint11Array fails (I think) for 7331 (but as a Uint8Array) see my Update 2. I tried doing some google-assisted decimal -> binary -> decimal conversions, that could be where my error is in my tests. unsure atm.

user555045 Over a year ago

@NullVoxPopuli are you sure about the expected result? I get [1187, 3] which makes sense, the 3 part are the high bits

NullVoxPopuli Over a year ago

what are you using for your input array? I manually got [1187, 3], but my test isn't getting that.

|

Bergi · Accepted Answer · 2018-05-11 13:31:24Z

1

I'd go for generators:

function* bitsFromOctetsLE(octets) {
    for (let byte of octets) {
        for (let i=0; i<8; i++) {
            yield (byte & 1);
            byte >>= 1;
        }
    }
}
function* hendecadsFromBitsLE(bits) {
    let i=0;
    let val=0;
    for (const bit of bits) {
        if (i==11) {
            yield val;
            i = val = 0;
        }
        val |= bit << (i++);
    }
    if (i > 0) yield val;
}

Then use them like

const input = libsodium.crypto_box_keypair();
const result = Array.from(hendecadsFromBitsLE(bitsFromOctetsLE(input)))

edited May 11, 2018 at 13:31

answered May 11, 2018 at 13:21

Bergi

671k162 gold badges1k silver badges1.5k bronze badges

6 Comments

NullVoxPopuli Over a year ago

what's the advantage of yielding val and byte & 1?

NullVoxPopuli Over a year ago

oh I see now, when yielded, the other function evaluates. but there is no final return value?

Bergi Over a year ago

Advantage over what?

Bergi Over a year ago

@NullVoxPopuli The return value of a generator function call is a generator that produces the yielded values. Array.from will collect them into an array in the end.

NullVoxPopuli Over a year ago

ok, I figured it out, Array.from handles the yield conversion for me. But also, it looked like I was consistently missing the last section of data. so in hendecadsFromBitsLE, I added a yield val; at the end of the function, and now your answer yields the same results as @harold's answer. :-) Now I just need to figure out what's wrong with the 7331 test.

|

Collectives™ on Stack Overflow

How to convert a Uint8Array into a boolean (bit) array without using strings?

Update

Update 2

2 Answers 2

9 Comments

6 Comments

Hot Network Questions

Collectives™ on Stack Overflow

Update

Update 2

2 Answers 2

9 Comments

6 Comments

Related