How to read binary data including zero bytes using BASH builtin read?

I found that bash ignores binary zeros on input when reading with the read builtin command. Is there a way around that?

The task is to read from a pipe that delivers binary data in chunks of 12 bytes, i.e. two 16-bit ints and two 32-bit ints. The data rate is low, so performance is not an issue. Since bash variables are C-style strings, the obvious read -N 12 struct does not work: bytes beyond a NUL are not accessible. So I figured I need to read the data byte by byte, using read -N 1 byte. Two problems are easy to fix: backslash escapes (requires -r) and UTF multi-byte character encoding (export LC_ALL=C). The problem I have been unable to solve so far is how to deal with zero bytes. I thought they would show up as an empty variable byte, but in fact read -r -N 1 byte does not return at all on a zero byte; it ignores the zero and returns with the next non-zero byte in the data stream.

This is what I was attempting to do, which, as long as no zero comes in, works without flaws:

export LC_ALL=C

while true; do
    for ((index = 0; index < 12; index++)); do
        read -r -N 1 byte
        if [ -n "${byte}" ]; then
            struct[${index}]=$(echo -n "${byte}" | od -An -td1)
        else
            struct[${index}]=0
        fi
    done
    ... # some arithmetics reconstructing the four bitfields and processing them
done < pipe

It turns out the else branch of the if is never taken. A 12-byte chunk that includes a zero does not make the for loop run 12 times; instead, the loop waits for more data to fill the struct array. I demonstrated the behaviour by feeding the pipe 12 bytes with the command

echo -en "ABCDE\tGH\0JKL" > pipe

Since it is so easy to fool oneself with this, I verified the sending of zeros with

~# mkfifo pipe
~# od -An -td1 </root/pipe &
[1] 25512
~# echo -en "ABCDE\tGH\0JKL" > pipe
~#    65   66   67   68   69    9   71   72    0   74   75   76

[1]+  Done                    od -An -td1 < /root/pipe
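
The read side can be checked the same way without a FIFO. The following is a minimal sketch that feeds the same 12 test bytes to the byte-wise read and counts how often it succeeds; the variable name count is only illustrative. If read drops the NUL as described above, the count comes out as 11 rather than 12.

```bash
#!/usr/bin/env bash
export LC_ALL=C

# Count how many times "read -r -N 1" succeeds on the 12-byte test string.
count=0
while read -r -N 1 byte; do
    count=$((count + 1))
done < <(printf 'ABCDE\tGH\0JKL')

echo "successful reads: ${count}"
```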

Is there a way to change this behaviour of bash? Or how else can the zero bytes be read?
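
One possible direction, sketched under stated assumptions rather than as a confirmed fix, is to keep the raw bytes out of bash variables altogether: let dd take exactly 12 bytes from the pipe and let od turn them into decimal text before bash sees them. The helper names read_struct and decode_struct and the field names f16a, f16b, f32a, f32b are hypothetical, and the field order (two 16-bit, then two 32-bit), little-endian byte order, and unsigned values are assumptions that the description above does not settle.

```bash
#!/usr/bin/env bash
export LC_ALL=C

# read_struct (hypothetical helper): take exactly 12 bytes from stdin and
# store them as unsigned decimal numbers in the global array "struct".
read_struct() {
    # dd bs=1 count=12 issues 12 one-byte reads, so it never reads past the chunk.
    # od -v -tu1 prints every byte, NUL included, as an unsigned decimal.
    struct=( $(dd bs=1 count=12 2>/dev/null | od -An -v -tu1) )
    (( ${#struct[@]} == 12 ))   # false on EOF or on a short chunk
}

# decode_struct (hypothetical helper): rebuild the four fields.
# ASSUMPTION: two 16-bit fields followed by two 32-bit fields,
# all little-endian and unsigned.
decode_struct() {
    f16a=$(( struct[0] | struct[1] << 8 ))
    f16b=$(( struct[2] | struct[3] << 8 ))
    f32a=$(( struct[4] | struct[5] << 8 | struct[6] << 16 | struct[7] << 24 ))
    f32b=$(( struct[8] | struct[9] << 8 | struct[10] << 16 | struct[11] << 24 ))
}

# Demo on the 12 test bytes from above; for the real data the loop would
# end with "done < pipe" instead.
while read_struct; do
    decode_struct
    echo "${struct[*]}"
    echo "${f16a} ${f16b} ${f32a} ${f32b}"
done < <(printf 'ABCDE\tGH\0JKL')
```

Note that the sketch uses od -tu1 where the loop above uses -td1: with -td1, bytes of 128 and above are printed as negative numbers, which would break the shift-and-or arithmetic.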