Added the information that . and / should generally work, based on mirabilos' findings. I'd also dispute that the original solution works in dash, as dash doesn't support switching the locale at runtime.

Beware that some shells don't support changing the locale at runtime (even though POSIX requires it).

Solution that should generally work without changing the locale

While the above should work with any byte (except newline or NUL) as the sentinel value, there is a simpler approach that avoids changing the locale:

Using . or / should generally be fine, as POSIX requires:

  • “The encoded values associated with <period>, <slash>, <newline>, and <carriage-return> shall be invariant across all locales supported by the implementation.” This means that these characters have the same binary representation in any locale/encoding.
  • “Likewise, the byte values used to encode <period>, <slash>, <newline>, and <carriage-return> shall not occur as part of any other character in any locale.” This means that the problem described above cannot occur, because in no locale/encoding can one of these bytes complete a partial byte sequence into a valid character. (see 6.1 Portable Character Set)

The above does not apply to other characters of the Portable Character Set.
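As a minimal sketch of what this could look like in practice: the helper below uses . as the sentinel, so the sentinel can be stripped without touching the locale. The function name exact_output_dot and the variables out, ret and rc are assumptions made for this illustration; they are not taken from the original answer.

```sh
# Hypothetical helper (the name and the out/ret variables are illustrative).
# Runs the given command and stores its exact output in $out and its exit
# status in $ret, using "." as the sentinel so no locale change is needed.
exact_output_dot() {
    ret=0
    # The sentinel keeps command substitution from stripping trailing
    # newlines; the subshell exits with the command's own status.
    out=$("$@"; rc=$?; printf .; exit "$rc") || ret=$?
    out=${out%.}    # drop the sentinel again
}

exact_output_dot printf 'a\n\n'
printf 'length=%s status=%s\n' "${#out}" "$ret"
```

Because the byte that encodes . cannot be part of another character, the ${out%.} removal should behave the same whether the shell matches by bytes or by characters.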

Commonmark migration

#About a trailing x.

About a trailing x.

Simplified, general answer.
user232326
#!/bin/bash

f()           { for i in $(seq "$((RANDOM % 3 ))"); do
                    echo;
                done; return $((RANDOM % 256));
              }

exact_output(){ out=$( $1; ret=$?; echo x; exit "$ret" );
                  printf 'Output:%q\nExit :%s\n' "${out%x}" "$?"
               }

exact_output f
echo Done

That is required because trailing newlines are removed by command substitution, per the [POSIX specification][a1]:

For bash 3.1+ or zsh you can use -v:

printf -v s '%s' "$1"

Let's add a byte, an ASCII byte (<127), and, to keep things as simple as possible, let's say an ASCII character in the range a-z. More precisely, a byte in the hex range 0x61 - 0x7a. Let's choose any of those, maybe an x (really a byte of value 0x78). We can add such a byte by concatenating an x to a string (let's assume an é):

However, that is very easy to solve: let's tell those shells to do byte removal:

$ ( LC_ALL=C; echo ${b%$c} ) | od -vAn -tx1c

#!/bin/bash
LC_ALL=zh_HK.big5hkscs

a=$(printf '\210\170');
b=$(printf '\170');

LC_ALL=C command eval 'a="${a%"$b"}"'

printf '%s' "$a" | od -vAn -c

Or you could use the more portable solution:

OldLC=$LC_ALL; LC_ALL=C ; a=${a%"$b"}; LC_ALL=$OldLC

This idea applied to your question:

exact_output(){ out=$( $1; ret=$?; echo x; exit "$ret" ); ret=$?  # save the status before later commands overwrite $?
                OldLC=$LC_ALL; LC_ALL=C ; out=${out%x}; LC_ALL=$OldLC
                printf 'Output:%q\nExit :%s\n' "${out}" "$ret"
              }

That will remove the problem of encoding.

[a1]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_03

#!/bin/bash

f()           { for i in $(seq "$((RANDOM % 3 ))"); do
                    echo;
                done; return $((RANDOM % 256));
              }

exact_output(){ out=$( $1; ret=$?; echo x; exit "$ret" ); ret=$?  # save the status before later commands overwrite $?
                unset OldLC_ALL ; [ "${LC_ALL+set}" ] && OldLC_ALL=$LC_ALL
                LC_ALL=C ; out=${out%x};
                unset LC_ALL ; [ "${OldLC_ALL+set}" ] && LC_ALL=$OldLC_ALL
                printf 'Output:%10q\nExit :%2s\n' "${out}" "$ret"
               }

exact_output f
echo Done

That is required because trailing newlines are removed by command substitution, per the POSIX specification:

Let's add a byte, an ASCII byte (<127), and, to keep things as simple as possible, let's say an ASCII character in the range a-z. More precisely, a byte in the hex range 0x61 - 0x7a. Let's choose any of those, maybe an x (really a byte of value 0x78). We can add such a byte by concatenating an x to a string (let's assume an é):

However, that is very easy to solve: let's tell all those shells to do byte removal:

$ LC_ALL=C; echo ${b%$c} | od -vAn -tx1c

#!/bin/bash

LC_ALL=zh_HK.big5hkscs

a=$(printf '\210\170');
b=$(printf '\170');

unset OldLC_ALL ; [ "${LC_ALL+set}" ] && OldLC_ALL=$LC_ALL
LC_ALL=C ; a=${a%"$b"};
unset LC_ALL ; [ "${OldLC_ALL+set}" ] && LC_ALL=$OldLC_ALL

printf '%s' "$a" | od -vAn -c

That will remove the problem of encoding.
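The save-and-restore steps above could be wrapped in a small helper so that callers never see the temporary locale change. This is only a sketch: the function name strip_sentinel and the variables stripped and old_lc_all are assumptions made for this example, and it only helps in shells that honor a locale change at runtime.

```sh
# Hypothetical helper: strip one trailing sentinel byte ($2) from $1,
# byte-wise. The result is left in $stripped (an illustrative name).
strip_sentinel() {
    stripped=$1
    # Remember whether LC_ALL was set, and its value if so.
    unset old_lc_all; [ "${LC_ALL+set}" ] && old_lc_all=$LC_ALL
    LC_ALL=C; stripped=${stripped%"$2"}
    # Restore the previous state: unset stays unset, set gets its value back.
    unset LC_ALL; [ "${old_lc_all+set}" ] && LC_ALL=$old_lc_all
    return 0
}

strip_sentinel "$(printf 'text\n\nx')" x
printf '%s' "$stripped" | od -vAn -c
```

The explicit return 0 keeps the function's status from reflecting the final [ ... ] test, which would otherwise fail whenever LC_ALL was unset on entry.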

Applied the solution for encoding.
s/LANG/LC_ALL/, typo
Added the detail of why an x works (as most byte values) properly used.
Match OP requested function.
Adapted to new (edited) question.