
I instructed my terminal emulator to send "\u{85}" (C1 control character "NEXT LINE" (NEL)) down the pipe to be caught by bash:

bind -x '"\u0085":"echo Hello"'

But this doesn't trigger anything. So my first question is: how do I represent Unicode code points for bind -x? Strangely enough, bash does pick up the octal representation:

bind -x '"\205":"echo Hello"'

In any case, the above creates weird artifacts in the shell:

Hello
 $ �Hello
Hello
 $ �

I figured the control character probably has some side effects. So my second question is: which Unicode range can I safely repurpose for my own bindings?

1 Answer


readline does care about character encoding when editing: for instance, one stroke of Backspace deletes both the 0xc3 and 0xa9 bytes in a UTF-8 locale (where they are the encoding of é), but only 0xa9 in an ISO8859-1 locale, where 0xc3 0xa9 is the encoding of the two characters Ã and ©. For its key bindings, however, it doesn't.

The bindings bind arrays of bytes.

The syntax for specifying those bytes is described in info -n 'Readline Init File Syntax' readline (bash's bind just passes readline init file instructions on to readline).

So options to specify those bytes are:

  • just enter them as is, literally
  • as \ooo with their octal value
  • as \xhh with their hexadecimal value
  • for bytes 0 to 31 and 127 as \C-X with X in @, A..Z, [, \, ], ^, _, ?
  • for bytes 128 to 159 and 255 as \M-\C-X with same characters as above.
  • for bytes 160 to 254 as \M-X with X ranging from space to ~, the ASCII printable characters.
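The "\205" that worked in the question is just the octal spelling of byte 0x85. As a rough sketch, bash's printf builtin can convert hexadecimal byte values into the \ooo notation; to_octal_escapes is a hypothetical helper name chosen for illustration, not something readline or bash provides.

```shell
# Hypothetical helper: print each byte value given as an argument
# as a three-digit \ooo octal escape, all on one line.
to_octal_escapes() {
    printf '\\%03o' "$@"
    echo
}

to_octal_escapes 0x85        # prints \205
to_octal_escapes 0xc2 0x85   # prints \302\205
```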

The U+0085 character is encoded as 0x85 in iso8859-x charsets, as 0xc2 0x85 in UTF-8, and as 0x81 0x30 0x81 0x35 in GB18030. If you want to bind it, you'll need to know in which charmap your terminal sends that character.
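One way to check which bytes actually arrive is to dump them in hexadecimal. The following is a minimal sketch using od; show_bytes is a hypothetical helper name, and the printf line only simulates a terminal that sends U+0085 as UTF-8.

```shell
# Hypothetical helper: print the bytes read from stdin as
# space-separated hexadecimal values on a single line.
show_bytes() {
    od -An -tx1 | xargs
}

# Simulate a terminal sending U+0085 encoded as UTF-8.
printf '\302\205' | show_bytes   # prints: c2 85
```

Run interactively, show_bytes reads from the terminal: press the key, then Enter and Ctrl-D. The output then also contains 0a for the newline.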

Your weird artefacts suggest it's UTF-8 encoded: with the "\205" binding, the echo Hello command runs on the second byte (0x85) of the UTF-8 encoding of U+0085, while the first byte (0xc2) is inserted as-is for display. Your terminal renders that lone byte as the replacement character because it is an invalid encoding on its own.

Then, you'd need one of:

bind -x '"\302\205": "echo hello"'
bind -x '"\xc2\x85": "echo hello"'
bind -x '"\M-B\M-\C-E": "echo hello"'
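The third form can be checked with a little arithmetic, following the byte ranges listed above: Meta sets the high bit (0x80), and Control keeps only the low five bits of the character. The sketch below assumes exactly that mapping; meta_byte and meta_ctrl_byte are hypothetical helper names, and the special case \C-? (byte 127) is not handled.

```shell
# Hypothetical helper: byte produced by \M-X, as two hex digits.
meta_byte() {
    local code
    code=$(printf '%d' "'$1")    # numeric value of the ASCII character
    printf '%02x\n' $(( code | 0x80 ))
}

# Hypothetical helper: byte produced by \M-\C-X, as two hex digits.
# Does not handle the special case X = ? (byte 127 / 255).
meta_ctrl_byte() {
    local code
    code=$(printf '%d' "'$1")
    printf '%02x\n' $(( (code & 0x1f) | 0x80 ))
}

meta_byte B        # prints c2
meta_ctrl_byte E   # prints 85
```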

Or send the bytes literally, either by entering that (control) character literally inside the first pair of "...", or by using the Korn-style $'...' quotes. Inside those you can also use \ooo or \xhh or, in bash 4.2 or newer, the \uhhhh or \Uhhhhhhhh notations from zsh:

bind -x $'"\u0085": "echo hello"'

In bash, that \u0085 is encoded in the charmap of the locale in effect at the time that code was read (not when it is run, as in zsh). If you don't change the locale midway through your ~/.bashrc, that won't make a difference.
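For a startup file shared between machines with different bash versions, one option is to guard the \u form behind a version check. This is only a sketch: bash_supports_u_escape and the spec variable are hypothetical names, and the fallback assumes the terminal sends UTF-8.

```shell
# Hypothetical helper (not part of bash): succeeds when the given bash
# version is 4.2 or newer. Defaults to the running shell's version.
bash_supports_u_escape() {
    local major=${1:-${BASH_VERSINFO[0]}}
    local minor=${2:-${BASH_VERSINFO[1]}}
    (( major > 4 || (major == 4 && minor >= 2) ))
}

if bash_supports_u_escape; then
    # Expanded by bash when this line is read, using the current locale's charmap.
    spec=$'"\u0085": "echo hello"'
else
    # Left for readline to interpret; assumes the terminal sends UTF-8.
    spec='"\302\205": "echo hello"'
fi

# bind needs line editing, so skip it in non-interactive shells.
if [[ $- == *i* ]]; then
    bind -x "$spec"
fi
```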

  • Thanks for this comprehensive answer. There's a lot for me to unpack here. First, I didn't know that terminals could send charsets other than UTF-8. Also, do you know by any chance since when bash has supported the Korn-style $'...' syntax? I think our servers still use bash 4.4. Binding \xc2\x85 worked, btw!
    – glades
    Commented Mar 29 at 18:29
  • @glades, see edit. Commented Mar 29 at 18:42
