
I'm trying to translate my bash scripts using the gettext tools but I have a problem where the encoding seems to be wrong.

Let's say I have the following file called fr.po:

# French translations for my-package package
# Traductions françaises du paquet my-package.
# Copyright (C) 2025 THE my-package'S COPYRIGHT HOLDER
# This file is distributed under the same license as the my-package package.
# Automatically generated, 2025.
#
msgid ""
msgstr ""
"Project-Id-Version: my-package v0.0.1\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-11-25 20:36-0500\n"
"PO-Revision-Date: 2025-11-25 17:58-0500\n"
"Last-Translator: Automatically generated\n"
"Language-Team: none\n"
"Language: fr\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n > 1);\n"

msgid "test-message"
msgstr "a à e é è ê ë i î ï o ô ö u ù û ü c ç n ñ"

Then I execute the following:

file --mime ./fr.po # output: ./fr.po: text/x-po; charset=utf-8
msgfmt --output-file='/usr/share/locale/fr/LC_MESSAGES/my-test.mo' ./fr.po

export TEXTDOMAINDIR=/usr/share/locale
export TEXTDOMAIN=my-test
export LANG=fr_CA.UTF-8
export LC_ALL=fr_CA.UTF-8

# The following command works as intended and prints this:
# a à e é è ê ë i î ï o ô ö u ù û ü c ç n ñ
gettext test-message


# However if I use the same command within a string or in a pipeline I get this:
# a □ e □ □ □ □ i □ □ o □ □ u □ □ □ c □ n
printf '%s\n' "$(gettext test-message)"
echo "$(gettext test-message)"
gettext test-message | cat
cat <(gettext test-message)

####

gettext test-message > out.txt
cat out.txt # output: a □ e □ □ □ □ i □ □ o □ □ u □ □ □ c □ n
file --mime out.txt # output: out.txt: text/plain; charset=iso-8859-1

As you can see in the last 3 lines above, gettext seems to encode my message in ISO-8859-1 which is not what I want.
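
One way to double-check which encoding is actually coming out is to look at the raw bytes rather than at what the terminal renders. Below is a minimal sketch; show_bytes is a hypothetical helper name, not part of gettext, and it only assumes that od, tr and sed are available (they normally are in git-bash). In UTF-8, à is the two bytes c3 a0, whereas in ISO-8859-1 it is the single byte e0.

```shell
# show_bytes: hypothetical helper (not part of gettext).
# Reads stdin and prints it as space-separated hex bytes on one line.
show_bytes() {
    # -An drops the offset column, -tx1 prints one hex byte per token,
    # -v keeps od from collapsing repeated lines into "*".
    printf '%s\n' "$(od -An -v -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//')"
}

# Intended use (requires the my-test catalog from above):
#   gettext test-message | show_bytes
```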

How can I force gettext to give me my message in UTF-8? Or how can I work around this issue?

I tried changing the terminal encoding with chcp.com 65001, but it didn't change anything. I also tried placing the .mo file in /usr/share/locale/fr.utf-8/..., but to no avail.

I also saw this question, which seems awfully close to my problem, but I couldn't find any equivalent of bind_textdomain_codeset that I could call from a bash script.


Here's the content of my-test.mo in UTF-8:

ޒ          ,      <       H      I   3  V   7                test-message Project-Id-Version: my-package v0.0.1
Report-Msgid-Bugs-To: 
PO-Revision-Date: 2025-11-25 17:58-0500
Last-Translator: Automatically generated
Language-Team: none
Language: fr
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Plural-Forms: nplurals=2; plural=(n > 1);
 a à e é è ê ë i î ï o ô ö u ù û ü c ç n ñ  

Update

I found a workaround using iconv. For example:

printf '%s\n' "$(gettext test-message | iconv -f iso-8859-1 -t utf-8)"

However, I suspect this workaround should only be used when the script is executed with git-bash (Windows), since the problem doesn't exist on Linux. On a Linux system (and probably WSL), the output of gettext is already in UTF-8, so converting it again would probably result in an error or garbled output.

So I'm still looking for a better alternative where I wouldn't have to wrap gettext in a function that checks which OS the script is being executed on.
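
For what it's worth, one way such a wrapper could avoid checking the OS at all is to check the bytes instead: pass the output through unchanged when it is already valid UTF-8, and convert it otherwise. This is only a sketch. ensure_utf8 and gettext_utf8 are hypothetical names, not part of gettext, and the fallback assumes the non-UTF-8 output really is ISO-8859-1 (on Windows it could just as well be a code page such as CP1252).

```shell
# ensure_utf8: hypothetical helper (not part of gettext).
# Reads stdin and writes UTF-8 to stdout. Input that already decodes as
# UTF-8 is passed through unchanged; anything else is ASSUMED to be
# ISO-8859-1 and is converted with iconv.
ensure_utf8() {
    local tmp
    tmp=$(mktemp) || return 1
    # Buffer the input in a file so it can be read twice.
    cat > "$tmp"
    if iconv -f UTF-8 -t UTF-8 "$tmp" > /dev/null 2>&1; then
        # Decoding as UTF-8 succeeded: emit the bytes as they are.
        cat "$tmp"
    else
        # Not valid UTF-8: fall back to the assumed source encoding.
        iconv -f ISO-8859-1 -t UTF-8 "$tmp"
    fi
    rm -f "$tmp"
}

# gettext_utf8: hypothetical wrapper, same arguments as gettext.
gettext_utf8() {
    gettext "$@" | ensure_utf8
}
```

With this, printf '%s\n' "$(gettext_utf8 test-message)" should behave the same on Linux and in git-bash. The obvious weakness is that ISO-8859-1 output which happens to also be valid UTF-8 would be passed through unconverted, which is unlikely for real text but not impossible.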

  • Your example worked for me as expected in all the test cases (but I had to put sudo before msgfmt). As far as I'm aware, I also haven't set any other environment variables affecting my locale, and all entries printed by locale show the value fr_CA.UTF-8. Using Bash 5.3.3 and gettext-tools 0.26. Commented Nov 26 at 19:06
  • @pmf I think my problem only occurs on Windows (git-bash), and I assume you're on Linux or WSL since you had to use sudo, which doesn't work in git-bash, or at least not out of the box. Also, I have the latest version of Git, which comes with bash 5.2.37. Commented Nov 26 at 19:12
  • Right, I must have overlooked the windows and git-bash tags, sorry. Commented Nov 26 at 19:15
  • What is the content of my-test.mo? Commented Nov 26 at 20:59
  • @Philippe It's the output of msgfmt. I added it to the question. Commented Nov 26 at 21:47
