If you pack files with Cyrillic filenames (e.g. день) into a ZIP archive on Windows and then unpack it on a Mac with the standard Archive Utility, the filenames often come out in the wrong encoding. For example: бвгѓ•≠м
Here is a bash script that renames them to the correct ones:
function rename() {
    tr '†°Ґ£§•с¶І®©™Ђђ≠а-р' 'а-еёж-нр-яЁ' <<< "$1" | sed $'s/Г\xcc\x81/о/g;s/у\xcc\x81/п/g;s/ш\xcc\x86/щ/g'
}
function renamefile() {
    local new="$(rename "$2")"
    if [[ "$2" != "$new" ]]; then
        mv "$1/$2" "$1/$new"
        echo "$new"
    fi
}
function scan() {
    ls -1 "$1" | while read file; do
        if [ -d "$1/$file" ]; then
            scan "$1/$file"
        fi
        renamefile "$1" "$file"
    done
}
scan "${1-.}"
Usage:
<script> <dir_with_files_with_wrong_filenames>
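Before pointing the script at real data, it can help to preview what it would do. The following is only a minimal dry-run sketch, not part of the original script: `plan_renames` and its `MAPPER` argument are hypothetical names, and the mapper is assumed to be a function like `rename` above that prints the corrected name for one filename. It visits the same non-hidden entries as `scan`, but it uses a glob instead of parsing `ls` output and, unlike `scan`, does not descend into symlinked directories.

```shell
#!/usr/bin/env bash
# Dry run: print "old -> new" for every entry that WOULD be renamed.
# plan_renames DIR MAPPER
#   DIR    - directory to walk
#   MAPPER - name of a function that maps one filename to its fixed form
#            (hypothetical parameter; for the script above it would be `rename`)
plan_renames() {
    local dir="$1" mapper="$2" path name new
    # A glob keeps names with spaces or backslashes intact.
    for path in "$dir"/*; do
        # An empty directory leaves the literal pattern behind; skip it.
        [ -e "$path" ] || [ -L "$path" ] || continue
        name="${path##*/}"
        # Recurse into real directories only, never through symlinks.
        if [ -d "$path" ] && [ ! -L "$path" ]; then
            plan_renames "$path" "$mapper"
        fi
        new="$("$mapper" "$name")"
        if [ "$name" != "$new" ]; then
            printf '%s -> %s\n' "$path" "$dir/$new"
        fi
    done
}
```

For example, `plan_renames ~/Downloads/unpacked rename` (the path is only an illustration) would list the planned changes without touching any file.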
However, some users complained:
1. You can't run it twice - the names will be corrupted again.
2. I threw the script into the Downloads directory, launched it, but for some reason it started renaming from the root directory instead - and corrupted filenames EVERYWHERE.
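For the first complaint, one possible guard is a marker file that records that a directory has already been converted. This is just a sketch under my own assumptions: the marker name `.cyrillic_names_fixed` and the helper `run_once` are hypothetical, and the marker only protects the exact directory it was written to, not its parents or subdirectories.

```shell
#!/usr/bin/env bash
# Hypothetical marker name; it is a dotfile, so `ls -1` and `*` do not list it.
MARKER=".cyrillic_names_fixed"

# run_once DIR CMD [ARGS...]
# Runs CMD ARGS... only if DIR carries no marker yet, then leaves the marker.
run_once() {
    local dir="$1"
    shift
    if [ -e "$dir/$MARKER" ]; then
        echo "already processed: $dir (delete $dir/$MARKER to force a rerun)" >&2
        return 1
    fi
    # Do not write the marker if the command itself failed.
    "$@" || return
    : > "$dir/$MARKER"
}
```

With this in place the last line of the script could become something like `run_once "$dir" scan "$dir"`. Files unpacked into the same directory later would be skipped until the marker is removed, which may or may not be what you want.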
I then replaced
scan "${1-.}"
with
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" &> /dev/null && pwd)
scan "${1-${SCRIPT_DIR}}"
But I'm not sure whether this really fixes the second issue, or whether the script is generally safe enough. Could someone review it with safety in mind?
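An alternative I'm considering, instead of any default, is to demand an explicit directory and refuse obviously dangerous targets. With `${1-.}` the script falls back to the current working directory whenever no argument is given, so the result depends on how it was launched. The sketch below is only an illustration: `resolve_target` is a hypothetical helper, and the choice to reject `/` and `$HOME` is my own assumption about what counts as too broad.

```shell
#!/usr/bin/env bash
# resolve_target DIR
# Prints the absolute, symlink-free path of DIR, or fails with a message.
resolve_target() {
    if [ "$#" -ne 1 ] || [ -z "$1" ]; then
        echo "usage: <script> <dir_with_files_with_wrong_filenames>" >&2
        return 2
    fi
    local abs
    # cd -P resolves symlinks, so a link that points at / is caught as well.
    # CDPATH is cleared so that cd prints nothing and cannot jump elsewhere.
    abs="$(CDPATH= cd -P -- "$1" 2>/dev/null && pwd -P)" || {
        echo "not a directory: $1" >&2
        return 1
    }
    case "$abs" in
        / | "${HOME:-/}")
            echo "refusing to rename everything under $abs" >&2
            return 1
            ;;
    esac
    printf '%s\n' "$abs"
}
```

The call site would then look roughly like `target="$(resolve_target "$@")" || exit` followed by `scan "$target"`, so running the script with no argument prints the usage line instead of renaming anything.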
mac_cyrillic.