einpoklum

How can I correctly decompress a ZIP archive of files with Hebrew names?

(Question self-migrated from superuser.com)

Someone sent me a ZIP file containing files with Hebrew names (created on Windows; I'm not sure with which tool). I use LXDE on Debian Stretch. The GNOME archive manager manages to unzip the file, but the Hebrew characters are garbled. I think I'm getting UTF-8 octets extended into Unicode characters: for example, I have a file whose name has four characters and a .doc suffix, and the characters are 0x008E 0x0087 0x008E 0x0085. Using the command-line unzip utility is even worse: it refuses to decompress altogether, complaining about an "Invalid or incomplete multibyte or wide character".

So, my questions are:

  • Is there another decompression utility that will decompress my files with the correct names?
  • Is there something wrong with the way the file was compressed, or is it just an incompatibility between ZIP implementations? Or even a misfeature/bug of the Linux ZIP utilities?
  • What can I do to get the correct filenames after having decompressed using the garbled ones?
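
To illustrate the third question, here is a minimal Python sketch of what such a repair might look like. It rests on an assumption that the question does not confirm: that the archive stored the names in the DOS Hebrew code page CP862, and that the extractor turned each stored byte into the Unicode character with the same number (which would explain values such as 0x008E). The function name `repair_name` and the default code page are hypothetical choices made for illustration only.

```python
def repair_name(garbled: str, codepage: str = "cp862") -> str:
    """Undo a byte-to-code-point widening, then decode with the guessed code page.

    Assumption: every character of `garbled` is below U+0100 and stands for
    exactly one byte of the name as stored in the archive.
    """
    # Latin-1 maps code points 0x00..0xFF one-to-one onto bytes 0x00..0xFF,
    # so this recovers the raw bytes of the stored name.
    raw = garbled.encode("latin-1")
    # Reinterpret those bytes in the assumed original code page.
    return raw.decode(codepage)


# The four characters quoted in the question, followed by the suffix.
garbled = "\u008e\u0087\u008e\u0085.doc"
print(repair_name(garbled))
```

If the guess about the code page is right, the result is a name made of four Hebrew letters plus `.doc`; if it is wrong, the output will simply be a different kind of garbage, so it is worth checking one name by eye first. The same function could then be applied to each name returned by `os.listdir` and the result passed to `os.rename`.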