How do I properly encode strings in Java? I'm trying to encode the letter Ü in UTF-8, but I get garbage: d093d19a instead of C39C. What could be the problem?

    package org.example;

    import java.io.*;
    import java.nio.charset.StandardCharsets;
    import org.apache.commons.codec.DecoderException;
    import org.apache.commons.codec.binary.Hex;

    public class Main {
        // Encode the string as UTF-8 and return the bytes as a hex string.
        public static String convertStringToHex(String str) {
            char[] chars = Hex.encodeHex(str.getBytes(StandardCharsets.UTF_8));
            return String.valueOf(chars);
        }

        // Decode a hex string into bytes and interpret them as UTF-8.
        public static String convertHexToString(String hex) {
            String result = "";
            try {
                byte[] bytes = Hex.decodeHex(hex);
                result = new String(bytes, StandardCharsets.UTF_8);
            } catch (DecoderException e) {
                throw new IllegalArgumentException("Invalid Hex format!");
            }
            return result;
        }

        public static void main(String[] args) throws IOException {
            System.out.println("encoder Ü: " + convertStringToHex("Ü"));
            System.out.println("decoder Ü error: " + convertHexToString("d093d19a"));
            System.out.println("decoder Ü: " + convertHexToString("C39C"));
        }
    }

Output:

    encoder Ü: d093d19a
    decoder Ü error: Ü
    decoder Ü: ?

I work on Windows. I know that Windows uses its own default encoding; can this affect the final result?

Use `\u00DC` instead of `Ü` in the Java code. It might be that the editor uses a different encoding than the compiler, which would cause this error.

`[Console]::InputEncoding = [Console]::OutputEncoding = New-Object System.Text.UTF8Encoding`

`convertHexToString("C39C")` definitely creates the string `"\u00DC"`, irrespective of your Java default charset. So, since you can't display "Ü", your output window is not using the same encoding that Java uses to write it.
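
To illustrate the editor-versus-compiler mismatch suggested above, here is a minimal sketch that reproduces the exact bytes from the question. It rests on an assumption the question does not confirm: that the source file was saved as UTF-8 while the compiler decoded it with windows-1251 (a common default on Russian-locale Windows). The class name `MojibakeDemo` and the helpers `toHex` and `misread` are hypothetical names used only for this illustration, and the hex conversion is done by hand so the example needs no external library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Hex-encode bytes in lowercase, without external libraries.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Simulate a compiler that decodes UTF-8 source bytes with the wrong charset.
    static String misread(String literal, Charset wrong) {
        byte[] sourceBytes = literal.getBytes(StandardCharsets.UTF_8);
        return new String(sourceBytes, wrong);
    }

    public static void main(String[] args) {
        // The escape keeps this file pure ASCII, so it compiles the same way
        // regardless of the encoding the compiler assumes.
        String correct = "\u00DC";

        // Assumption: the compiler's default charset was windows-1251.
        Charset assumed = Charset.forName("windows-1251");
        String garbled = misread(correct, assumed);

        System.out.println(toHex(correct.getBytes(StandardCharsets.UTF_8))); // c39c
        System.out.println(toHex(garbled.getBytes(StandardCharsets.UTF_8))); // d093d19a
    }
}
```

If this assumption holds, the string literal was already wrong before `getBytes` ran: the two source bytes C3 9C were read as two Cyrillic characters, which then encode to D0 93 D1 9A. Compiling with `javac -encoding UTF-8`, or writing the literal as `\u00DC`, should avoid the problem.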