Newest 'character-encoding' Questions - Software Engineering Stack Exchange

16 votes

4 answers

7k views

Why should C++ uint8_t data not be printable?

On this GitHub C++-related page the writer said: "Note that the value_type of those two containers is uint8_t which is not a printable character, make sure to cast it to int before you print." Why ...

Russell McMahon

273

asked Dec 1, 2024 at 12:26

-1 votes

1 answer

550 views

Should a Java project use UTF-16? [closed]

Java, by default, uses UTF-16 to represent characters in the String data type. I inherited a JavaFX project which currently has some Strings in UTF-8 and others in UTF-16. This is causing bugs (in pop-...

chilliefiber

99

asked May 4, 2024 at 16:56

0 votes

1 answer

443 views

Why does Windows need CR LF to advance to next line, but Python does not?

Windows: Uses CR (\r) in combination with LF (\n) for line endings, facilitating compatibility with legacy systems. Unix-like systems (including Linux and macOS): Employs LF (\n) for line endings, ...

DevelBase2

119

asked Feb 16, 2024 at 14:52

1 vote

1 answer

672 views

Does the SHA256 hashing algorithm change based on the content encoding?

I am starting to look into how to implement SHA256 in JavaScript, and found this for example. It requires UTF-8 encoding it sounds like. Another one I saw required/supported only ASCII encoding and ...

Lance Pollard

2,787

asked May 29, 2020 at 10:07
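A minimal sketch of the point behind this question, assuming Python's standard `hashlib`: SHA-256 is defined over bytes, not characters, so the algorithm itself does not change, but the digest does whenever the chosen encoding produces different bytes. The helper name `sha256_hex` and the sample strings are illustrative only and are not taken from the question.

```python
import hashlib


def sha256_hex(text: str, encoding: str) -> str:
    """Encode text with the given encoding, then hash the resulting bytes."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()


# Pure-ASCII text encodes to the same bytes in ASCII, UTF-8 and Latin-1,
# so all three digests are identical.
plain = "hello"
same = {sha256_hex(plain, enc) for enc in ("ascii", "utf-8", "latin-1")}
print(len(same) == 1)

# A non-ASCII character encodes differently (2 bytes in UTF-8, 1 byte in
# Latin-1), so the digests differ even though the text is "the same".
accented = "h\u00e9llo"
print(sha256_hex(accented, "utf-8") == sha256_hex(accented, "latin-1"))
```

In practice this means both sides of a hash comparison have to agree on one encoding (UTF-8 is the usual choice) before hashing.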

1 vote

1 answer

87 views

Layout Behavior of Characters (question about unicode standard)

I've been reading Unicode's core specification (see https://www.unicode.org/versions/latest/). I mostly understood what the text was explaining in section 2.1 Architectural Context until it started ...

lonious

121

asked Feb 15, 2020 at 23:45

2 votes

2 answers

571 views

Compressing EBCDIC file vs UTF8

Today I came across a weird case for which I have no explanation, so here I am. I have two files with identical content, but one is encoded in UTF-8 and the other one is in IBM EBCDIC. Both of them ...

rodripf

137

asked Nov 21, 2019 at 18:32

5 votes

1 answer

428 views

UTF-8 questions

When you encode a code point to code units based on UTF-8, then if the code point fits in 7 bits, the most significant bit is set to zero so that it tells you it is a character which is stored on 1 ...

codepersonnel49

69

asked Nov 15, 2019 at 22:03
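The rule described in the excerpt above (a leading 0 bit marks a single-byte character) is the first case of the UTF-8 bit layout. A rough sketch of the full layout, written by hand for illustration and checked against Python's built-in encoder; the function name `utf8_encode` is hypothetical:

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point to UTF-8 using explicit bit patterns."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:
        # 0xxxxxxx: fits in 7 bits, top bit is 0, one byte.
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx: 11 payload bits, two bytes.
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx: 16 payload bits, three bytes.
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx: 21 payload bits, four bytes.
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


# Compare the hand-written encoder with the standard library.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(ch, utf8_encode(ord(ch)) == ch.encode("utf-8"))
```

The number of leading 1 bits in the first byte gives the sequence length, and every continuation byte starts with `10`, which is what lets a decoder resynchronise in the middle of a stream.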

1 vote

2 answers

106 views

In python, what or who is character encoding information for?

If you go to www.htmlbasictutor.ca/character-encoding.htm you will find the following description of character encoding. Character encoding tells the browser and validator what set of characters to ...

progner

523

asked Sep 19, 2019 at 2:07

2 votes

2 answers

153 views

What's the difference between the range of characters you can use in a script and a script's encoding?

The two concepts seem equal to me, but I'm not really sure I understand encoding well enough to confirm that this is the case.

progner

523

asked Sep 18, 2019 at 23:51

1 vote

1 answer

365 views

Create and implement new encoding

I'm working on a project with huge files that contain only the set {[0-9],.}. Encoding them in UTF-8 or ASCII makes huge files. I wonder if I could find a way to encode in only 4 bits (make those files 16 ...

PyThagoras

123

asked Nov 14, 2018 at 3:03
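One way the 4-bit idea in this question could look, as a sketch under stated assumptions rather than the asker's actual design: each of the 11 symbols gets a 4-bit code, two codes are packed per byte, and the otherwise unused value 0xF pads an odd-length input. The names `ALPHABET`, `PAD`, `pack` and `unpack` are all hypothetical.

```python
ALPHABET = "0123456789."   # 11 symbols -> codes 0..10 fit in 4 bits
PAD = 0xF                  # unused code, marks a padding nibble


def pack(text: str) -> bytes:
    """Pack digits and '.' two per byte (high nibble first)."""
    codes = [ALPHABET.index(ch) for ch in text]  # ValueError on other chars
    if len(codes) % 2:
        codes.append(PAD)
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def unpack(data: bytes) -> str:
    """Reverse pack(): split each byte into two nibbles and map back."""
    out = []
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            if nibble == PAD:
                continue  # skip the padding nibble
            if nibble >= len(ALPHABET):
                raise ValueError(f"invalid nibble: {nibble}")
            out.append(ALPHABET[nibble])
    return "".join(out)


sample = "3.14159"
packed = pack(sample)
print(len(sample), len(packed), unpack(packed) == sample)
```

Compared with one byte per character, this halves the size. A general-purpose compressor applied to the original text may do as well or better, so it is worth measuring both before committing to a custom format.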

2 votes

1 answer

443 views

Differentiating Between ASCII and Unicode in File Spec

I am developing against a file spec that lists the data type for certain fields as CHAR(<length>). The spec is for a fixed-width flat file. In most cases, possible values to populate the fields ...

mathewb

137

asked Aug 22, 2018 at 17:11

0 votes

2 answers

2k views

Why Unicode Encoding/Decoding is Necessary in JavaScript

I am wondering why Unicode encoding is necessary in JavaScript. I am looking at utf8.js as an example. I am also looking at the UTF-8 spec, but am not really following the different pieces of data. ...

Lance Pollard

2,787

asked Jul 23, 2018 at 21:45

0 votes

1 answer

1k views

Java takes 2 bytes to represent character?

In general a character is represented in 1 byte, i.e. 8 bits. This is, I believe, true for all text editors and even for databases like Oracle. 1 byte can represent 2^8 = 256 characters. My question is when ...

user3198603

1,896

asked Jul 6, 2018 at 14:31

8 votes

1 answer

4k views

Is the BOM optional for UTF-16 and UTF-32?

I used to think that the BOM is optional for UTF-8, but mandatory for UTF-16 and UTF-32. But then I have read the following (in this article): Let's look just at the ones that Notepad supports. ...

user9002947

249

asked Apr 28, 2018 at 5:11

0 votes

1 answer

327 views

Barcode that support ~3500 chars [closed]

I can't figure out a barcode that would support ~3500 chars. The barcode should contain 40 strings separated by carriage returns, each 76 chars long. Each string will look like this: ...

SovereignSun

101

asked Oct 20, 2017 at 7:25
