Re: Where are we ACTUALLY on Unicode?

From: Jordi Boggiano
Date: Sun, 14 Mar 2010 14:23:26 +0000

Subject: Re: Where are we ACTUALLY on Unicode?

References: 1 2 3
Groups: php.internals

Request: Send a blank email to internals+get-47265@lists.php.net to get a copy of this message

On Sun, Mar 14, 2010 at 12:03 PM, Stan Vassilev <sv_forums@fmethod.com> wrote:
> UTF-8 also takes 4 bytes for representing characters in the higher bit
> planes, as quite a lot of bits are lost for every char in order to describe
> how long the code point is, and when it ends and so on. This means
> memory-wise it may not be of big benefit to Asian countries.

I remember Brian Aker saying that they chose to work internally with
UTF-8 for Drizzle. His explanation was that content from Asian countries
has so much English mixed in that, on average, UTF-8 still had a lower
footprint than UTF-16/32 even for them. I do not know where the stats
came from, but if they hold any truth, this is worth considering.

Cheers,
Jordi
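
The footprint argument above is easy to check on small samples. The following is only a rough sketch, not the measurement Brian Aker referred to: the helper name encoded_sizes and both sample strings are made up for illustration, and the byte counts exclude any byte order mark.

```python
def encoded_sizes(text):
    """Return the encoded byte length of text in UTF-8, UTF-16 and UTF-32.

    The "-le" codec variants are used so that no byte order mark is counted.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Hypothetical samples, chosen only for illustration.
pure_cjk = "日本語のテキスト"                      # CJK/kana characters only
mixed = '<a href="/index.html">日本語</a>'        # ASCII markup around CJK text
supplementary = "\U00020000"                      # one code point outside the BMP

for label, sample in (
    ("pure CJK", pure_cjk),
    ("markup + CJK", mixed),
    ("supplementary plane", supplementary),
):
    print(label, encoded_sizes(sample))
```

For text made up purely of such characters, UTF-16 comes out smaller (2 bytes per character against 3 in UTF-8), which is Stan's point. Once ASCII markup or English text dominates, UTF-8 wins, because every ASCII character costs 1 byte instead of 2 or 4. A code point outside the Basic Multilingual Plane takes 4 bytes in all three encodings.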

Thread (27 messages)

Lester Caine (Sun, 14 Mar 2010 07:28:07 +0000)
William A. Rowe Jr. (Sun, 14 Mar 2010 07:35:51 +0000)
Stan Vassilev (Sun, 14 Mar 2010 11:03:47 +0000)
Pierre Joye (Sun, 14 Mar 2010 14:00:59 +0000)
Jordi Boggiano (Sun, 14 Mar 2010 14:23:26 +0000)
Pierre Joye (Sun, 14 Mar 2010 14:33:19 +0000)
Moriyoshi Koizumi (Sun, 14 Mar 2010 14:34:24 +0000)
dreamcat four (Sun, 14 Mar 2010 14:43:00 +0000)
Alexey Zakhlestin (Mon, 15 Mar 2010 06:20:15 +0000)
Stanislav Malyshev (Mon, 15 Mar 2010 23:33:06 +0000)
Lester Caine (Tue, 16 Mar 2010 08:30:14 +0000)
dreamcat four (Tue, 16 Mar 2010 11:48:00 +0000)
Andrey Hristov (Tue, 16 Mar 2010 12:15:14 +0000)
dreamcat four (Tue, 16 Mar 2010 17:40:37 +0000)
Andrey Hristov (Tue, 16 Mar 2010 18:25:18 +0000)
Rasmus Lerdorf (Tue, 16 Mar 2010 18:32:07 +0000)
Lester Caine (Tue, 16 Mar 2010 19:03:38 +0000)
dreamcat four (Tue, 16 Mar 2010 19:05:39 +0000)
Rasmus Lerdorf (Tue, 16 Mar 2010 19:34:56 +0000)
Lester Caine (Tue, 16 Mar 2010 20:39:18 +0000)
Pierre Joye (Tue, 16 Mar 2010 19:10:56 +0000)
William A. Rowe Jr. (Tue, 16 Mar 2010 20:42:35 +0000)
Stanislav Malyshev (Tue, 16 Mar 2010 19:05:47 +0000)
Ferenc Kovacs (Tue, 16 Mar 2010 20:04:24 +0000)
dreamcat four (Tue, 16 Mar 2010 20:43:29 +0000)
Ferenc Kovacs (Tue, 16 Mar 2010 21:50:51 +0000)
Lukas Kahwe Smith (Wed, 17 Mar 2010 15:29:43 +0000)
