Re: Unicode, SMS and year 2012

From: Doug Ewell
Date: Sun, 29 Apr 2012 15:22:14 -0600

Szelp, A. Sz. wrote:

>> Some people are simply opposed to additional encoding schemes. The
>> HTML5 specification explicitly forbids the use of UTF-32, SCSU, and
>> BOCU-1 (while allowing many non-Unicode legacy encodings and quietly
>> mapping others to Windows encodings); one committee member was quoted
>> as saying that other encodings of Unicode "waste developer time."
> While the authors of HTML5 had good reasons for ignoring SCSU and
> BOCU-1, excluding UTF-32, which is the most direct, one-to-one
> mapping of Unicode code points to byte values, seems shortsighted.
> We are talking about the whole of Unicode, not just the BMP.

All UTFs (8, 16, 32) can represent all of Unicode, as can SCSU. The only
Unicode encoding that can represent only the BMP is UCS-2, which AFAIK
is no longer endorsed by UTC.
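
To make the distinction concrete, here is a rough Python sketch. The three
UTF round trips use Python's built-in codecs. Python has no UCS-2 codec, so
encode_ucs2 (like roundtrip) is a hypothetical helper written only for this
illustration, assuming big-endian 16-bit units and no surrogate pairs.

```python
# Illustration only: roundtrip() and encode_ucs2() are hypothetical helpers,
# not part of any library.

def roundtrip(text: str, codec: str) -> bool:
    """Return True if `text` survives encoding and decoding with `codec`."""
    return text.encode(codec).decode(codec) == text


def encode_ucs2(text: str) -> bytes:
    """Sketch of UCS-2 (big-endian): one 16-bit unit per code point.

    UCS-2 has no surrogate mechanism, so anything above U+FFFF is rejected.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            raise ValueError(f"U+{cp:04X} is outside the BMP; UCS-2 cannot encode it")
        out += cp.to_bytes(2, "big")
    return bytes(out)


# Two BMP characters followed by U+1F600, which lies beyond the BMP.
sample = "A\u00e9\U0001F600"

# All three UTFs handle the supplementary character.
for codec in ("utf-8", "utf-16-le", "utf-32-le"):
    print(codec, roundtrip(sample, codec), len(sample.encode(codec)), "bytes")

# UCS-2 handles the BMP part but fails on the supplementary character.
print(encode_ucs2("A\u00e9"))
try:
    encode_ucs2(sample)
except ValueError as err:
    print("UCS-2:", err)
```

The UTFs differ only in how many bytes they spend per character; the
fixed-width 16-bit scheme is the one that cannot reach past U+FFFF.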

Doug Ewell | Thornton, Colorado, USA | @DougEwell
Received on Sun Apr 29 2012 - 16:26:17 CDT
