Support for non-BMP characters

From: David Starner
Date: Wed, 25 Apr 2012 01:31:40 -0700

It's been ten years since the first non-BMP characters were encoded.
How are they working in your neck of the woods? There are a lot of
places where they work just fine, but I was looking at MySQL's
support. For a long time it has supported only UCS-2 and a UTF-8
limited to the BMP; MySQL 5.5 adds utf16, utf32 and utf8mb4. (MySQL
5.1 and 5.5 are the current stable releases.) But there are enough
warnings about incompatibilities with utf8mb4 to make me pause before
switching my private database to it, and I think the net will see
MySQL databases using utf8 instead of utf8mb4 for as long as MySQL
exists, unless the MySQL developers decide to push people over to it.
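
To make the utf8 versus utf8mb4 difference concrete, here is a minimal
Python sketch. It rests on the fact that code points above U+FFFF take
four bytes in UTF-8, which a BMP-limited (three-byte) utf8 column cannot
hold. The helper names needs_utf8mb4 and max_utf8_bytes are made up for
illustration and are not part of MySQL or any library.

```python
def max_utf8_bytes(text: str) -> int:
    """Return the largest number of UTF-8 bytes any single character needs."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def needs_utf8mb4(text: str) -> bool:
    """True if text contains a code point above U+FFFF (outside the BMP)."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Hypothetical sample data, written with escapes to avoid encoding surprises.
bmp_text = "caf\u00e9 \u20ac"       # U+00E9 and U+20AC are both inside the BMP
non_bmp_text = "clef \U0001D11E"    # U+1D11E MUSICAL SYMBOL G CLEF

print(needs_utf8mb4(bmp_text), max_utf8_bytes(bmp_text))          # False 3
print(needs_utf8mb4(non_bmp_text), max_utf8_bytes(non_bmp_text))  # True 4
```

A check like this could be run over existing data before deciding
whether a switch to utf8mb4 is needed at all; it says nothing about the
incompatibilities the MySQL documentation warns about.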

(Ada's an issue too, though not one most people will have to deal
with. While Ada 2005 added a UTF-32 string type, it left the UCS-2
string type as is. Again, I suspect a lot of nominally Unicode Ada
programs are going to be BMP-only. Of course, UTF-8 as an ASCII superset
is used, stuffed into strings labeled Latin-1; it's technically not
conformant with the Ada standard but it works so long as you don't
need much string processing.)
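
The "UTF-8 stuffed into Latin-1 strings" trick is not specific to Ada,
so here is a rough Python sketch of why it survives a round trip but
breaks under string processing. It only mimics the idea of treating each
UTF-8 byte as one Latin-1 character; the function names are hypothetical
and nothing here reflects how any Ada library actually does it.

```python
def stuff_into_latin1(text: str) -> str:
    """View the UTF-8 bytes of text as if each byte were a Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def unstuff(stuffed: str) -> str:
    """Reverse the trick: recover the bytes, then decode them as UTF-8."""
    return stuffed.encode("latin-1").decode("utf-8")


original = "na\u00efve \U0001D11E"  # U+00EF (BMP) and U+1D11E (non-BMP)
stuffed = stuff_into_latin1(original)

print(len(original))                 # 7 code points
print(len(stuffed))                  # 11, one "character" per UTF-8 byte
print(unstuff(stuffed) == original)  # True: passing the string through is safe

# Slicing by "character" can cut a multi-byte sequence in half.
try:
    unstuff(stuffed[:3])
except UnicodeDecodeError:
    print("slice split a UTF-8 sequence")
```

Storing and passing the string along works, but length, indexing and
slicing all operate on bytes rather than characters, which matches the
"so long as you don't need much string processing" caveat.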

In any case, is the use of non-BMP characters still problematic in
your corner of the computing world or is everything looking fine from
where you are?

Where there is life, there is hope.
Received on Wed Apr 25 2012 - 03:40:18 CDT