Unicode education in the professional world

From: Doug Ewell via Unicode <unicode_at_unicode.org>
Date: Fri, 07 Jul 2017 10:02:35 -0700

Sort of along the lines of "education"...

I've been helping a colleague who is using the Oracle database and
trying to work through a customer's character conversion and mojibake
issues. I started suspecting the NLS_LANG variable and looked up some
references, and found the following alternative facts on the Oracle FAQ
and community pages:

> SQL> SELECT DUMP(col,1016)FROM table;
>
> Typ=1 Len=39 CharacterSet=UTF8: 227,131,143,227,131,170
>
> returns the value of a column consisting of 3 Japanese characters in
> UTF8 encoding . For example the 1st char is 227(*255)+131.

and:

> While UTF8 uses only 2 bytes to store data AL32UTF8 uses 2 or 4 bytes.

Unicode and UTF-8 have been around a long time by now. The fact that
there is still fake news like this out there, steering our less
Unicode-aware colleagues waaay down the wrong path, is disconcerting.
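
For the record, here is a minimal Python sketch of what the six bytes
in the quoted DUMP output actually are, assuming they really are
standard UTF-8 as the output claims. The helper name decode_3byte is
mine, purely for illustration. The bytes form two three-byte sequences,
not three characters, and the "227(*255)+131" arithmetic does not
correspond to any step of UTF-8 decoding. The sketch also shows that
standard UTF-8 uses 1 to 4 bytes per code point, not "only 2".

```python
# Bytes copied from the quoted DUMP output (decimal, as printed there).
dump_bytes = bytes([227, 131, 143, 227, 131, 170])

# Let the standard library decode them as UTF-8.
text = dump_bytes.decode("utf-8")


def decode_3byte(b0, b1, b2):
    """Manually decode one three-byte UTF-8 sequence.

    Layout: 1110xxxx 10xxxxxx 10xxxxxx -> 16 payload bits.
    (Hypothetical helper for illustration; no validation is done.)
    """
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


first = decode_3byte(227, 131, 143)   # U+30CF KATAKANA LETTER HA
second = decode_3byte(227, 131, 170)  # U+30EA KATAKANA LETTER RI

# The formula from the quoted FAQ, evaluated literally.
bogus = 227 * 255 + 131               # 58016 == 0xE2A0, unrelated to the data

# Encoded length in standard UTF-8 for one sample character per length class:
# ASCII letter, Latin-1 letter, katakana, and a supplementary-plane emoji.
samples = ["A", "\u00e9", "\u30cf", "\U0001F600"]
lengths = [len(ch.encode("utf-8")) for ch in samples]

if __name__ == "__main__":
    print(len(text), [f"U+{ord(c):04X}" for c in text])
    print(f"U+{first:04X}", f"U+{second:04X}", hex(bogus))
    print(lengths)
```

My understanding is that Oracle's older UTF8 character set differs from
AL32UTF8 in how it handles supplementary characters (a pair of
three-byte surrogate encodings rather than one four-byte sequence), but
treat that as an assumption to check against Oracle's own documentation.
Either way, neither one "uses only 2 bytes."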

--
Doug Ewell | Thornton, CO, US | ewellic.org
Received on Fri Jul 07 2017 - 12:03:28 CDT
