Re: How is UTF8, UTF16 and UTF32 encoded?

From: i18nGuy Tex Texin (tex@i18nguy.com)
Date: Thu May 30 2002 - 15:59:52 EDT


Dear Mr Ecartis,
;-)

Just FYI, Theodore: at the Unicode conferences there are usually a
couple of presentations that discuss encodings and provide some examples.
You might check the conference pages; some of the older conferences have
presentations online. The next conference is in Sept. and will also
likely have some coverage of this.
(I understand Sept. is a ways away if you need the info now, but I
thought its existence worth pointing out.)
Anyway, check some of the older conference pages for the online copies.

tex
"Theodore H. Smith" wrote:
>
> > Many of the explanations of UTF-8 discuss encoding of code points on
> > Code Planes 1-16 using the intermediate concept of surrogates as in
> > UTF-16. I believe that this is both unnecessary and misleading, as
> > UTF-8 is fundamentally a direct 21-bit encoding scheme, as may be seen
> > in the attached document. So, I believe that the concept of surrogates
> > is not relevant for UTF-8 encoding on Code Planes above the BMP.
> >
> > This is a slightly different explanation of how UTF-8 works, written
> > by me for the Ultracode(r) bar code spec (Ultracode encodes all of
> > Unicode 3 directly). If any Unicodotti find any errors in it... please
> > let me know!
>
> You sent me a file that explains things, but it's in Word format (I
> think; it's a .doc) and I don't have MS Word. I have very few MS things,
> fortunately. Just MSIE is all.
>
> Thanks anyhow. This whole bit encoding is kind of technical, and I guess
> I could do my own calculations and stuff to get some kind of feel for
> what the conversion code does to a character, but I was hoping more for
> some illustrative examples. Like, let's say we take character XX, and so
> first we see how many trailing chars it has like this, etc., giving a
> step-by-step example... Almost like code, but with the intermediate
> values listed and explained.
>
> (Once again I almost sent this to ecartis)
>
> --
> Theodore H. Smith - Macintosh Consultant / Contractor.
> My website: <www.elfdata.com/>
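
For readers of the archive, here is a minimal sketch of the kind of
step-by-step example asked for above. It is not taken from the attached
document or from any conference presentation; the function names
utf8_encode_steps and utf16_units are made up for illustration. It
encodes one code point to UTF-8 directly from its bits, recording the
intermediate values, and separately shows the UTF-16 surrogate
calculation, which the UTF-8 path never needs.

```python
def utf8_encode_steps(cp):
    """Encode one code point as UTF-8.

    Returns (encoded_bytes, steps), where steps is a list of strings
    describing the intermediate values. Hypothetical helper for
    illustration only.
    """
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % cp)

    # Step 1: the magnitude of the code point decides how many trailing
    # bytes follow the lead byte, and which marker bits the lead byte gets.
    if cp < 0x80:
        n_trail, lead_marker = 0, 0x00   # 0xxxxxxx
    elif cp < 0x800:
        n_trail, lead_marker = 1, 0xC0   # 110xxxxx
    elif cp < 0x10000:
        n_trail, lead_marker = 2, 0xE0   # 1110xxxx
    else:
        n_trail, lead_marker = 3, 0xF0   # 11110xxx
    steps = ["U+%04X needs %d trailing byte(s)" % (cp, n_trail)]

    # Step 2: peel off 6 bits at a time from the low end; each group
    # becomes a trailing byte of the form 10xxxxxx.
    trail = []
    rest = cp
    for _ in range(n_trail):
        low6 = rest & 0x3F
        trail.insert(0, 0x80 | low6)
        steps.append("low 6 bits %s -> trailing byte 0x%02X"
                     % (format(low6, "06b"), 0x80 | low6))
        rest >>= 6

    # Step 3: whatever bits remain go into the lead byte.
    lead = lead_marker | rest
    steps.append("remaining bits %s -> lead byte 0x%02X"
                 % (format(rest, "b"), lead))
    return bytes([lead] + trail), steps


def utf16_units(cp):
    """Return the UTF-16 code units for one code point (for contrast)."""
    if cp < 0x10000:
        return [cp]
    v = cp - 0x10000                     # 20 bits remain
    return [0xD800 + (v >> 10),          # high surrogate: top 10 bits
            0xDC00 + (v & 0x3FF)]        # low surrogate: bottom 10 bits


if __name__ == "__main__":
    encoded, steps = utf8_encode_steps(0x20AC)   # EURO SIGN
    for line in steps:
        print(line)
    print(" ".join("%02X" % b for b in encoded))  # E2 82 AC
```

Note that utf8_encode_steps works on the code point value itself for all
planes, so a supplementary character such as U+10400 goes straight to
four bytes, while utf16_units has to split it into the pair D801 DC00.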

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@i18nGuy.com
Xen Master                          http://www.i18nGuy.com

XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
What's wrong with locales?
http://www.i18nguy.com/locales/index.html

Spam marketing is like fishing by dropping heavy rocks in the water, hoping it will land on one. Mostly it moves fish to less disturbed waters.


