On Wed, Sep 06, 2000 at 08:13:41AM -0800, Markus Scherer wrote:
> of this list, only UTF-EBCDIC is a viable encoding form.
> the others are either deprecated, never made it beyond draft, or are unofficial discussion pieces that never made it anywhere (i proposed one of them :-).
> if you detect all the big- and little-endian boms for the standard forms
> utf-8, utf-16, utf-32, scsu, utf-ebcdic
> then you will be a hero. any of them may come with a bom depending on protocol and os.
> David Starner wrote:
> > > UTF-1: F7 64 4C
> > > UTF-7: 2B 2F 76 38 2D "+/v8-"
> > > UTF-7d5: BF FB FF
> > > UTF-8C1: BB ED DF
> > > UTF-9: 93 FD FF
> > > UTF-EBCDIC: DD 73 66 73
> > > UTF-mu(2): 9F 9B FF
> > > UCN(3): 5C 75 66 65 66 66 "\ufeff"
> > > DUCK(4): 81 FE FF
I realize some of these were more discussion pieces; honestly, I was
planning on implementing SCSU, UTF-1, UTF-7 and 8/16/32 BE/LE. Why
UTF-EBCDIC? I would think that UTF-7 is more common in use, as once in
a while you'll run across it in mail and newsgroups. I feel a need to
implement at least UTF-7, in case someone wants to write a mail reader
with Ngeadal.
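
For concreteness, here is a rough sketch of the kind of signature check
being discussed, in Python rather than anything Ngeadal-specific. The
UTF-1, UTF-7 and UTF-EBCDIC bytes are taken from the list quoted above
(only the first four bytes of the UTF-7 form are matched); the UTF-8,
UTF-16, UTF-32 and SCSU bytes are the commonly documented signatures.
The names BOMS and detect_bom are made up for this example.

```python
# Hypothetical table of (encoding name, signature bytes).
# UTF-1, UTF-7 and UTF-EBCDIC values come from the quoted list;
# the rest are the usual signatures of the standard forms.
BOMS = [
    ("UTF-32BE", bytes([0x00, 0x00, 0xFE, 0xFF])),
    ("UTF-32LE", bytes([0xFF, 0xFE, 0x00, 0x00])),
    ("UTF-EBCDIC", bytes([0xDD, 0x73, 0x66, 0x73])),
    ("UTF-7", bytes([0x2B, 0x2F, 0x76, 0x38])),  # "+/v8"
    ("UTF-8", bytes([0xEF, 0xBB, 0xBF])),
    ("SCSU", bytes([0x0E, 0xFE, 0xFF])),
    ("UTF-1", bytes([0xF7, 0x64, 0x4C])),
    ("UTF-16BE", bytes([0xFE, 0xFF])),
    ("UTF-16LE", bytes([0xFF, 0xFE])),
]


def detect_bom(data):
    """Return (name, matched_length), or (None, 0) if nothing matches.

    Longer signatures are tried first, so FF FE 00 00 is reported as
    UTF-32LE rather than as UTF-16LE followed by U+0000.
    """
    for name, sig in sorted(BOMS, key=lambda e: len(e[1]), reverse=True):
        if data.startswith(sig):
            # For UTF-7 this is the length matched, not necessarily
            # the number of bytes a decoder should skip.
            return name, len(sig)
    return None, 0


if __name__ == "__main__":
    print(detect_bom(b"\xef\xbb\xbfhello"))
    print(detect_bom(b"\xff\xfeh\x00i\x00"))
    print(detect_bom(b"plain ascii"))
```

Trying the longest signatures first is a judgment call: a UTF-16LE
stream whose first character is U+0000 would be misread as UTF-32LE,
which seems the less likely case in practice.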
--
David Starner - email@example.com
http/ftp: dvdeug.dhis.org
I knew all of the floors in my high school, and none of the ceilings.
- Chris Painter