RE: Nonsense in http://www.unicode.org/Public/PROGRAMS/CVTUTF/CVTUTF.C?

From: Ayers, Mike (Mike_Ayers@bmc.com)
Date: Wed Aug 22 2001 - 21:03:19 EDT

> From: Michael (michka) Kaplan [mailto:michka@trigeminal.com]
> Sent: Wednesday, August 22, 2001 03:59 PM

> From: "Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl>
>
> > Functions ConvertUCS4toUTF8 and ConvertUTF8toUCS4 use surrogates
> > in UCS4. In particular ConvertUTF8toUCS4 converts a character above
> > U+FFFF into two UCS4 words. Why is this absurd there?!
>
> UCS-4 has no knowledge of surrogate code points or their significance;
> it is a purely algorithmic conversion. Not sure why the results would
> be so surprising, given this?

I know nothing of UCS-4, but if, as the name implies, it uses 4
bytes per word, and needs two of those to represent quantities greater than
0xffff, i.e. 8 bytes to represent a 3 byte quantity, then, yes, I would be
surprised (and as an engineer, disgusted).
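
To make the arithmetic concrete, here is a minimal C sketch of the two
encodings being compared. It is an illustration only and is not taken from
CVTUTF.C; the function names ucs4_encode and utf16_encode are hypothetical.
It assumes a valid code point in the range U+0000..U+10FFFF. A 4-byte unit
holds any such value directly, while splitting into a surrogate pair is
only needed when the units are 16 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper: store a code point in 32-bit units.
 * Every code point up to U+10FFFF fits in a single unit,
 * so no surrogates are ever required. Returns the unit count. */
int ucs4_encode(uint32_t cp, uint32_t out[1])
{
    out[0] = cp;
    return 1;
}

/* Hypothetical helper: store a code point in 16-bit units.
 * Values above U+FFFF are split into a high and a low surrogate.
 * Returns the unit count (1 or 2). */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp < 0x10000u) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    uint32_t v = cp - 0x10000u;                 /* 20 significant bits remain */
    out[0] = (uint16_t)(0xD800u + (v >> 10));   /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00u + (v & 0x3FFu)); /* low surrogate: low 10 bits */
    return 2;
}
```

For U+10400, utf16_encode yields the two 16-bit units 0xD801 0xDC00 (4 bytes),
while ucs4_encode yields the single unit 0x00010400 (also 4 bytes). Writing
each surrogate into its own 32-bit word would take 8 bytes for the same
character.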

/|/|ike

This archive was generated by hypermail 2.1.2 : Wed Aug 22 2001 - 21:56:44 EDT