Re: UTF8 vs. Unicode (UTF16) in code

From: Keld Jørn Simonsen (keld@dkuug.dk)
Date: Fri Mar 09 2001 - 05:36:47 EST

Next message: Ienup Sung: "Re: UTF8 vs. Unicode (UTF16) in code"
Previous message: Allan Chau: "Re: UTF8 vs. Unicode (UTF16) in code"
Maybe in reply to: Allan Chau: "UTF8 vs. Unicode (UTF16) in code"
Next in thread: Yves Arrouye: "Re: UTF8 vs. Unicode (UTF16) in code"
Reply: Yves Arrouye: "Re: UTF8 vs. Unicode (UTF16) in code"

On Fri, Mar 09, 2001 at 10:56:30AM -0800, Yves Arrouye wrote:
>
> Since the U in UTF stands for Unicode, UTF-32 cannot represent more than
> what Unicode encodes, which is 1+ million code points. Otherwise, you're
> talking about UCS-4. But I thought that one of the latest revs of ISO 10646
> explicitly specified that UCS-4 will never encode more than what Unicode
> can encode, and thus definitely not these 4 billion characters you're
> alluding to.

As far as I know, the U in UTF stands for Universal, not Unicode.
ISO 10646 can encode characters beyond the range of UTF-16, and should retain
this capability. There is a proposal to restrict UTF-8 to
encompass only the same values as UTF-16, but UCS-4 still encodes
the 31-bit code space.
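
To make the two ranges concrete, here is a minimal Python sketch, not taken from any standard text. It assumes the original UTF-8 scheme with sequences of up to six bytes, and it treats 0x10FFFF as the highest value UTF-16 can reach through surrogate pairs. The names `utf8_length_31bit` and `fits_utf16` are hypothetical and used only for illustration.

```python
UTF16_MAX = 0x10FFFF   # highest value reachable with UTF-16 surrogate pairs
UCS4_MAX = 0x7FFFFFFF  # top of the 31-bit UCS-4 code space

# Largest value that fits in a sequence of 1, 2, ... 6 bytes under the
# original (unrestricted) UTF-8 scheme.
_SEQUENCE_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]


def utf8_length_31bit(cp: int) -> int:
    """Return how many bytes the unrestricted UTF-8 scheme needs for cp."""
    if not 0 <= cp <= UCS4_MAX:
        raise ValueError("value lies outside the 31-bit code space")
    for nbytes, limit in enumerate(_SEQUENCE_LIMITS, start=1):
        if cp <= limit:
            return nbytes
    raise AssertionError("unreachable: cp was range-checked above")


def fits_utf16(cp: int) -> bool:
    """True if cp is within the range UTF-16 can address.

    Simplification: the surrogate range D800..DFFF is not excluded here.
    """
    return 0 <= cp <= UTF16_MAX


if __name__ == "__main__":
    for cp in (0x41, 0xFFFF, 0x10FFFF, 0x110000, 0x7FFFFFFF):
        print(f"{cp:#x}: {utf8_length_31bit(cp)} byte(s), "
              f"fits UTF-16: {fits_utf16(cp)}")
```

Under these assumptions, restricting UTF-8 to the UTF-16 range would amount to rejecting every value for which `fits_utf16` returns False, which also makes the five-byte and six-byte forms unnecessary.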

Kind regards
Keld

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:20 EDT