RE: UTF-16 inside UTF-8

From: [email protected]
Date: Wed Dec 03 2003 - 04:36:32 EST

Next message: Arcane Jill: "RE: MS Windows and Unicode 4.0 ?"

Previous message: D. Starner: "RE: UTF-16 inside UTF-8"
Maybe in reply to: Philippe Verdy: "RE: UTF-16 inside UTF-8"
Next in thread: Doug Ewell: "Re: UTF-16 inside UTF-8"

> We're not speaking about the same thing: I was not discussing the
> representation of individual characters (yes it's simple to make
> wchar_t 32-bit with UCS4), but the encoding of large amounts of
> strings for general text processing. That's where UTF-16 is better.

        For some values of "better", and for some values of "text processing".
        Because UTF-16 is variable-width (a code point takes one or two 16-bit code units),
        it can be slow for certain string operations: basically anything that requires
        "random access" by code point, like "give me the substring from code point
        position 1000 to position 1999". Unless you have some sort of caching, or
        something else clever, finding a position is O(position) instead of O(1).
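
To make the cost concrete, here is a rough sketch in Python of what "substring by code point" has to do on UTF-16 data. The helper names (utf16_units, unit_offset, substring, decode_units) are made up for illustration and are not from any library. The string is modelled as a plain list of 16-bit code units, and there is no caching, so every lookup walks from the start.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints (illustrative helper)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def decode_units(units):
    """Turn a list of UTF-16 code units back into a Python string."""
    return struct.pack("<%dH" % len(units), *units).decode("utf-16-le")

def unit_offset(units, cp_index):
    """Find the code unit index where code point number cp_index starts.

    There is no shortcut: each earlier code point must be inspected to see
    whether it occupies one unit or two, so this is O(cp_index).
    """
    i = 0
    for _ in range(cp_index):
        if i >= len(units):
            raise IndexError("code point index out of range")
        is_high = 0xD800 <= units[i] <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        # A high surrogate followed by a low surrogate is one code point in two units.
        i += 2 if (is_high and has_low) else 1
    return i

def substring(units, start, end):
    """Return the code units covering code points start (inclusive) to end (exclusive)."""
    first = unit_offset(units, start)
    last = unit_offset(units, end)  # a real version could resume the scan from `first`
    return units[first:last]

if __name__ == "__main__":
    units = utf16_units("a\U0001D11Eb\U0001F600c")
    print(len(units))                            # number of code units, not code points
    print(decode_units(substring(units, 1, 3)))  # code points 1 and 2
```

With fixed-width 32-bit units the same substring would be a plain slice, units[start:end], which is the O(1) case being compared against.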


This archive was generated by hypermail 2.1.5 : Wed Dec 03 2003 - 05:26:05 EST