RE: UTF-16 inside UTF-8

From: jon@hackcraft.net
Date: Wed Dec 03 2003 - 05:28:21 EST

Next message: Asmus Freytag: "Re: Korean compression (was: Re: Ternary search trees for Unicode dictionaries)"

Previous message: Arcane Jill: "RE: MS Windows and Unicode 4.0 ?"
In reply to: Philippe Verdy: "RE: UTF-16 inside UTF-8"
Next in thread: D. Starner: "RE: UTF-16 inside UTF-8"

> So you can have a wchar_t datatype in C/C++ that stores UCS-4, but
> your strings will most often not be arrays of wchar_t but of an
> intermediate 16-bit size which gets parsed to 32-bit wchar_t by
> very simple run-time scanners.

If wchar_t maps to UCS-4 then wchar_t* will map to UCS-4, and all of the C
runtime support for string handling uses wchar_t* for "wide" characters.
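
As a rough illustration of that point (count_units is just a name I made up
for this sketch, not a library function), the standard wide-string routines
take wchar_t* and work in whole wchar_t units, whatever width the platform
gives wchar_t:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper for illustration only: std::wcslen takes a
// const wchar_t* and counts wchar_t units up to the terminating null.
// It has no notion of a narrower intermediate storage unit.
std::size_t count_units(const wchar_t* s) {
    return std::wcslen(s);
}
```

So count_units(L"abc") is 3 regardless of whether wchar_t is 16 or 32 bits
wide on the system in question.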

It would not be possible to implement std::wstring, on a system where wchar_t
is 32 bits, by internally storing UTF-16 in 16-bit units: you are required to
provide random-access iterators over the code units, and such an implementation
could only provide bidirectional iterators. (You could do weird things like
storing indexes into the internal string, but the only reason for doing that
would be to show that you could; the result would be worse in every way than a
more straightforward implementation that stored wchar_t characters.)
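
A minimal sketch of the requirement I mean, assuming any conforming standard
library (nth_unit is a hypothetical helper, named only for this example). The
code unit type of std::wstring is wchar_t and its iterators are random-access,
so reaching the nth unit is a single constant-time step:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <type_traits>

// std::wstring is std::basic_string<wchar_t>, so its code unit is wchar_t.
static_assert(std::is_same<std::wstring::value_type, wchar_t>::value,
              "std::wstring stores wchar_t units");

// Its iterators must be random-access, not merely bidirectional.
static_assert(
    std::is_base_of<
        std::random_access_iterator_tag,
        std::iterator_traits<std::wstring::iterator>::iterator_category>::value,
    "std::wstring iterators are random-access");

// Hypothetical helper: jump straight to the nth wchar_t unit.
// The caller must ensure n < s.size().
wchar_t nth_unit(const std::wstring& s, std::size_t n) {
    return *(s.begin() + static_cast<std::wstring::difference_type>(n));
}
```

With UTF-16 stored underneath, finding the nth 32-bit character would mean
scanning from one end for surrogate pairs, which is what limits such a design
to bidirectional iteration.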

--
Jon Hanna                   | Toys and books
<http://www.hackcraft.net/> | for sick children:
                            | <http://santa.boards.ie/>

This archive was generated by hypermail 2.1.5 : Wed Dec 03 2003 - 06:22:42 EST