From: Christopher Fynn (cfynn@gmx.net)
Date: Fri Jan 07 2005 - 08:53:54 CST
Andrew C. West wrote:
> On Thu, 06 Jan 2005 19:16:38 +0000, Christopher Fynn wrote:
>
>>There seems to be one defect - the charts I've seen seem to contain a
>>pre-composed character equivalent to the combination U+0F68 U+0F7C
>>U+0F7E - It appears they've assumed that U+0F00 can be used as the
>>equivalent to that string. However in Unicode U+0F00 is *not* equivalent
>>to U+0F68 U+0F7C U+0F7E (U+0F00 has no de-composition). I think this
>>means that there would be no round-trip compatibility for this combination.
>
>
> I think that Chris meant to write "the charts I've seen *do not* seem to contain
> a pre-composed character equivalent to the combination U+0F68 U+0F7C U+0F7E".
>
> Andrew
Andrew
Yes, thanks, that's what I intended. They *do not* have a pre-composed
character equivalent to U+0F68 U+0F7C U+0F7E.
If you are converting a Unicode text to GB18030, you more or less have
to convert "U+0F68 U+0F7C U+0F7E" to the GB18030 code which maps to
U+0F00, because "U+0F68 U+0F7C U+0F7E" is not going to display properly
on a system requiring pre-composed characters. U+0F00 in the original
Unicode text would map to the *same* GB18030 code.
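A rough sketch of that forward conversion in Python, to make the collision concrete. The placeholder "<OM>" is hypothetical and only stands in for the single pre-composed code (the actual code value is not given in this thread), and to_precomposed is an illustrative name, not part of any real converter:

```python
# Hypothetical stand-in for the one pre-composed code in the target set;
# the real code value is not stated in this thread.
PRECOMPOSED_OM = "<OM>"

OM_SEQUENCE = "\u0f68\u0f7c\u0f7e"  # U+0F68 U+0F7C U+0F7E
OM_SYLLABLE = "\u0f00"              # U+0F00

def to_precomposed(text: str) -> str:
    """Fold both Unicode spellings into the same pre-composed code."""
    text = text.replace(OM_SEQUENCE, PRECOMPOSED_OM)
    return text.replace(OM_SYLLABLE, PRECOMPOSED_OM)

# Two different Unicode inputs produce identical converted output.
print(to_precomposed(OM_SEQUENCE) == to_precomposed(OM_SYLLABLE))
```

Once both spellings have been folded into one code, nothing in the converted text records which of the two the source contained.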
If you get a BrdaRten/GB18030 encoded text to convert to Unicode, what
do you do with occurrences of the character which maps to U+0F00? Do you
change it to U+0F68 U+0F7C U+0F7E or leave it as U+0F00?
Either way it seems to me there is a problem since the distinction
between U+0F68 U+0F7C U+0F7E & U+0F00 which is there in Unicode (since
U+0F00 has no decomposition) is lost in the process.
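The point about U+0F00 having no decomposition can be checked with Python's standard unicodedata module. This is only a sketch: the helper name normalization_unifies is made up for illustration, and the result reflects whatever Unicode data ships with the Python build in use:

```python
import unicodedata

OM_SYLLABLE = "\u0f00"              # U+0F00
OM_SEQUENCE = "\u0f68\u0f7c\u0f7e"  # U+0F68 U+0F7C U+0F7E

def normalization_unifies(a: str, b: str) -> bool:
    """Return True if any standard normalization form makes a and b equal."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in forms
    )

# An empty string here means the character has no decomposition mapping.
print(repr(unicodedata.decomposition(OM_SYLLABLE)))
print(normalization_unifies(OM_SYLLABLE, OM_SEQUENCE))
```

Since no normalization form brings the two spellings together, a converter cannot lean on normalization to recover the original after the round trip.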
regards
- Chris