Re: ISO 10646 & GB18030 repertoire

From: Christopher Fynn (cfynn@gmx.net)
Date: Fri Jan 07 2005 - 10:15:09 CST

Next message: Peter Kirk: "Re: BrdaRten precomposed Tibetan character set (was Re: ISO 10646 compliance and EU law )"

Previous message: Christopher Fynn: "Re: GB18030 mapping"
In reply to: Christopher Fynn: "Re: ISO 10646 & GB18030 repetoire [was: Re: ISO 10646 compliance and EU law]"

Andrew C. West wrote:

> On Thu, 06 Jan 2005 19:16:38 +0000, Christopher Fynn wrote:
>
>> There seems to be one defect - the charts I've seen seem to contain
>> a pre-composed character equivalent to the combination U+0F68 U+0F7C
>> U+0F7E - It appears they've assumed that U+0F00 can be used as the
>> equivalent to that string. However in Unicode U+0F00 is *not* equivalent
>> to U+0F68 U+0F7C U+0F7E (U+0F00 has no de-composition). I think this
>> means that there would be no round-trip compatibility for this combination.
>
> I think that Chris meant to write "the charts I've seen *do not* seem
> to contain a pre-composed character equivalent to the combination
> U+0F68 U+0F7C U+0F7E".
>
> Andrew

Andrew

Yes, thanks, that's what I intended. They *do not* have a pre-composed
character equivalent to *U+0F68 U+0F7C U+0F7E*.

If you have a Unicode text which you are converting to GB18030 you more
or less have to convert "U+0F68 U+0F7C U+0F7E" to the GB18030 code which
maps to U+0F00 since "U+0F68 U+0F7C U+0F7E" is not going to display
properly on a system requiring pre-composed characters. U+0F00 in the
original Unicode text would map to the *same* GB18030 code.
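To make the many-to-one mapping concrete, here is a minimal sketch in Python. The name to_precomposed and the "<OM>" placeholder are hypothetical; the placeholder merely stands in for whatever GB18030 code the BrdaRten charts assign, which I have not filled in here.

```python
# Sketch only: "<OM>" is a hypothetical placeholder for the precomposed
# code point in the target encoding, not a real GB18030 value.
OM_SYLLABLE = "\u0f00"              # U+0F00, a single character
OM_SEQUENCE = "\u0f68\u0f7c\u0f7e"  # U+0F68 U+0F7C U+0F7E
PRECOMPOSED_OM = "<OM>"


def to_precomposed(text: str) -> str:
    """Map both Unicode spellings onto the one precomposed placeholder."""
    text = text.replace(OM_SEQUENCE, PRECOMPOSED_OM)
    return text.replace(OM_SYLLABLE, PRECOMPOSED_OM)


# Two different Unicode inputs produce the same converted output.
same_output = to_precomposed(OM_SYLLABLE) == to_precomposed(OM_SEQUENCE)
```

Once both spellings have been folded onto one code, nothing in the converted text records which one the original used.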

If you get a BrdaRten/GB18030 encoded text to convert to Unicode, how
do you convert occurrences of the character which maps to U+0F00:
do you change it to U+0F68 U+0F7C U+0F7E or leave it as U+0F00?

Either way it seems to me there is a problem since the distinction
between U+0F68 U+0F7C U+0F7E & U+0F00 which is there in Unicode (since
U+0F00 has no decomposition) is lost in the process.
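The point can be checked with Python's standard unicodedata module. This is only a sketch: the private-use character U+E000 and the encode/decode helper names are assumptions of mine, standing in for the real precomposed code and a real converter.

```python
import unicodedata

OM_SYLLABLE = "\u0f00"              # U+0F00
OM_SEQUENCE = "\u0f68\u0f7c\u0f7e"  # U+0F68 U+0F7C U+0F7E
PLACEHOLDER = "\ue000"              # hypothetical stand-in for the precomposed code

# U+0F00 has no decomposition mapping, so normalization never unifies
# the two spellings in either direction.
no_decomposition = unicodedata.decomposition(OM_SYLLABLE) == ""
distinct_nfd = unicodedata.normalize("NFD", OM_SYLLABLE) != OM_SEQUENCE
distinct_nfc = unicodedata.normalize("NFC", OM_SEQUENCE) != OM_SYLLABLE


def encode(text: str) -> str:
    """Forward conversion: both spellings collapse onto the placeholder."""
    return text.replace(OM_SEQUENCE, PLACEHOLDER).replace(OM_SYLLABLE, PLACEHOLDER)


def decode_as_syllable(data: str) -> str:
    """Reverse conversion, policy A: always produce U+0F00."""
    return data.replace(PLACEHOLDER, OM_SYLLABLE)


def decode_as_sequence(data: str) -> str:
    """Reverse conversion, policy B: always produce the three-character string."""
    return data.replace(PLACEHOLDER, OM_SEQUENCE)


def round_trips(text: str, decode) -> bool:
    """True if converting there and back returns the original text."""
    return decode(encode(text)) == text
```

Whichever reverse policy is chosen, one of the two inputs fails: round_trips(OM_SEQUENCE, decode_as_syllable) and round_trips(OM_SYLLABLE, decode_as_sequence) are both false.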

regards

- Chris


This archive was generated by hypermail 2.1.5 : Fri Jan 07 2005 - 10:20:16 CST