Re: Making use of UTF-16 area for CJK

From: Martin J Duerst (mduerst@ifi.unizh.ch)
Date: Sun Aug 18 1996 - 10:15:11 EDT


Jonathan Rosenne wrote:

>Martin J Duerst wrote:
>>No character standard can solve the problem that new characters get
>>created.
>
>Coding radicals and composition go a long way to help solve this problem.

Hello Jonathan,

You wrote something similar previously, and I thought about answering,
but had no time. Now I think I had better write a few things.

Radical composition techniques can indeed describe a large percentage
of Chinese characters. Among the characters in daily use, it's maybe
80%-90%; within Unicode, it's maybe 95%. The percentage gets higher as
you include rarer characters. The percentage may also be somewhat higher
(or possibly lower, but I doubt it) for characters made up for names.

What is more important, however, is that there is always a remaining
percentage that cannot be coded with radical composition techniques.
Of course, you can introduce new composition operators, but this is
not very productive, because each new operator covers only a handful
of characters. For example, in a list of about 50'000 characters
from the largest Japanese dictionary (Morohashi), there are, I think,
three characters that contain a component turned upside-down
(i.e. rotated 180 degrees).
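
To make the point concrete, here is a minimal sketch in Python. It is
only an illustration under stated assumptions: the operator names
("left-right", "top-bottom", "surround", "rotate-180"), the Compose
type and the is_expressible helper are all hypothetical and not taken
from any standard. A character description is either a plain component
or an operator applied to sub-descriptions, and a description can be
coded only if every operator it uses is in the fixed operator set.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

# Hypothetical operator set: name -> number of components it combines.
OPERATORS: Dict[str, int] = {"left-right": 2, "top-bottom": 2, "surround": 2}


@dataclass(frozen=True)
class Compose:
    """An operator applied to sub-descriptions (illustrative only)."""
    op: str
    parts: Tuple["Desc", ...]


# A description is an atomic component (a string) or a composition.
Desc = Union[str, Compose]


def is_expressible(desc: Desc, operators: Dict[str, int] = OPERATORS) -> bool:
    """True if desc uses only known operators, each with the right arity."""
    if isinstance(desc, str):
        return True  # atomic components are assumed to be coded already
    arity = operators.get(desc.op)
    if arity is None or arity != len(desc.parts):
        return False
    return all(is_expressible(part, operators) for part in desc.parts)


# 好 is 女 next to 子, so the basic operator set covers it.
good = Compose("left-right", ("女", "子"))

# A component rotated 180 degrees needs an operator the set does not have.
# "some-component" is a placeholder, not one of the Morohashi characters.
rotated = Compose("rotate-180", ("some-component",))

print(is_expressible(good))     # True
print(is_expressible(rotated))  # False

# Adding the operator fixes this case, but only this case.
extended = {**OPERATORS, "rotate-180": 1}
print(is_expressible(rotated, extended))  # True
```

However many operators are added this way, the set is fixed at any
given time, so a newly created character can still fall outside it.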

So maybe these techniques "go a long way", but they don't solve
the problem.

Regards, Martin.


