Re: Problems/Issues with CJK and Unicode

From: Rick McGowan (
Date: Fri Apr 07 2000 - 14:43:00 EDT

Hoon Kim said:

> "Sort" would be one of those problems.
> (For Korean and Japanese, you would expect to sort by pronunciation, which
> would differ from the order in which the Unihan characters were placed.)

Yes, but... that's not the whole story. I don't know about Korean;
perhaps Hoon Kim could elaborate.

Sorting Japanese by pronunciation is impossible based purely on Han
characters anyway. Any real-world sorting by pronunciation (e.g. for
databases) uses a kana or romaji sort-key on the side, because the proper
pronunciation is not algorithmically derivable from the kanji alone.
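
A minimal sketch of the side-key approach, not taken from any particular
database. The records, the field layout, and the function name
sort_by_reading are hypothetical. It also assumes that comparing hiragana
by code point is good enough for these four names; real dictionary order
needs a proper collation for voiced marks, small kana, and long vowels.

```python
# Hypothetical records: (kanji spelling, kana reading).
# The reading is entered by hand alongside the name, because it cannot be
# computed from the kanji.
records = [
    ("山田", "やまだ"),  # Yamada
    ("青木", "あおき"),  # Aoki
    ("鈴木", "すずき"),  # Suzuki
    ("田中", "たなか"),  # Tanaka
]


def sort_by_reading(rows):
    """Sort on the kana side key, ignoring the kanji spelling."""
    return sorted(rows, key=lambda row: row[1])


by_reading = sort_by_reading(records)
by_code_point = sorted(records, key=lambda row: row[0])

print([name for name, _ in by_reading])     # order a user would expect
print([name for name, _ in by_code_point])  # order of the kanji code points
```

Sorting on the kanji column alone gives a different order from sorting on
the reading, which is why the reading is carried as a separate key.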

There is no existing coded character set for Japanese that can guarantee
correct sorting based only on code-point ordering. I think Unicode is
therefore not inferior to any other existing encoding for the sorting of
Japanese data.
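
One way to see why no code-point order can do the job is a name whose
spelling has more than one reading. The sketch below is illustrative only:
the list is made up, and it relies on the surname 東 being read either
あずま (Azuma) or ひがし (Higashi). Any ordering keyed on the kanji alone
must keep the two 東 entries together, whereas pronunciation order puts
another name between them.

```python
# Hypothetical name list; the "reading" field is supplied by hand.
people = [
    {"name": "東", "reading": "ひがし"},    # Higashi
    {"name": "井上", "reading": "いのうえ"},  # Inoue
    {"name": "東", "reading": "あずま"},    # Azuma
]

# Keyed on the kanji only: identical spellings compare equal, so the two
# 東 entries always end up next to each other, whatever the encoding.
by_code_point = sorted(people, key=lambda p: p["name"])

# Keyed on the reading: あずま < いのうえ < ひがし, so 井上 falls between them.
by_reading = sorted(people, key=lambda p: p["reading"])

print([p["reading"] for p in by_code_point])
print([p["reading"] for p in by_reading])
```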
