RE: how to sort by stroke (not radical/stroke)

From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Sat May 17 2003 - 06:33:25 EDT

Next message: Philippe Verdy: "Re: Unicode conformant character encodings and us-ascii"

Previous message: Allen Haaheim: "Re: John's Own Version of Unicode Conformance, Version 4.0"
Maybe in reply to: Gary P. Grosso: "how to sort by stroke (not radical/stroke)"
Next in thread: Andrew C. West: "Re: how to sort by stroke (not radical/stroke)"

On Thu, 15 May 2003 21:17:00 +0200, Marco Cimarosti wrote:

> Sure. Anyway, the CJK Radicals Supplement gives a few components which are
> not to be found elsewhere, so maybe the person you were referring to never
> saw them, if he was working with an earlier version of Unicode.

The CJK Radicals Supplement simply provides alternate forms of radicals already
encoded in the Kangxi Radicals block, such as simplified forms, Japanese usage
forms, and variant forms of the same radical that are found in different
positions in the ideographic layout (e.g. U+2E96 is the form of the heart
radical [U+2F3C] when it is a lefthand component of an ideograph, whereas U+2E97
is the form of the heart radical when it is a bottommost component of an
ideograph).
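To make the heart radical example concrete, here is a minimal sketch using Python's standard unicodedata module to look up the three code points mentioned above. The describe() helper and the notes in the table are my own illustrative names, not anything defined by Unicode; the NFKC column simply shows what compatibility normalization does to each character in whatever version of the Unicode data your Python ships with.

```python
import unicodedata

# The three "heart" code points discussed above, with informal notes.
heart_forms = {
    0x2F3C: "Kangxi Radicals block",
    0x2E96: "CJK Radicals Supplement, left-hand form",
    0x2E97: "CJK Radicals Supplement, bottom form",
}

def describe(cp):
    """Return (character name, NFKC result as 'U+XXXX') for one code point."""
    ch = chr(cp)
    name = unicodedata.name(ch)
    folded = unicodedata.normalize("NFKC", ch)
    return name, "U+%04X" % ord(folded)

for cp, note in heart_forms.items():
    name, folded = describe(cp)
    print("U+%04X  %-24s  NFKC -> %s  (%s)" % (cp, name, folded, note))
```

The Kangxi radical carries a compatibility mapping to the ordinary unified ideograph U+5FC3, so NFKC folds it away; that is worth remembering if you intend to keep radical characters distinct in a radical index.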

They are useful for describing CJK ideographs in conjunction with Ideographic
Description Characters, but I do not think that there is an actual formalised
Ideographic Description subsystem within Unicode that is intended to be able to
represent all possible CJK ideographs by breaking them down into their component
elements. I would imagine that the Kangxi Radicals and the CJK Radicals
Supplement blocks are intended primarily to be used for typesetting Radical
indexes, and that their usefulness in describing ideographs in conjunction with
IDCs is just an added bonus that was probably not even considered by the UTC
when they were accepted for encoding. (BTW, I never really understood why the
Kangxi Radicals were encoded separately in the first place, given that they are
all duplicates of pre-existing CJK ideographs.)

As I said previously, the 214 Kangxi radicals are only a small (albeit important)
subset of all the ideographic components needed to describe CJK ideographs. To
put it in context, the dictionary _Shuowen Jiezi_ compiled by Xu Shen in about
100 A.D. (the first dictionary to use the radical system) has 540 radicals,
whilst the 6th century _Yu Pian_ uses a slightly different set of 542 radicals
(I assume that all of these radicals are encoded within Unicode, but I haven't
checked that yet, and some of them are *very* obscure). Without wishing to give
a lecture on ideographic composition, I should point out that radicals are only
one type of ideographic component, the other most important type being phonetic
elements. The vast majority of phonetic elements are ideographs in their own
right, but some phonetic elements that have evolved graphically may differ from
the form of the element as a standalone ideograph, and may thus not be encoded
within Unicode. Whilst it may be useful to have such
non-ideographic elements available for describing ideographs in conjunction with
IDCs, I doubt that any proposal for their encoding would get past the UTC
without pre-existing examples of their usage ... and off-hand I can't think of
any examples of textual usage of such unencoded ideographic elements.

I don't know what the 100 or so unencoded ideographic components that my
informant mentions are, but I can give an example of my own. The ideograph
U+8CAC ZE2 is composed of an unencoded element above the ideograph U+8C9D BEI4
(the character's radical). The unencoded element is actually an evolution of the
ideograph U+673F CI4, which acts as the character's phonetic [see Karlgren's
Grammata Serica #868]. (In the ideograph U+6BD2 DU2 "poisonous", the same
unencoded element above the ideograph U+6BCB WU2 "not, without" is probably
derived from the ideograph U+751F SHENG1 "life", the whole character being a
rebus for "not life".)
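As a rough illustration of how such a character might be written with Ideographic Description Characters, the sketch below builds an above-to-below description (U+2FF1) for U+8CAC and U+6BD2 and checks that each is well formed in prefix notation. It assumes the twelve IDCs U+2FF0..U+2FFB, of which U+2FF2 and U+2FF3 take three operands and the rest two. The PLACEHOLDER value and the is_well_formed_ids() function are hypothetical names of my own; the fullwidth question mark merely stands in for the unencoded element and is not something I am claiming Unicode prescribes for this purpose.

```python
# IDCs U+2FF0..U+2FFB: U+2FF2 and U+2FF3 take three operands, the rest two.
TERNARY = {"\u2FF2", "\u2FF3"}
BINARY = {chr(cp) for cp in range(0x2FF0, 0x2FFC)} - TERNARY

# Hypothetical stand-in for a component that has no code point of its own.
PLACEHOLDER = "\uFF1F"

def is_well_formed_ids(s):
    """Return True if s is exactly one complete prefix-notation description."""
    needed = 1  # number of components still expected
    for ch in s:
        if needed == 0:
            return False      # extra characters after a complete description
        if ch in BINARY:
            needed += 1       # fills one slot, opens two
        elif ch in TERNARY:
            needed += 2       # fills one slot, opens three
        else:
            needed -= 1       # an ordinary component fills one slot
    return needed == 0

# U+8CAC: unencoded element above U+8C9D; U+6BD2: same element above U+6BCB.
ze = "\u2FF1" + PLACEHOLDER + "\u8C9D"
du = "\u2FF1" + PLACEHOLDER + "\u6BCB"

print(is_well_formed_ids(ze), is_well_formed_ids(du))
```

The check is purely structural: it counts operands and says nothing about whether the description is a sensible decomposition of any real ideograph.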

Regards,

Andrew


This archive was generated by hypermail 2.1.5 : Sat May 17 2003 - 07:20:12 EDT