Re: CJK Ideograph Fragments

From: Uriah Eisenstein
Date: Sat May 08 2010 - 13:44:59 CDT

    I've gone through the policies for submitting new characters and scripts, and
    they don't look encouraging :) But neither do they seem to reject the idea
    of character fragments out of hand, as opposed to the reverse case:
    characters which can be expressed using existing characters and combining
    marks. In fact, the CJK Radicals Supplement block and the Hangul Jamo both
    contain character fragments, in a way. But I suppose these already existed
    in national standards rather than being introduced by Unicode.

    In any case, examples I've seen of proposals cite experts and provide font
    makers, neither of whom I have contact with. So I guess I'll drop it for
    now, and hope that if someone takes it up I'll see it on the mailing list.


    On Sun, May 2, 2010 at 3:06 PM, Uriah Eisenstein wrote:

    > Not exactly, but I suppose such Hanzi fragments could be used for similar
    > purposes - e.g. looking up characters by components, where the available
    > components may include non-character fragments. Some fragments may be useful
    > for IME purposes, but probably not all.
    > On Sat, May 1, 2010 at 8:57 PM, Edward Cherlin wrote:
    >> 2010/4/28 John H. Jenkins:
    >> > No. You could certainly write up a proposal and submit it to the UTC.
    >> > Should the UTC feel the idea has merit, it would then move it on to WG2
    >> > and/or the IRG.
    >> > The main problem here is that there is a very strong desire to limit
    >> > ideograph encoding to attested and documentable forms. Anything which
    >> > does not exist in actual texts is not likely to be well-regarded.
    >> I had the idea some years ago of writing up a proposal to encode the
    >> hanzi fragments used in Cangjie Shurufa IMEs. These fragments are used
    >> extensively in dozens of howto books on keyboarding in Cangjie. This
    >> includes the pieces (mostly real characters, with some radicals) used
    >> on keyboard labels, and the common forms they stand for. I didn't get
    >> any interest from the Cangjie development community or the authors of
    >> a book on Cangjie that I have, so I abandoned the idea.
    >> Uriah, is this the sort of thing you have in mind?
    >> > Similarly, the
    >> > UTC has a strong preference not to encode anything which isn't in
    >> > actual use. Proposals to encode characters because they would be useful
    >> > if encoded even though they aren't actually being used right now are
    >> > generally looked on with disfavor.
    >> >
    >> > On Apr 28, 2010, at 12:03 PM, Uriah Eisenstein wrote:
    >> >
    >> > Hello,
    >> > My question is about common components of CJK Ideographs which are not
    >> > encoded as independent Han characters (and perhaps indeed aren't). A
    >> > good example is the right-hand part of the character 漢 itself: it is a
    >> > distinct component appearing in multiple other characters, but is not
    >> > encoded to the best of my knowledge. The same goes for the top part of
    >> > 鳥 and 島, the surrounding part of 與 and 興 and several others. My
    >> > question is whether there are any plans or discussions for encoding
    >> > these fragments in Unicode.
    >> >
    >> > (I haven't found anything about this in mailing list archives; I did
    >> > find statements that Unicode does not intend to provide any
    >> > decomposition data of Han characters :) And for good reasons. However,
    >> > such fragments may well be useful for third-party software dealing with
    >> > 漢字 glyph generation, lookup by components etc.)
    >> >
    >> > Thanks,
    >> > Uriah Eisenstein
    >> >
    >> >
    >> --
    >> Edward Mokurai (默雷/धर्ममेघशब्दगर्ज/دھرممیگھشبدگر ج) Cherlin
    >> Silent Thunder is my name, and Children are my nation.
    >> The Cosmos is my dwelling place, the Truth my destination.
