Re: Codes for Individual Chinese Brushstrokes

From: Andrew C. West
Date: Fri Feb 20 2004 - 06:57:00 EST

    On Thu, 19 Feb 2004 18:27:09 -0800 (PST), Kenneth Whistler wrote:
    > Of the 64 entities listed on the page:
    > *none* of them are encoded, and *none* of them are "standard"
    > enough to merit consideration -- if by consideration you mean
    > separate encoding as characters.

    I'm not sure about "*none* of them are encoded". As far as I can tell, most of
    the basic ideographic stroke forms are either already encoded in CJK and CJK-B
    or are proposed in CJK-C (where "encoded" here means "encoded in their own
    right" or "can be represented by same-shaped ideographs").

    See for example the IRG document
    which states:

    Although most ideographic strokes have been encoded in CJK (including Ext.A and
    B) or submitted to CJK_C1 by IRG members, there are two ideographic strokes are
    found missing. Ideographic strokes are important for ideograph decomposition,
    analysis and for making ideographic strokes subset. Chinese linguists suggest to
    add these two ideographic strokes to CJK_C1.

    I also remember reading one WG2 document that explicitly raised the question of
    how to deal with all the ideographic strokes proposed in CJK-C that are not
    distinct ideographs in their own right, although I can't seem to locate that
    document any more.

    All except one of the eight basic strokes mentioned at
    <> are *representable*
    using existing characters in the CJK and/or Kangxi Radicals blocks:

    dot = U+4E36 or U+2F02 [KANGXI RADICAL DOT]
    dash = U+4E00 or U+2F00 [KANGXI RADICAL ONE]
    perpendicular downstroke = U+4E28 or U+2F01 [KANGXI RADICAL LINE]
    downstroke to the left or left-falling stroke = U+4E3F or U+2F03 [KANGXI RADICAL SLASH]
    wavelike stroke or right-falling stroke = U+4E40
    hook = U+4E85 or U+2F05 [KANGXI RADICAL HOOK], as well as U+4E5A and U+2010C
    upstroke to the right =
    bend or twist = U+4E5B and U+200CC
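
    For anyone who wants to check the list above against the character
    database, here is a minimal sketch in Python using the standard
    unicodedata module. The STROKES table and the helper names are my own
    illustration, not part of any standard, and the names printed depend on
    the Unicode version bundled with the interpreter.

```python
import unicodedata

# Illustrative table (hypothetical name): stroke label -> code points listed above.
STROKES = {
    "dot": [0x4E36, 0x2F02],
    "dash": [0x4E00, 0x2F00],
    "perpendicular downstroke": [0x4E28, 0x2F01],
    "left-falling stroke": [0x4E3F, 0x2F03],
    "right-falling stroke": [0x4E40],
    "hook": [0x4E85, 0x2F05, 0x4E5A, 0x2010C],
    "upstroke to the right": [],  # the one stroke with no candidate in the list
    "bend or twist": [0x4E5B, 0x200CC],
}


def describe(cp):
    """Format a code point as 'U+XXXX NAME', using the database name if any."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}"


def unrepresented(table):
    """Return the stroke labels that have no candidate code point."""
    return [label for label, cps in table.items() if not cps]


def compat_equal(a, b):
    """True if two code points are the same after NFKC normalization."""
    return unicodedata.normalize("NFKC", chr(a)) == unicodedata.normalize("NFKC", chr(b))


if __name__ == "__main__":
    for label, cps in STROKES.items():
        print(label, "=", " or ".join(describe(cp) for cp in cps) or "(none)")
    print("not representable:", unrepresented(STROKES))
```

    The compat_equal helper reflects the fact that each Kangxi radical has a
    compatibility decomposition to the same-shaped unified ideograph, so for
    instance U+2F02 and U+4E36 compare equal after NFKC normalization.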

    I concur with Ken that the 8x8 stroke categorization given at this web site is
    largely artificial. Whilst it may be useful to encode general ideographic stroke
    forms to help in the analysis and decomposition of ideographs, in my opinion the
    minute distinctions in the way that dots and dashes are written in various
    individual ideographs are beyond the scope of a character encoding system as the
    exact shape of a dot or length of a dash is irrelevant to any analysis of the
    compositional structure of an ideograph.

