Re: Codes for Individual Chinese Brushstrokes

From: Frank Yung-Fong Tang
Date: Fri Feb 20 2004 - 10:13:14 EST

    As a native Chinese person, I believe:
    1. The so-called "eight basic strokes" are very "standard" in concept,
    but there are only 8 of them.
    2. The page lists 8 different variants for each of the 8 "basic
    strokes". But if you read that page carefully, it does not say that
    there are only 8 variants for each stroke, nor does it mean that people
    can distinguish those variants from each other. For example, most
    Chinese readers will think the first "Dot" from the left is the same as
    the fourth "Dot" from the left. The differences between them are really
    a matter of "style". Therefore, it is not a good idea to encode those
    "variants".
    3. There are more composite strokes if you really want to encode
    strokes. For example:

    Andrew C. West wrote:

    > On Thu, 19 Feb 2004 18:27:09 -0800 (PST), Kenneth Whistler wrote:
    > >
    > > Of the 64 entities listed on the page:
    > >
    > >
    > >
    > > *none* of them are encoded, and *none* of them are "standard"
    > > enough to merit consideration -- if by consideration you mean
    > > separate encoding as characters.
    > >
    > I'm not sure about "*none* of them are encoded". As far as I can tell,
    > pretty much most of the basic ideographic stroke forms are either
    > already encoded in CJK and CJK-B or are proposed in CJK-C (where
    > "encoded" here means "encoded in their own right" or "can be
    > represented by same-shaped ideographs").
    > See for example the IRG document

    > which states :
    > <quote>
    > Although most ideographic strokes have been encoded in CJK (including
    > Ext.A and B) or submitted to CJK_C1 by IRG members, there are two
    > ideographic strokes are found missing. Ideographic strokes are
    > important for ideograph decomposition, analysis and for making
    > ideographic strokes subset. Chinese linguists suggest to add these two
    > ideographic strokes to CJK_C1.
    > </quote>
    > I also remember reading one WG2 document that explicitly raised the
    > question of how to deal with all the ideographic strokes proposed in
    > CJK-C that are not distinct ideographs in their own right, although I
    > can't seem to locate that document any more.
    > All except one of the eight basic strokes mentioned at <> are
    > *representable* using existing characters in the CJK and/or Kangxi
    > Radicals blocks :
    > dot = U+4E36 or U+2F02 [KANGXI RADICAL DOT]
    > dash = U+4E00 or U+2F00 [KANGXI RADICAL ONE]
    > perpendicular downstroke = U+4E28 or U+2F01 [KANGXI RADICAL LINE]
    > downstroke to the left or left-falling stroke = U+4E3F or U+2F03
    > [KANGXI RADICAL SLASH]
    > wavelike stroke or right-falling stroke = U+4E40
    > hook = U+4E85 or U+2F05 [KANGXI RADICAL HOOK], as well as U+4E5A and
    > U+2010C
    > upstroke to the right =
    > bend or twist = U+4E5B and U+200CC
    > I concur with Ken that the 8x8 stroke categorization given at this web
    > site is largely artificial. Whilst it may be useful to encode general
    > ideographic stroke forms to help in the analysis and decomposition of
    > ideographs, in my opinion the minute distinctions in the way that dots
    > and dashes are written in various individual ideographs are beyond the
    > scope of a character encoding system as the exact shape of a dot or
    > length of a dash is irrelevant to any analysis of the compositional
    > structure of an ideograph.
    > Andrew
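
    The code points listed in the quoted message can be checked
    mechanically. What follows is only a minimal sketch in Python, using
    the standard unicodedata module. The names BASIC_STROKES, describe and
    kangxi_to_unified are made up for illustration and are not from the
    thread. The "upstroke to the right" entry is left empty because the
    quoted message gives no value for it. The sketch also assumes that
    NFKC normalization maps each Kangxi radical to its same-shaped unified
    ideograph, which is how the Kangxi Radicals block is defined in the
    Unicode Character Database.

```python
import unicodedata

# Stroke names and code points as listed in the quoted message.
# "upstroke to the right" has no value in the message, so it stays empty.
BASIC_STROKES = {
    "dot": [0x4E36, 0x2F02],
    "dash": [0x4E00, 0x2F00],
    "perpendicular downstroke": [0x4E28, 0x2F01],
    "left-falling stroke": [0x4E3F, 0x2F03],
    "right-falling stroke": [0x4E40],
    "hook": [0x4E85, 0x2F05, 0x4E5A, 0x2010C],
    "upstroke to the right": [],
    "bend or twist": [0x4E5B, 0x200CC],
}


def describe(cp):
    """Format a code point as 'U+XXXX <char> <character name>'."""
    ch = chr(cp)
    # Fall back to a placeholder if this Python build has no name for it.
    name = unicodedata.name(ch, "<no name in this Unicode version>")
    return f"U+{cp:04X} {ch} {name}"


def kangxi_to_unified(cp):
    """Return the code point that NFKC normalization maps cp to.

    For a Kangxi radical this is a single unified ideograph; for a
    character with no compatibility mapping it is cp itself.
    """
    return ord(unicodedata.normalize("NFKC", chr(cp)))


for stroke, cps in BASIC_STROKES.items():
    print(stroke)
    for cp in cps:
        print("   ", describe(cp))
```

    For the pairs written as "X or Y" in the list, kangxi_to_unified(Y)
    should return X, for example kangxi_to_unified(0x2F02) gives 0x4E36.
    That is one way to see that the Kangxi radical and the CJK ideograph
    are two encodings of the same shape, not two different strokes.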
