Re: Codes for Individual Chinese Brushstrokes

From: Kenneth Whistler
Date: Thu Feb 19 2004 - 21:27:09 EST

    Michael Everson asked:

    > At 14:14 -0800 2004-02-19, John Jenkins wrote:
    > >As a rule, no. Strokes are fragments of characters, not characters
    > >in their own right. There are some Chinese strokes encoded for
    > >various reasons, but there is no intention of ever providing an
    > >exhaustive catalog of strokes.
    > But of the 64 entities in that list, how many are encoded, and how
    > many are "standard" enough to merit consideration? I think that's
    > what the questioner was asking.

    Of the 64 entities listed on the page:

    *none* of them are encoded, and *none* of them are "standard"
    enough to merit consideration -- if by consideration you mean
    separate encoding as characters.

    If you read the page, you can see that it is arguing the case
    for a graphemic analysis which posits 8 basic strokes for
    Chinese characters, which then have a bunch of allographs.

    So, in our terminology, we are talking about allographic
    entities of glyphs, rather than abstract characters.

    And you should be very, very suspicious that there are exactly
    8 allographs listed for each of 8 basic stroke types. This is
    the kind of superstitious numerology that infests some kinds
    of traditional analyses. It just happens that '8' is a
    very lucky number in Chinese, does it?

    If you want to know how many stroke types there really are
    and how their forms are modified in context in various
    Chinese characters, you should consult with Tom Bishop and
    Richard Cook, who have an extensive catalog of basic stroke
    types and forms based on the usage of CDL in the Wenlin
    system for constructing Chinese character glyphs.

    *Any* proposal for encoding strokes for Chinese characters
    as characters in and of themselves would need to be based
    on an IT argument for their use as characters. (E.g. an
    information processing system that needed codes for strokes,
    in addition to codes for radicals and components, for
    discussing and constructing Chinese character forms. Wenlin
    is itself such a system, of course.) And a proposal
    would need a comprehensive modeling behind it to explain
    and justify the particular collection of strokes to encode.
    Appeals to printed tables or lists of these things from
    calligraphy sites are insufficient.

    The dots are an interesting case. There is really just one
    stroke here, but its exact shape is conditioned by its
    position and order in the drawing of a character. The details
    of the shape depend on the position of the brush when it first
    touches the paper, whether the angle of the brush is moved
    during the dot, what angle the brush is pushed down at to
    make the body of the dot, how heavily the brush is pushed
    down, and whether the brush is then simply lifted back up
    from the heavy part of the dot, or whether the tip of the
    brush is trailed out of the dot as part of the trajectory
    to the next stroke. (In cursive styles, the tip may actually
    be dragged across the paper to the beginning of the next
    stroke, so in the third exemplar in the dot chart, you
    would drag the tip across from the leftmost dot until you
    reached the start point of the rightmost dot, and then
    press down the heel of the brush to finish that dot.)

    This is the kind of detail that comes from calligraphy,
    and which influences Chinese character font design -- but
    all that level of detail is *not* what needs to be
    distinguished in coming up with an appropriate set of
    stroke primitives for representation of Chinese character
    structure in IT processing, for example.
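
    To make the distinction concrete, here is a minimal sketch in
    Python. Everything in it is an assumption for illustration only:
    the StrokeType members are an arbitrary subset rather than a
    proposed inventory, and the BrushStroke fields simply mirror the
    calligraphic variables described above. It is not how Wenlin or
    CDL models strokes. The point is only that several visually
    distinct dots collapse to one primitive once the brush-level
    detail is dropped.

```python
from dataclasses import dataclass
from enum import Enum


class StrokeType(Enum):
    # Illustrative subset only; not a proposed or standardized inventory.
    DOT = "dot"
    HORIZONTAL = "horizontal"
    VERTICAL = "vertical"


@dataclass(frozen=True)
class BrushStroke:
    """One calligraphic rendering of a stroke (hypothetical fields)."""
    stroke_type: StrokeType
    entry_angle_deg: float  # angle of the brush as it first touches the paper
    pressure: float         # how heavily the brush is pushed down, 0.0 to 1.0
    trails_to_next: bool    # whether the tip is trailed out toward the next stroke


def to_primitives(strokes):
    """Drop calligraphic detail, keeping only the sequence of stroke types."""
    return [s.stroke_type for s in strokes]


# Three differently shaped dots: distinct as brush strokes,
# identical as structural primitives.
dots = [
    BrushStroke(StrokeType.DOT, 45.0, 0.8, False),
    BrushStroke(StrokeType.DOT, 30.0, 0.5, True),
    BrushStroke(StrokeType.DOT, 60.0, 0.9, False),
]
primitives = to_primitives(dots)
distinct_primitives = set(primitives)
```

    A font designer cares about the three BrushStroke values; a
    system describing character structure needs only what
    to_primitives returns.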

