Re: Writing a proposal for an unusual script: SignWriting

From: vanisaac@boil.afraid.org
Date: Sun Jun 13 2010 - 20:41:40 CDT


    From: Stephen Slevinski (slevinski@gmail.com)
    > Hi Van,
    >
    > I think the paradigm shift needed here is that SignWriting is not
    > HamNoSys. We write symbols in space. The writer decides what symbols
    > to use and where they go. The script does not include the semantic
    > information you want it to include because it is not part of the writing
    > system.

    Are you saying there is no meaning attached to placement in SignWriting?

    > The font designer has very limited choices regarding glyph size
    > and shape. This is one of the rules that SignWriting knowingly breaks.

    How? What does it mean when you have a larger or smaller hand sign? Why do you assume that a font designer can't have differently sized and shaped modifications of a particular glyph? If it encodes semantic (in the broad sense) information, then any implementation MUST allow for and represent this information. If the distinction is purely a matter of personal preference, then a character encoding model cannot, should not, and will not encode that information.

    > Many sign language avatars are using HamNoSys as a base notation to
    > generate their signing animations. I'm sure someone could use HamNoSys
    > to create SignWriting images and give the font designer much more freedom.
    >
    > The point of my proposal will be to encode how we write today. When
    > people write, they choose symbols and placements. They do not encode
    > semantic language information.

    Are you saying that SignWriting does not represent the semantics of sign languages? Are you saying that users place the elements of a sign haphazardly with no meaning attached to the placement whatsoever?

    > The spatial relation between the symbols
    > is meaningful to the writer.

    Does it have meaning to anyone else? If it doesn't, it is really just a personal code, and doesn't belong in Unicode. If it has meaning for someone else who would read it, what is that meaning? If this information is to be transmitted and read by another person, there has to be some meaning that is intended by the writer and is understood by the reader. WHAT IS THAT MEANING?

    > You can decide that it is a poor choice for
    > a writing system, but it is the writing system.

    What you are describing is not a writing system, but I don't think you actually mean what you seem to be saying. A writing system has definable semantics. An individual element of a writing system has an understood meaning. There may be some individuals who interpret the rules slightly differently, but they are all fundamentally trying to put something ephemeral on paper or on a computer: to make a permanent record of spoken or signed language.

    > > What is the salience of a hand being 5 coordinate points low of center vs.
    > > 10? If two people encoded a particular sign, would they necessarily use the
    > > same coordinates? How do I search for a given sign when the coordinates can be
    > > different? Does the /search/ algorithm need to know that there is about 10
    > > coordinate points variation for a hand touching the side of the torso, but only
    > > 2 when it's touching a part of the face (lips/chin/cheek)? If your coordinate
    > > system has to incorporate "fine" adjustments to look right, how can I do a
    > > search of the dictionary to find all of the signs where the eyes are closed and
    > > the hand or fingers brush the chin?
    >
    > Eyes closed can be searched for as BaseSymbol 535. A single brush
    > contact can be searched for as BaseSymbol 526. Fingers brush chin is
    > not part of the writing system. HamNoSys has this type of information,
    > but this is not part of SignWriting.

    And English writing doesn't include intonation. It was an example. SignWriting /does/, however, represent signed language. It has rules for doing so. THAT is what needs to be codified in a character encoding model.

    > You could further limit the search if you included the appropriate
    > handshape. As an example, let's use BaseSymbol 22: Index and middle
    > fingers together as a unit.
    >
    > Searching for a sign with these 3 BaseSymbols would be the initial
    > search. Next, we'd need to analyze the symbol placement. We'd be
    > looking for the brush and handshape to be slightly below the center of
    > the sign. Our search results would sort by a percentage match.

    Then it is not a plain text search algorithm. A character search provides /matches/ only. This is a binary yes/no. Either this word/sign matches or it doesn't. There are no percentages.
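
    To illustrate what I mean by a plain text search, here is a minimal Python sketch. The function name find_all and the sample text are mine, purely for illustration. A query either occurs at a given offset or it does not; nothing in the result is a score.

```python
def find_all(text, query):
    """Return every offset at which query occurs exactly in text."""
    hits = []
    start = text.find(query)
    while start != -1:
        hits.append(start)
        # Continue looking just past the start of the previous hit.
        start = text.find(query, start + 1)
    return hits


sample = "sign writing and signing"
exact_hits = find_all(sample, "sign")  # offsets of exact occurrences
no_hits = find_all(sample, "sing")     # a near miss is simply not a match
print(exact_hits, no_hits)
```

    A ranked "percentage match" can certainly be built on top of this, but that is an application-level feature, not something the character encoding itself has to provide.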

    > > If you can define any element as having any coordinate, how do you
    > > normalize text when someone defines the left hand before the right?

    > Text normalization can occur as the text is written or after. While
    > writing, a dictionary can be accessed and previously made signs can be
    > used.

    That's not a normalization algorithm. A normalization algorithm will take any sign created - whether you can find it in a dictionary or not - and put it in a standardized order. Take a look at the Unicode Canonical Ordering Behavior. It takes diacritic marks and rearranges them so you can search for matching characters. If it didn't reorder them, then you would have two different sequences that each represented, for example, a Latin small letter f with an acute above and a dot below: 'f' + acute + under_dot, OR 'f' + under_dot + acute. As it turns out, the second is correct, and the Canonical Ordering Algorithm would automatically rearrange the first, no dictionary involved.
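
    You can see this behavior for yourself with Python's standard unicodedata module, whose NFD form applies the canonical ordering. This is only a small sketch, and the variable names are mine.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, canonical combining class 230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, canonical combining class 220

first = "f" + ACUTE + DOT_BELOW   # 'f' + acute + under_dot
second = "f" + DOT_BELOW + ACUTE  # 'f' + under_dot + acute

# NFD reorders combining marks by ascending combining class.
norm_first = unicodedata.normalize("NFD", first)
norm_second = unicodedata.normalize("NFD", second)

print(first == second)            # the raw sequences differ
print(norm_first == norm_second)  # after normalization they are identical
print(norm_first == second)       # and both end up in the second order
```

    The marks are sorted by their combining class, so the dot below (220) always ends up before the acute (230), regardless of the order in which they were typed.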

    > After the writing, a dictionary can be accessed and signs can be
    > searched and matched. If two signs are close enough, the sign in the
    > text can be replaced by the sign in the dictionary. If the two signs
    > are distinct enough, the sign in the text can be marked as a possible
    > variant to the sign in the dictionary.
    >
    >
    > > These are basic text tasks that the coordinate system makes insanely complex.
    > >
    > >
    > A unique challenge yes, but a unique writing system.
    >
    > > Element: [HandR], [HandL]. Position Modifiers: [HeadTop], [HeadCheek],
    > > [HeadOppositeCheek], [HeadChin], [HeadNose], [HeadTemple], [BodyHigh],
    > > [BodyLow], [BodyCenterHigh], [BodyCenterLow], [BodyCenterMid], [BodyWideHigh],
    > > [BodyWideLow], [BodyWideMid], Default: [BodyMid].
    > I see HamNoSys, not SignWriting. That's not how we write.

    So how do you write? Is there no meaning attached to element placement? If not, then placement, by definition, does not matter. As such, it wouldn't belong in a character encoding model. I don't think that's the case. You know the writing system, I don't. What meaning is attached to placement? That MEANING must be encoded!

    > > A character encoding does not define them by their precise position, it
    > > specifies a /meaning/. The font designer has to figure out exactly how best to
    > > graphically represent that.

    > A character encoding pairs a character from a given repertoire with a
    > code point in order to facilitate transmission and storage.

    No. A character encoding represents all semantically relevant information in a writing system to enable the transmission and storage of that writing system. It defines the representation of salient, semantic meaning represented by a writing system.

    > The current way that we write puts all of the flexibility to the
    > writer. The spatial relation between the symbols is meaningful and was
    > specifically chosen by the writer. The font designer does not get to
    > decide how to position the symbols.

    What is the meaning? That's the only information that is germane to a character encoding. The vagaries of individual usage are actually irrelevant to a character encoding.

    To the contrary, the font designer is the only one who can decide how to position the symbols to impart that information. From your perspective, the font designer needs to be the writer, not the end user. The designer will hopefully understand the system and the representative glyph images, and then adapt them to their own design goals, enabling the end user to write to their satisfaction. That's what makes it a character encoding: different designers will interpret it to meet their needs and the needs of their users. They design a specific font to meet those needs. Not only do you not get to tell them exactly how to do that, but you have to enable them to interpret the writing system however they want to.

    > In the future, when we have a large corpus of independently verified
    > excellent writing, we can analyze that writing to see if we can create
    > rules of attachment that can reproduce the writing without coordinates.

    It actually should NOT reproduce the writing, it should only /represent/ the writing.

    > That corpus will exist in the future and it will be encoded in Binary
    > SignWriting with the ISWA 2010. However, the problems of encoding
    > SignWriting for Unicode go beyond the coordinate based system all the
    > way to the alphabet. To properly follow all of the rules, the alphabet
    > itself will need to be analyzed and refactored.
    >
    > The length and breadth of trying to apply all of the rules of Unicode to
    > SignWriting would require starting from ground zero and rebuilding the
    > whole system. This work will take years and may or may not produce a
    > usable system.

    I think, by carefully analysing the system right now, a character encoding can be created.

    > I believe a slightly modified version of Binary SignWriting will add
    > tremendous value to Unicode. It only makes sense for the universal
    > character set to be able to encode the sign languages of the world. If
    > Unicode includes simplified and traditional Chinese, why can't it
    > include Binary SignWriting and Normalized SignWriting?

    I believe it is essential for Unicode to encode Sutton SignWriting. I'm not sure what the distinction is between Binary and Normalized SignWriting, but if the difference is between representing all variations of individual usage and only representing the abstract meaning behind the writing system, then the distinction is quite clear: Binary SignWriting is a glyph encoding model, Normalized SignWriting is a character encoding model. One of these belongs in Unicode, and the other just plain doesn't. Glyphs are the responsibility of the font designer and only the font designer, who will execute the character repertoire as glyphs.

    > Regards,
    > -Steve

    My best,

    Van


