RE: FPDAM5: Egyptian hieroglyphs (was Re: Marks)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Sep 30 2007 - 07:12:35 CST


    > -----Original Message-----
    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
    > Behalf Of Serge Rosmorduc
    > Sent: Sunday, September 30, 2007 12:58
    > To: verdy_p@wanadoo.fr
    > Cc: 'Unicode Mailing List'
    > Subject: Re: FPDAM5: Egyptian hieroglyphs (was Re: Marks)
    >
    > Philippe Verdy wrote:
    > > Also I just wonder how the proposed encoding can be sufficient to
    > > correctly encode any Hieroglyphic texts, given that it contains NO
    > > combining character, and no layout control characters for representing
    > > the quadrat layout.
    > >
    > >
    > It was decided to leave the sign layout outside of the character
    > encoding, as the simple combining characters you think of, if they give
    > an acceptable approximation, are far from covering all possibilities. In
    > particular, in some cases, Egyptologists will require exact positioning --
    > so, basically, one would need to put into Unicode a system which is quite
    > clearly outside its bounds.

    I can understand that there are some special applications that will need
    very precise positioning and glyph layout. But the same could be said of
    ALL scripts.

    The main problem I see is that ALL hieroglyphic texts, even the most basic
    ones where no advanced positioning is needed, will require a specific
    renderer and, even worse, specific encoding conventions.

    My intent is not to represent the EXACT layout, but the composition
    relations that exist between the parts. The most basic relation is the
    quadrat grouping, when such grouping exists; then come the "before" and
    "above" relations, which do not necessarily specify the exact layout, but
    capture the most frequent way semantic distinctions are made, for the most
    frequent use: the determinatives.

    A <h1, before, one> relation is semantically very different (in most cases
    it gives the figurative meaning of h1) from <h1, above, one> (in most cases
    the logographic meaning of h1) and from <h1> alone (the phonetic value of
    h1). If we just encode <h1> and <one> as symbols, absolutely NO text makes
    any sense, because it is FULLY ambiguous.
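
    To make the ambiguity concrete, here is a minimal Python sketch. The Group
    type, the flatten helper and the variable names are hypothetical, and "h1"
    and "one" are only the placeholder sign names used above, not real code
    points. It shows nothing more than this: three structurally different
    groups collapse to the same flat sign sequence once the relation is
    dropped.

```python
from typing import List, NamedTuple, Optional


class Group(NamedTuple):
    """One sign, optionally tied to a second sign by a composition relation."""
    base: str
    relation: Optional[str]  # "before", "above", or None (hypothetical labels)
    other: Optional[str]


def flatten(groups: List[Group]) -> List[str]:
    """Keep only the signs, as an encoding without relations would."""
    signs = []
    for group in groups:
        signs.append(group.base)
        if group.other is not None:
            signs.append(group.other)
    return signs


# Three readings that the relations keep apart.
figurative = [Group("h1", "before", "one")]
logographic = [Group("h1", "above", "one")]
phonetic_then_one = [Group("h1", None, None), Group("one", None, None)]

for reading in (figurative, logographic, phonetic_then_one):
    print(flatten(reading))  # each prints ['h1', 'one']
```

    The structured values differ, but a reader of the flattened sequence
    cannot tell which of the three was intended.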

    It's acceptable not to encode one of the relations, i.e. the simple
    juxtaposition without grouping ("-" in MdC notation), which is the most
    frequent case. But making all relations equal by not encoding any of them
    does not seem reasonable.

    I wonder why there's absolutely no encoding proposed for treating the basic
    relations (":" and "*" in MdC) as format controls, and modifications ("\"
    and possibly the rotations) as combining modifiers: this would preserve the
    possibility of encoding hieroglyphs in plain text.
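
    As a rough illustration of how little structure these operators carry,
    here is a sketch of a parser for a tiny MdC subset. It assumes only three
    operators: "-" between quadrats, ":" for stacking, and "*" for
    side-by-side placement, with "*" binding tighter than ":". Parentheses,
    the "\" modifiers, rotations, cartouches and everything else in MdC are
    deliberately ignored, and parse_mdc is a hypothetical name. The Gardiner
    codes in the example serve purely as tokens.

```python
from typing import List


def parse_mdc(text: str) -> List[List[List[str]]]:
    """Parse a tiny MdC subset into quadrats -> stacked rows -> signs.

    "-" separates quadrats, ":" stacks rows inside a quadrat, and "*"
    places signs side by side inside a row.
    """
    quadrats = []
    for quadrat in text.split("-"):
        rows = [row.split("*") for row in quadrat.split(":")]
        quadrats.append(rows)
    return quadrats


layout = parse_mdc("A1-B1:C1*D1")
print(layout)
# One quadrat holding A1 alone, then one quadrat with B1 above (C1 beside D1):
# [[['A1']], [['B1'], ['C1', 'D1']]]
```

    Two format controls and an implicit default for "-" would be enough to
    carry this much structure in the character stream itself.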

    Otherwise, it's completely impossible to give any meaning to hieroglyphs in
    plain text. Even their assigned property (gc=Lo) does not make sense, given
    that you can't form any meaningful words with them. They are in fact just
    treated like symbols. And the rationale for encoding them as letters
    (allowing their use in identifiers without breaks between them) does not
    make sense either.
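
    For reference, the property can be inspected with Python's standard
    unicodedata module. This is only a sketch, and it assumes a Python whose
    bundled Unicode data already includes the Egyptian Hieroglyphs block;
    U+13000 and U+13001 are picked arbitrarily as the first two signs of that
    block.

```python
import unicodedata

# First two code points of the Egyptian Hieroglyphs block (arbitrary picks).
sign = "\U00013000"
run = "\U00013000\U00013001"

# General category of the sign: "Lo" means "Letter, other".
print(unicodedata.name(sign), unicodedata.category(sign))

# Because the signs are letters, a bare run of them is accepted as an
# identifier, with no word boundary anywhere inside it.
print(run.isidentifier())
```

    The run is a valid identifier as far as the properties go, yet nothing in
    it says where one group of signs ends and the next begins.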

    The resulting situation would be much as if Hangul were encoded without
    the distinction between initial and final consonants (something that was
    attempted in the past, then abandoned as it was not reliable), but here it
    is even worse, as there's absolutely no way to determine any boundaries
    within the encoded hieroglyphic text. This makes rendering completely
    impossible to perform, and the transport of texts with Unicode just
    illusory.
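
    The Hangul comparison can be checked with Python's standard unicodedata
    module. The sketch below uses the conjoining jamo for K as initial
    (U+1100) and as final (U+11A8) and the vowel A (U+1161); the variable
    names are mine. It shows that syllable boundaries are recoverable
    precisely because the two consonant roles have distinct code points.

```python
import unicodedata

initial_k = "\u1100"  # HANGUL CHOSEONG KIYEOK (syllable-initial K)
vowel_a = "\u1161"    # HANGUL JUNGSEONG A
final_k = "\u11A8"    # HANGUL JONGSEONG KIYEOK (syllable-final K)

print(unicodedata.name(initial_k))
print(unicodedata.name(final_k))

# Initial + vowel + FINAL K composes into one precomposed syllable.
closed = unicodedata.normalize("NFC", initial_k + vowel_a + final_k)

# Initial + vowel + INITIAL K does not: the second K opens a new syllable.
open_then_k = unicodedata.normalize("NFC", initial_k + vowel_a + initial_k)

print(len(closed), len(open_then_k))  # 1 and 2 code points respectively
```

    With a single undifferentiated K, both inputs would be the same string
    and the boundary could not be determined from the text alone.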

    If you need to encode these characters using an external encoding
    convention, then it will be much better to use MdC conventions with the
    existing Gardiner names... The Unicode encoding serves absolutely no
    purpose. It does not help transform the MdC encoding into plain text; it
    just adds a complication for Egyptologists.


