RE: FPDAM5: Egyptian hieroglyphs (was Re: Marks)

From: Philippe Verdy
Date: Sun Sep 30 2007 - 03:33:35 CST

    James Kass wrote:
    > In the absence of a devoted character for CHRISTIAN FISH SYMBOL,
    > if Christians who currently use the ASCII string "<><" instead adopted
    > the hieroglyphic pictogram of a fish as a convention for plain text
    > exchange, everyone would still get the idea.

    This could be true only if the Egyptian letter is not rendered the way it
    would be with MdC conventions implemented in the renderer. If MdC ASCII
    separators (space, minus, underscore, equals sign, full stop, exclamation
    mark...), grouping operators (ampersand, colon, asterisk, parentheses...)
    and modifiers (backslash...) are present near the symbol, or if the
    mirroring behaviour of Egyptian letters is actually applied, or if
    future control formats (replacing the non-plain-text MdC protocol, such as
    the one implemented in the WikiHiero syntax) are added to the script, the
    symbol will not behave as expected.

    If hieroglyphs are intended to become a true script, instead of just a
    collection of symbols to be used within a layout syntax, such separate use
    as a symbol instead of a letter (as intended) will conflict with that goal.

    Anyway, the Hieroglyphs have the wrong properties, and the suggested
    hieroglyph is not representative, given that the hieroglyphs have much
    stronger glyphic requirements to allow the distinctions between the various
    Hieroglyphic fishes.

    Finally, the strong left-to-right directionality expected in the FPDAM5 and
    reflected in the choice of orientation in the proposed charts will conflict
    with future actual rendering of hieroglyphic texts, notably for monumental
    vertical rendering, where they are universally shown in the reversed
    direction (the signs looking to the left, except in rare cases where the
    same character is mirrored, as in some kings' cartouches).
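
    As a rough illustration of the properties under discussion, the sketch
    below queries Python's bundled Unicode database. It assumes a Python build
    whose database contains the Egyptian Hieroglyphs block as eventually
    published (U+13000 is used as a sample); the code points in the FPDAM5
    charts may not match, and describe() is a helper invented here.

```python
import unicodedata

def describe(cp):
    """Return the properties relevant to this discussion for one code point."""
    ch = chr(cp)
    return {
        "name": unicodedata.name(ch, "<unassigned>"),
        "category": unicodedata.category(ch),      # e.g. "Lo"
        "bidi": unicodedata.bidirectional(ch),     # e.g. "L" = strong left-to-right
        "mirrored": bool(unicodedata.mirrored(ch)),
    }

# Sample: first code point of the block as published (an assumption, see above).
info = describe(0x13000)
print(info)
```

    A strong "L" bidi class combined with no mirroring property is exactly the
    combination objected to above.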

    My opinion is that the directionality of hieroglyphs should be weak
    right-to-left, so that they are automatically mirrored when shown within
    Latin text. A supplementary control format should also be added to encode
    the explicit mirroring of a single hieroglyph when present. The double
    encoding of L6 (U+131AA and U+131AB proposed) looks like an error: both
    characters are related to each other, only mirrored (something that may
    occur with many other hieroglyphs, even if only in rare places like some
    cartouches). In MdC notation this is indicated by a "\" after the character
    notation or phonetic approximant based on the Grimal-Buurman 1988

    The MdC representation also makes use of a space or dash to separate
    quadrats; these won't be needed in a pure Unicode plain-text encoding.

    For special layouts (that require grouping characters into unbreakable
    quadrats featuring superpositions in the horizontal layout, or horizontal
    alignment in the vertical layout), new things will be needed in the
    encoding, like control formats to replace the ":" MdC notation of
    superpositions, and the "*" MdC notation of horizontal groups that need to be
    superposed, and the "*()" notation for superpositions with a higher priority
    than ":". I'm not sure that we need to encode distinctly the "&" operator
    (that means "overlapping"). The "!" notation operator is clearly not needed
    for plain-text (we have newline controls for that).
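
    The operators mentioned so far can be sketched as a tiny parser. This is
    only a minimal illustration of a simplified MdC subset, not a full MdC
    implementation: it handles space and dash (quadrat separators), ":"
    (superposition), "*" (horizontal group, binding tighter than ":") and a
    trailing "\" (mirroring), and it deliberately ignores parentheses, "&" and
    "!". The function name parse_mdc and the nested-list output are inventions
    for this example.

```python
import re

def parse_mdc(text):
    """Parse a simplified MdC string.

    Returns a list of quadrats; each quadrat is a list of rows (top to
    bottom); each row is a list of (sign, mirrored) tuples (side by side).
    """
    quadrats = []
    # Space and dash both separate quadrats.
    for chunk in re.split(r"[ \-]+", text.strip()):
        if not chunk:
            continue  # e.g. a trailing dash as in "i-i-"
        rows = []
        # ":" stacks rows vertically and binds more loosely than "*".
        for row in chunk.split(":"):
            signs = []
            # "*" juxtaposes signs horizontally within one row.
            for code in row.split("*"):
                mirrored = code.endswith("\\")  # trailing "\" marks mirroring
                signs.append((code.rstrip("\\"), mirrored))
            rows.append(signs)
        quadrats.append(rows)
    return quadrats

# Three quadrats: A1 alone; B1 above the pair C1, D1; a mirrored L6.
result = parse_mdc("A1-B1:C1*D1-L6\\")
print(result)
```

    In a pure Unicode encoding the outer split would disappear, and only the
    two inner levels (plus mirroring) would need format controls.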

    I'm not sure that the proposal should be sorted by the Gardiner-based names
    adopted for Unicode names (notably because it mixes numbers all around the

    I'm also not convinced that the Egyptian nomes should be encoded only in
    their composed form (without the possibility of later unifying them with
    more nomes encoded through composition controls): this may be good only for
    numerals, as they are a small, well-identified subset for which a
    decomposition would not be very helpful.

    Other things like smaller hieroglyph variants are probably not necessary to
    encode (in MdC they are not disunified, just modified by an operator acting
    like a combining modifier).

    Finally, some quadrat grouping layouts are implicit from the letter form
    (this is the case for the singular digits) and could be inferred directly
    from the character properties (just like with Hangul layout or Arabic
    joining types).
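
    To make the idea of property-driven layout concrete, here is a purely
    hypothetical sketch. The LAYOUT_CLASS table, its three class names and the
    greedy rule in infer_quadrats() are all assumptions made up for this
    example; neither the proposal nor MdC defines such a property. Only the
    sign codes themselves (A1, N35, X1, Z1) are ordinary Gardiner codes.

```python
# Hypothetical per-sign layout property (NOT part of any proposal).
LAYOUT_CLASS = {
    "A1": "tall",    # seated man: assumed to fill a quadrat alone
    "N35": "flat",   # water ripple: assumed to stack with a neighbour
    "X1": "small",   # bread loaf: assumed to stack with a neighbour
    "Z1": "small",   # single stroke: assumed to stack with a neighbour
}

def infer_quadrats(signs, max_rows=2):
    """Greedily group a flat sign sequence into quadrats.

    Tall (or unknown) signs stand alone; consecutive flat/small signs are
    stacked together, at most max_rows per quadrat.
    """
    quadrats, current = [], []
    for sign in signs:
        if LAYOUT_CLASS.get(sign, "tall") == "tall":
            if current:              # flush any pending stack first
                quadrats.append(current)
                current = []
            quadrats.append([sign])
        else:
            current.append(sign)
            if len(current) == max_rows:
                quadrats.append(current)
                current = []
    if current:
        quadrats.append(current)
    return quadrats

groups = infer_quadrats(["A1", "N35", "X1", "Z1", "A1"])
print(groups)
```

    With such a property no explicit grouping operator is stored in the text;
    a format control would be needed only where the default grouping is wrong.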

    The quite frequent occurrence of the ideographic determinative (the vertical
    stroke below the hieroglyph) may militate for its encoding as a combining
    character instead of by a control format (for the vertical grouping): such
    a disunification was proposed for noting the dual and plural separately from
    the cardinals 2 and 3 (and the ordinals 2 and 3 used in dates).

    Note for example that the plural form meaning "men" shows a walking man
    with the plural mark noted on the LEFT (not below, where it would denote "3
    men" by counting them exactly instead of a generic plural) by the
    superposition of three occurrences of the small numeral "1".

    The proposal contains clearly decomposable characters made of two instances
    of the same glyph simply juxtaposed (for example <hiero>i-i-</hiero>: their
    grouping seems to occur only because of some Latin transliteration schemes
    that putatively suppose they were a single phoneme, something not needed for
    the rendering or interpretation of the encoded text.)

    The script really has some simple inner logic, that the proposal does not
    exhibit enough by encoding everything as unbreakable "Lo" characters. In
    most cases, the MdC composition operators could be avoided by:
    * inference from character properties
    * using explicit disjunctions between words if there's no visible space
    separator to be encoded
    * using format controls in all other remaining (rare) cases, like in the
    representation of the name "Osiris" where several alternate quadrat layouts
    are possible but ignorable (in collation), just like the optional mirroring
    found in some cartouches that could also be inferred (a property for persons
    or animals?)

    The proposal also forgets:
    * the representation of leading and trailing characters of cartouches as
    enclosing punctuations (which they are, surrounding words)
    * the null cartouche (with strokes above and below but no visible leading or
    trailing glyph)
    * the red color-coded areas in some documents, similar also to cartouche
    punctuations, and that could be alternatively rendered for example as a
    thick overline, but that is probably not a stylistic option.
    * the grayed areas for putative characters whose identification is not
    certain, similar to cartouches, but to be treated only as control formats as
    they don't imply a word break.
    * the grayed partial quadrats (combining modifiers?)
    * the 6 possible rotations described in MdC (in fact just 3 plus mirroring):
    combining modifiers?
