Transliterating ancient scripts [was: ASCII and Unicode lifespan]

From: Nick Nicholas (
Date: Sat May 21 2005 - 22:52:58 CDT

    from Dean Snyder:
    > >Like I say on
    > >unicode_epichorica.html : "Don't Proliferate, Transliterate".
    > Encoding existing scripts is not proliferation.
    But as I argue on my page, running the risk of violating the
    character-glyph model is proliferation. And the requisite
    normalisation is almost always done through transliteration, rather
    than normalising the original scripts.
    > Transliteration is lossy.
    In the sense that any normalisation of a glyph repertoire, necessary
    for the character-glyph model, is.
    > And I notice you did not quote or respond to my concluding,
    > pertinent-to-your-comments, remark: "Cuneiform in transliteration is
    > like Japanese in transliteration, with all the same advantages and
    > disadvantages." If you want to oppose encoding scripts you will have
    > to deal with such observations.
    It would help me deal with them if they were less gnomic. :-1/2. The
    comparison is unfair, at least because Romaji does not make a
    distinction between kanji, hiragana and katakana, whereas the
    transliterations of Hittite do distinguish between Sumerian
    logographs, Akkadian logographs, and the syllabary.
    > >(As
    > >Patrick just said, and Carl-Martin Bunz insisted in Unicode tech note
    > >3). Unicode may contain a whole heap of archaic scripts, but that
    > >will not change the fact that old texts will overwhelmingly continue
    > >to be published and discussed in transliteration
    > Not a little due to inadequate technology.
    > But contra, see Syriac, Greek, Hebrew, Old Chinese ...
    I'm not convinced. There was no technological difficulty involved in
    using Syriac but not Meroitic script; the Syriac liturgical press was
    not such a boom area as to make it the obvious technological choice.
    As I argue, it was familiarity: Syriac was already available in print,
    Meroitic wasn't.
    > You lost your case when you completely misrepresented the arguments
    > used against encoding Phoenician:
    I don't think I lost it, no. Why the Hebrew square script variant of
    the script, and not Samaritan -- or Syriac, for that matter, whose
    consonants are close enough? I'm happy to rephrase the hot-button
    bits, but the choice of the Hebrew variant was certainly not
    innocent, as I go on to say. (Coptic and Greek are the same script
    too as far as I'm concerned, and the disjunction there wasn't
    innocent either: Unicode was swayed by the glyphic difference, but
    the glyphic difference wouldn't have happened in the first place if
    not for the cultural divergence -- and if Fraktur or Gaelic script
    survived in contemporary widespread use, the difference with them
    would only be one of degree.)

    In any case, that section of my webpage isn't talking about why not
    to encode separate scripts, but about what dictates the choice of
    script (or if you must, script variant) to publish in.
    > Anyone who does original research in cuneiform knows the pedagogical
    > value of glyph reinforcement,
    arguably not plaintext
    > the analytical value of glyph interaction,
    not plaintext
    > the value of programmatic script detection for text processing,
    assuredly not plaintext
    > and the
    > importance of both glyph-based restorations and glyph-based error
    > detection.
    not plaintext -- if you're discussing restorations and errors, you're
    staying close to the facsimile anyway.

    The acid test is, do publications of cuneiform routinely contain
    slabs of normalised printed cuneiform as text, without
    transliterations as a crutch? If they do, I rescind, because that
    *is* plaintext. (The point certainly holds for most other ancient
    scripts, though, such as Gothic.)

    I'm not going to stop ancient script stuff being encoded, and it's
    not like we're running out of codepoints. But there is a distinct
    lack of enthusiasm among a lot of scholars for normalising ancient
    script glyph repertoires (see anything the DIN has had to say in ISO
    review of hieroglyphics :-) , so Tom Emerson was right in the first instance
    to say that "very little need" is not the criterion Unicode uses to
    propose scripts (though it's certainly a legitimate criterion for
    according priority). The Phaistos Disc (or board game) is still a bad
    idea to encode as a matter of principle, though.

    How can the king and nobles make ends meet,   Dr Nick Nicholas,
    if not by eating you and all the others?      French Italian Spanish,
    (Cheetah to Ox; _Tale of the Quadrupeds_,     University of Melbourne
      Byzantium, 14thC)
