Re: New contribution

From: Peter Kirk
Date: Thu Apr 29 2004 - 17:33:22 EDT

    On 29/04/2004 12:08, John Hudson wrote:

    > Peter Kirk wrote:
    >>> Peter, using a systematic transliteration between two structurally
    >>> identical scripts is not comparable to hack encodings.
    >> So, do you mean that the only reason hacked encodings for Greek or
    >> Cyrillic are unacceptable is that there are a few Greek or Cyrillic
    >> characters which do not have any direct Latin counterpart?
    > No, I mean that transliterating, i.e. using the characters of one
    > script to record text that might also or originally be written in
    > another script, is not the same as having those characters masquerade
    > as the other script. The fact that there is not always a one-to-one
    > match between scripts simply makes the hacks more numerous and
    > incompatible, since there is no systematic way to determine which
    > characters in the hack should masquerade as the extra characters in
    > the faked script. None of this is remotely similar to e.g. encoding
    > the same Sanskrit language text in the Devanagari or Tibetan scripts
    > according to the preference of the scholar/publisher.

    Ah well, we are talking about different things here. Transliteration of
    Greek or Cyrillic into Latin script is one thing. A quite different
    thing is encoding Greek or Cyrillic with Unicode characters defined
    for Latin script but displaying these as Greek or Cyrillic with a
    masquerading font (your word, I think, John). The latter corresponds to
    what scholars of Phoenician etc. currently do when they want to display
    or print out in Phoenician script (or whatever you may call it). If they
    continue to do so after a separate Unicode Phoenician script is defined,
    they will surely be going against what the standard expects them to do.
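
The distinction can be made concrete at the code-point level. The following is a minimal Python sketch, not part of the original message: `scripts_used` is a hypothetical helper, and taking the first word of each character's Unicode name is only a rough stand-in for a real script property lookup. It shows that text stored as Latin characters stays Latin to software whatever font displays it, and so does not match the same word encoded in Greek.

```python
import unicodedata


def scripts_used(text):
    """Rough heuristic: first word of each letter's Unicode character name."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}


# Stored with Latin code points. A masquerading font may draw Greek glyphs,
# but the stored data is identical to a plain Latin transliteration.
stored_as_latin = "logos"

# The same word stored with Greek code points (accent omitted for simplicity).
stored_as_greek = "\u03BB\u03BF\u03B3\u03BF\u03C2"

latin_scripts = scripts_used(stored_as_latin)   # {'LATIN'}
greek_scripts = scripts_used(stored_as_greek)   # {'GREEK'}

# Software comparing code points sees two unrelated strings.
matches = stored_as_greek in stored_as_latin    # False
```

Only the font tells a reader which of the two uses was intended for the Latin-encoded string; nothing in the stored text does.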

    > For the record, I have no objection to the encoding of the
    > 'Phoenician' script as proposed, although I think the proposer and the
    > UTC should consider a more generic name for the block that makes
    > clearer that this encoding unifies specific script variants, any and
    > all of which can also be encoded using Hebrew characters *if that is
    > your wish*. It seems to me that anyone bright enough to wrap his or
    > her head around Hebrew grammar should be able to handle this simple
    > concept without 'total confusion'.

    The confusion I am talking about is not of the scholars but of the
    software. Imagine what it could be like to search for a Phoenician text
    if some texts are encoded as Phoenician but others are encoded in Hebrew
    with a masquerading font. Still worse, imagine what happens if a text
    with a masquerading font is edited by someone using a true Phoenician
    keyboard setup, or perhaps vice versa. There is then a real danger of
    the two encodings being mixed in one document, which will make the text
    unsearchable. So surely Unicode does not want to encourage the
    continuing use of masquerading fonts alongside separate script encodings.
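
The search problem can be sketched in Python as follows. This is an illustration under stated assumptions, not anything proposed in the thread: it uses the code points that Unicode later assigned to the Phoenician letters (U+10900..U+10915), and the helper names `fold`, `naive_search` and `folded_search` are hypothetical. A plain code-point search misses text held in the other encoding or in a mixture of the two, so software would need an explicit folding step to find it.

```python
# Hebrew base letters alef..tav (non-final forms), in alphabetical order.
HEBREW_BASE = [
    0x05D0, 0x05D1, 0x05D2, 0x05D3, 0x05D4, 0x05D5, 0x05D6, 0x05D7,
    0x05D8, 0x05D9, 0x05DB, 0x05DC, 0x05DE, 0x05E0, 0x05E1, 0x05E2,
    0x05E4, 0x05E6, 0x05E7, 0x05E8, 0x05E9, 0x05EA,
]

# Assumption: the 22 Phoenician letters occupy U+10900..U+10915 in the
# same alphabetical order, so they pair off one-to-one with HEBREW_BASE.
FOLD = {0x10900 + i: heb for i, heb in enumerate(HEBREW_BASE)}

# Hebrew final forms have no Phoenician counterpart; fold them to base forms.
FOLD.update({0x05DA: 0x05DB, 0x05DD: 0x05DE, 0x05DF: 0x05E0,
             0x05E3: 0x05E4, 0x05E5: 0x05E6})


def fold(text):
    """Map Phoenician letters and Hebrew final forms to Hebrew base letters."""
    return text.translate(FOLD)


def naive_search(haystack, needle):
    """Plain code-point search, as ordinary software would do it."""
    return needle in haystack


def folded_search(haystack, needle):
    """Search after folding both strings to a common encoding."""
    return fold(needle) in fold(haystack)


# The word "mlk" in three encodings of the same letters.
phoenician_mlk = "\U0001090C\U0001090B\U0001090A"  # MEM, LAMD, KAF
hebrew_mlk = "\u05DE\u05DC\u05DA"                  # mem, lamed, final kaf
mixed_mlk = "\U0001090C\u05DC\U0001090A"           # both encodings in one word

print(naive_search(hebrew_mlk, phoenician_mlk))    # False
print(naive_search(mixed_mlk, phoenician_mlk))     # False
print(folded_search(hebrew_mlk, phoenician_mlk))   # True
print(folded_search(mixed_mlk, phoenician_mlk))    # True
```

The folding table is the extra machinery every search tool would have to carry for as long as both encodings remain in circulation.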

    Peter Kirk
