RE: MCW encoding of Hebrew (was RE: Response to Everson Ph and why Jun 7? fervor)

From: Peter Constable (petercon@microsoft.com)
Date: Mon May 24 2004 - 21:10:12 CDT


    > From: Philippe Verdy [mailto:verdy_p@wanadoo.fr]
    > Sent: Monday, May 24, 2004 3:28 PM

    > Is it a joke? UTF-8 designates Unicode codepoints referring to
    > Unicode abstract characters with all their semantics (including
    > the character name and properties).

    No, it is not a joke. For years, many scholars working with electronic
    versions of Biblical texts have used the MCW (not MCS -- a typo on my
    part) representation, which is effectively a Latin cipher of Hebrew and
    Greek characters. The abstract characters are entirely Basic Latin
    characters, but they stand for Hebrew or Greek characters.

    > You can't say that the table above is ASCII not either Unicode.
    > It's only a separate legacy 7-bit encoding.

    It certainly could be considered ASCII or Unicode Basic Latin
    characters: they are always documented as such, and viewed as such. One
    *could* also consider it a legacy encoding of non-Latin characters, but
    in practice it's not used that way -- it's only at a higher level of
    interpretation (on the part of the user, not the system) that these are
    Hebrew or Greek characters.
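
    To make the distinction concrete, here is a rough sketch (mine, not
    taken from any MCW documentation). The six-entry table is only an
    illustrative subset of the convention as I understand it, and the
    names MCW_TO_HEBREW, is_basic_latin and to_hebrew are hypothetical.
    It shows that the stored data is plain Basic Latin, and that the
    Hebrew reading only appears when a user or tool applies a separate
    mapping on top.

```python
# Illustrative, partial MCW-style table (an assumption for this sketch;
# the real convention covers the full alphabet plus vowels and accents).
MCW_TO_HEBREW = {
    ")": "\u05D0",  # alef
    "B": "\u05D1",  # bet
    "Y": "\u05D9",  # yod
    "R": "\u05E8",  # resh
    "$": "\u05E9",  # shin (the shin/sin dot is ignored here)
    "T": "\u05EA",  # tav
}


def is_basic_latin(text: str) -> bool:
    """True if every code point is in the Basic Latin (ASCII) range."""
    return all(ord(ch) < 0x80 for ch in text)


def to_hebrew(text: str) -> str:
    """Apply the higher-level interpretation; unmapped symbols pass through."""
    return "".join(MCW_TO_HEBREW.get(ch, ch) for ch in text)


sample = "BR)$YT"                 # consonants only, no vowel marks
encoded = sample.encode("utf-8")  # byte-for-byte identical to ASCII
hebrew = to_hebrew(sample)        # only now are Hebrew code points involved
```

    As far as the system is concerned, sample is six Basic Latin
    characters; nothing in the encoded bytes says "Hebrew" until
    to_hebrew (or a human reader) supplies that interpretation.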

    > which is probably
    > not widely interoperable because unimplemented or not documented
    > in the same common places as where ASCII and Unicode are defined.

    Well, actually, it *is* interoperable within the sizeable community that
    has adopted that convention -- they can and do interchange data using
    this. You can find content using this representation in such places as
    the Oxford Text Archive.

    Peter Constable


