From: Peter Constable (email@example.com)
Date: Mon May 24 2004 - 21:10:12 CDT
> From: Philippe Verdy [mailto:firstname.lastname@example.org]
> Sent: Monday, May 24, 2004 3:28 PM
> Is it a joke? UTF-8 designates Unicode codepoints referring to
> Unicode abstract characters with all their semantics (including
> the character name and properties).
No, it is not a joke. For years, many scholars working with electronic
versions of Biblical texts have used the MCW (not MCS -- a typo on my
part) representation, which is effectively a Latin cipher of Hebrew and
Greek characters. The abstract characters are entirely Basic Latin
characters, but they are standing for Hebrew or Greek characters.
> You can't say that the table above is either ASCII or Unicode.
> It's only a separate legacy 7-bit encoding.
It certainly could be considered ASCII or Unicode Basic Latin
characters: they are always documented as such, and viewed as such. One
*could* also consider it a legacy encoding of non-Latin characters, but
in practice it's not used that way -- it's only at a higher level of
interpretation (on the part of the user, not the system) that these are
Hebrew or Greek characters.
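
To make the two levels concrete, here is a minimal sketch in Python. The mapping table is a small, illustrative subset of consonant correspondences as they are commonly cited for the MCW scheme; treat the table, the name MCW_TO_HEBREW, and the helper to_hebrew as assumptions for this example, not as a definition of the convention. The point is only that the system decodes the bytes as ordinary Basic Latin, and the Hebrew reading is a separate step applied on top.

```python
# Hypothetical, partial table for illustration only: a few Latin-cipher
# characters and the Hebrew letters a reader takes them to stand for.
MCW_TO_HEBREW = {
    ")": "\u05D0",  # HEBREW LETTER ALEF
    "B": "\u05D1",  # HEBREW LETTER BET
    "G": "\u05D2",  # HEBREW LETTER GIMEL
    "D": "\u05D3",  # HEBREW LETTER DALET
    "R": "\u05E8",  # HEBREW LETTER RESH
}


def to_hebrew(text: str) -> str:
    """Apply the higher-level reading; unmapped characters pass through."""
    return "".join(MCW_TO_HEBREW.get(ch, ch) for ch in text)


# Level 1: what the system sees. The bytes decode as plain ASCII, and
# every resulting code point lies in the Basic Latin block.
raw = b"BR)"
latin = raw.decode("ascii")
is_basic_latin = all(ord(ch) < 0x80 for ch in latin)

# Level 2: what the user reads. Only this explicit step yields Hebrew.
hebrew = to_hebrew(latin)
```

Nothing in the stored data marks it as Hebrew; a program that never applies the second step handles it correctly as Latin text.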
> which is probably
> not widely interoperable because unimplemented or not documented
> in the same common places as where ASCII and Unicode are defined.
Well, actually, it *is* interoperable within the sizeable community that
has adopted that convention -- they can and do interchange data using
this. You can find content using this representation in such places as
the Oxford Text Archive.