From: C J Fynn (cfynn@gmx.net)
Date: Thu Apr 29 2004 - 05:18:45 EDT
Dean Snyder <dean.snyder@jhu.edu> wrote:
> 1) You find an Iron Age text in Israel that exhibits characteristics of
> both Phoenician and Aramaic orthography.
> 2) The text shows possible Hebrew and Phoenician linguistic features, so
> you are not sure at all what language it represents.
> 3) How will you encode it, given you have at your disposal Hebrew,
> Phoenician, and Aramaic encodings?
If the scripts are as structurally near-identical as is claimed, then it
should be straightforward to create a simple utility to transpose between
Hebrew, Phoenician, and Aramaic block encodings and/or a "smart" font which
can be used to display characters from one of these scripts with glyphs
from another.
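
A rough sketch of such a transposition utility, in Python, might look like
the following. It is an illustration only, not a finished tool. The
Phoenician (U+10900..) and Aramaic (U+10840..) code points are assumptions
taken from the ranges these scripts occupy in later versions of Unicode,
and the names HEBREW, HEBREW_FINALS, block and transpose are made up for
the example.

```python
# Sketch only. Assumed code points: Phoenician letters at U+10900..U+10915,
# Imperial Aramaic letters at U+10840..U+10855 (22 letters each).

# The 22 non-final Hebrew letters, alef through tav, in alphabetical order.
HEBREW = [
    0x05D0, 0x05D1, 0x05D2, 0x05D3, 0x05D4, 0x05D5, 0x05D6, 0x05D7,
    0x05D8, 0x05D9, 0x05DB, 0x05DC, 0x05DE, 0x05E0, 0x05E1, 0x05E2,
    0x05E4, 0x05E6, 0x05E7, 0x05E8, 0x05E9, 0x05EA,
]

# Hebrew final forms and the base letters they fold to. The other two
# scripts have no separately encoded final forms.
HEBREW_FINALS = {
    0x05DA: 0x05DB,  # final kaf   -> kaf
    0x05DD: 0x05DE,  # final mem   -> mem
    0x05DF: 0x05E0,  # final nun   -> nun
    0x05E3: 0x05E4,  # final pe    -> pe
    0x05E5: 0x05E6,  # final tsadi -> tsadi
}


def block(start):
    """Return 22 consecutive code points beginning at `start`."""
    return [start + i for i in range(22)]


SCRIPTS = {
    "hebrew": HEBREW,
    "phoenician": block(0x10900),
    "aramaic": block(0x10840),
}


def transpose(text, source, target):
    """Replace each letter of `source` with the matching letter of `target`.

    Characters outside the source script pass through unchanged.
    """
    table = dict(zip(SCRIPTS[source], SCRIPTS[target]))
    if source == "hebrew":
        # Treat a final form as its base letter before mapping it.
        for final, base in HEBREW_FINALS.items():
            table[final] = table[base]
    return text.translate(table)
```

For example, transpose(text, "hebrew", "phoenician") maps mem, lamed, final
kaf to the Phoenician mem, lamed, kaf. Going back to Hebrew yields a
non-final kaf, so restoring final forms would need an extra positional
rule that this sketch leaves out.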
(I notice Apple have a "Transliteration" feature for AAT fonts
http://developer.apple.com/fonts/Registry/#Type23 to switch display of text
between Hanja / Hangul, Hiragana / Katakana, Kana / Romanization,
Romanization / Hiragana, Romanization / Katakana. A feature like this could
always be extended to allow users to toggle between Phoenician/Hebrew display.)
It is always going to be harder to disunify data at a later date than to unify
it, since with plain, untagged text there is no indication of which script the
original text was written in unless it is encoded with a separate subset of
Unicode characters.
> 4) How will your possible miss-encoding affect future software results?
Why would this be a "miss-encoding"? I'd look at text encoded using
characters for particular scripts as being "finer grained" than where text of
different scripts is encoded using the characters for a single script. You can
always go from high resolution to low resolution but not the other way round.
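
The asymmetry can be shown with a small Python sketch. The Phoenician
(U+10900..) and Aramaic (U+10840..) code points are assumed from the ranges
used in later versions of Unicode, and FOLD and fold_to_hebrew are names
invented for the illustration.

```python
# Sketch only. Assumed code points: Phoenician at U+10900.., Imperial
# Aramaic at U+10840.., 22 letters each in the same alphabetical order.

# The 22 non-final Hebrew letters, alef through tav.
HEBREW = [
    0x05D0, 0x05D1, 0x05D2, 0x05D3, 0x05D4, 0x05D5, 0x05D6, 0x05D7,
    0x05D8, 0x05D9, 0x05DB, 0x05DC, 0x05DE, 0x05E0, 0x05E1, 0x05E2,
    0x05E4, 0x05E6, 0x05E7, 0x05E8, 0x05E9, 0x05EA,
]

# Many-to-one table: both separately encoded scripts fold onto Hebrew.
FOLD = {}
for start in (0x10900, 0x10840):
    for i, hebrew_letter in enumerate(HEBREW):
        FOLD[start + i] = hebrew_letter


def fold_to_hebrew(text):
    """Unify Phoenician and Aramaic letters with their Hebrew counterparts."""
    return text.translate(FOLD)


phoenician = "\U0001090C\U0001090B\U0001090A"  # mem, lamed, kaf
aramaic = "\U0001084C\U0001084B\U0001084A"     # mem, lamed, kaf

# The two inputs are distinct, but their folded forms are identical, so
# nothing in the unified plain text records the original script.
assert phoenician != aramaic
assert fold_to_hebrew(phoenician) == fold_to_hebrew(aramaic)
```

Folding is a plain table lookup, whereas undoing it would need information
from outside the text, such as markup or a catalogue entry for each passage.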
> As the situation stands right now, one simply encodes it in Hebrew or
> Latin transliteration, effectively deferring further analysis to other
> processes. This has its benefits.
Having Phoenician characters in Unicode does not prevent anyone from
continuing to use Hebrew or Latin transliteration, but it does provide the
option of using Phoenician.
There is a somewhat similar dilemma with encoding Pali texts (e.g. the
Theravada Buddhist Canon) - the same Pali Canon is written in Sinhalese,
Burmese, Devanagari, Thai, Latin transliteration and several other scripts.
Sanskrit manuscripts can also be found in most of the Indic scripts - though
since Sanskrit is now predominantly written in Devanagari, many choose to
encode Sanskrit texts in that script (or Latin transliteration) no matter what
script the manuscript of the text uses.
- Chris