From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Jul 30 2003 - 20:31:06 EDT
> But, as you, Ted, have said several times, we must
> support irregular spellings as well as regular ones.
Yes, of course, but there is a limit to how far this
desideratum can be carried forward in plain text. And
it tends to depend on the principles of the writing
system itself.
For an alphabet with no accents, it is easy. For English,
for example, just rearrange the letters into any old
irregular spelling you want: Leticia, Laticia, Latesha,
Lateesha, Letiesha, ... just keep on going until you
get tired.
For a system like Han ideographs, we are mostly talking
about a long history of accumulation of variant forms
of the characters themselves, and the problem is an
encoding conundrum: how many spurious or unusual forms
do you encode separately as characters, which forms have
come into the status of validly separate characters, and
which should just be treated as glyphic variants of
existing characters?
For an abjad historically written without vowels, but
with a long subsequent history of annotational
placement of vowel points, we are walking a fine line
here. On the one hand, you could take Jony's position,
which can be summed up as encode enough distinctions
to match current conventional usage, and use markup
for further distinctions. Or you could take the position
argued by Ted, Peter, and Joan, that the scholastic
texts they are dealing with have their own conventions,
which require more distinctions in *plain text*
representation. I think that in general the case that
Ted et al. are arguing makes sense for the Biblical
Hebrew texts, but at some point, the fine detail in
manuscript and typographical placement of annotational
dots goes beyond what is reasonable to represent in
a plain text encoding. As long as we are clear that there
*is* a line to be drawn here, we can continue to argue
just which side of the holam that line needs to be
drawn.
--Ken