From: Ted Hopp (ted@newslate.com)
Date: Mon Jul 07 2003 - 11:51:08 EDT
On 07/07/2003 8:52 AM, Peter Kirk wrote:
> On 06/07/2003 17:22, John Hudson wrote:
> > ... Given the small number of attested sequences that would be
> > adversely affected by normalisation re-ordering, I'm beginning to
> > favour the idea of encoding these sequences as individual characters.
> > We'd probably only need three or four, plus a right meteg, to solve
> > the problem, and rendering would work fine with existing font and
> > layout engine technologies.
>
> This sounds like a sensible alternative.
This would make data entry difficult for users. Nobody thinks of these
character sequences as single characters. Editing would also be an
"interesting" experience. Could one search for lamed-patah and find it as
part of lamed-<patah+hiriq>? Or would the proposal be to use these new codes
only as part of bookend processing around normalization (i.e., automatically
recognize the sequences and substitute, normalize, and then automatically
substitute back)?
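
To make the "bookend" option concrete, here is a rough Python sketch of what such processing might look like for a single sequence. The private-use code point U+E000 and the function name normalize_with_bookends are assumptions made purely for illustration; nothing like them has been proposed. The sketch also assumes the input text does not already contain U+E000.

```python
import unicodedata

LAMED, PATAH, HIRIQ = "\u05dc", "\u05b7", "\u05b4"

# Hypothetical stand-in for <patah+hiriq>: a private-use code point,
# which has combining class 0 and is left alone by normalization.
PLACEHOLDER = "\ue000"
PROTECTED = PATAH + HIRIQ


def normalize_with_bookends(text, form="NFC"):
    """Hide the protected sequence, normalize, then restore it."""
    hidden = text.replace(PROTECTED, PLACEHOLDER)
    normalized = unicodedata.normalize(form, hidden)
    return normalized.replace(PLACEHOLDER, PROTECTED)


word = LAMED + PATAH + HIRIQ

# Plain normalization swaps the two points (hiriq sorts before patah).
plain = unicodedata.normalize("NFC", word)

# With the bookends, the original order survives.
guarded = normalize_with_bookends(word)

# A search for lamed-patah only succeeds on the guarded text.
found_in_guarded = (LAMED + PATAH) in guarded
found_in_plain = (LAMED + PATAH) in plain
```

Note that the guarded result is, by construction, no longer in normalized form, so any later process that normalizes again would undo it.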
I think we need to keep Peter Constable's point in mind that current usage
should not define the limits of Unicode functionality. Since the principle
is that all sequences of character codes are permitted (section 2.10), it seems
wrong to supply a fix for only "the small number of attested sequences". In
view of this principle, the current combining class values are at odds with
definition D46 (combining class; section 3.11) as well as with the
discussion in 2.10 on multiple combining characters. That is what should be
fixed.
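
For anyone who wants to see the effect directly, the short Python sketch below prints nothing but collects the combining classes of the marks under discussion and shows what canonical reordering does to lamed-patah-hiriq. It relies only on the standard unicodedata module; the variable names are mine.

```python
import unicodedata

# Combining classes of three of the Hebrew marks discussed in this thread.
marks = {"HIRIQ": "\u05b4", "PATAH": "\u05b7", "METEG": "\u05bd"}
classes = {name: unicodedata.combining(ch) for name, ch in marks.items()}

# Lamed followed by patah and then hiriq, in the order the text requires.
typed = "\u05dc\u05b7\u05b4"

# Canonical ordering sorts adjacent marks by combining class, so the
# hiriq (lower class) is moved in front of the patah (higher class).
reordered = unicodedata.normalize("NFD", typed)
names = [unicodedata.name(ch) for ch in reordered]
```

Because each point has its own distinct, nonzero class, any two different points on the same base letter are put into one fixed order, whatever order the text actually calls for.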
Ted
Ted Hopp, Ph.D.
ZigZag, Inc.
ted@newSLATE.com
+1-301-990-7453
newSLATE is your personal learning workspace
...on the web at http://www.newSLATE.com/