From: Peter Kirk (peter.r.kirk@ntlworld.com)
Date: Tue Jul 08 2003 - 05:23:59 EDT
On 07/07/2003 19:23, John Hudson wrote:
> At 08:51 07/07/2003, Ted Hopp wrote:
>
>> Editing would also be an "interesting" experience. Could one search
>> for lamed-patah and find it as part of lamed-<patah+hiriq>? Or would
>> the proposal be to use these new codes only as part of bookend
>> processing around normalization (i.e., automatically recognize the
>> sequences and substitute, normalize, and then automatically
>> substitute back)?
>
>
> I suppose the latter is feasible. I am very keen that *any* solution
> should be invisible to the user.
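
For illustration, the bookend processing Ted describes might look roughly like the sketch below. It is only a sketch under stated assumptions: the private-use code point U+E000 stands in for a new code that does not exist, the function name is made up, only the patah-hiriq pair is handled, and the input is assumed not to contain U+E000 already.

```python
import unicodedata

LAMED = "\u05DC"   # HEBREW LETTER LAMED
PATAH = "\u05B7"   # HEBREW POINT PATAH
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ

# Hypothetical stand-in for a new code; a private-use code point has
# combining class 0, so normalisation leaves it where it is.
PLACEHOLDER = "\uE000"


def normalize_preserving_order(text, form="NFC"):
    """Normalise text while keeping patah+hiriq in the order it was typed."""
    # 1. Recognise the sequence and substitute the placeholder.
    protected = text.replace(PATAH + HIRIQ, PLACEHOLDER)
    # 2. Normalise; the placeholder is not reordered.
    normalized = unicodedata.normalize(form, protected)
    # 3. Substitute the original sequence back.
    return normalized.replace(PLACEHOLDER, PATAH + HIRIQ)


word = LAMED + PATAH + HIRIQ
print(normalize_preserving_order(word) == word)
```

The user never sees the placeholder, which is the point of doing the substitution only around the normalisation step.
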
Would it work to define a new character, for example, for patah-hiriq
which has a canonical decomposition into patah plus hiriq, or even into
hiriq plus patah? Would normalisation compose a patah-hiriq sequence
into this character and so get round the reordering problem? Remember
that the reverse sequence is actually not attested, as far as I can tell,
for any of the sequences in question.
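
To make the reordering problem concrete, here is a small sketch using Python's standard unicodedata module. Hiriq has canonical combining class 14 and patah has 17, so normalisation sorts hiriq before patah whatever order was typed. The variable names are only for illustration.

```python
import unicodedata

LAMED = "\u05DC"   # HEBREW LETTER LAMED
PATAH = "\u05B7"   # HEBREW POINT PATAH
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ

# Canonical combining classes decide the order after normalisation.
print(unicodedata.combining(HIRIQ), unicodedata.combining(PATAH))

typed = LAMED + PATAH + HIRIQ
normalized = unicodedata.normalize("NFC", typed)

# The points come out in the opposite order from the one typed.
print(normalized == LAMED + HIRIQ + PATAH)

# A search for lamed-patah no longer matches the normalised text,
# because hiriq now sits between the two.
print((LAMED + PATAH) in typed, (LAMED + PATAH) in normalized)
```

This also bears on Ted's search question: whether lamed-patah is found depends on whether the text has been normalised.
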
>
>> I think we need to keep Peter Constable's point in mind that current
>> usage should not define the limits of Unicode functionality. Since the
>> principle is that all sequences of character codes are permitted
>> (2.10), it seems wrong to supply a fix for only "the small number of
>> attested sequences".
>
But I agree here. The kind of solution I have just proposed is in danger
of escalating in the same way that the number of Latin characters
escalated until a decision was made not to add any more.
--
Peter Kirk
peter.r.kirk@ntlworld.com
http://web.onetel.net.uk/~peterkirk/