Re: Yerushala(y)im - or Biblical Hebrew

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Jul 08 2003 - 14:10:59 EDT

Next message: Peter Kirk: "Re: Yerushala(y)im - or Biblical Hebrew"

Previous message: Ted Hopp: "Re: Yerushala(y)im - or Biblical Hebrew"
In reply to: Peter Kirk: "Re: Yerushala(y)im - or Biblical Hebrew"
Next in thread: Peter Kirk: "Re: Yerushala(y)im - or Biblical Hebrew"
Reply: Peter Kirk: "Re: Yerushala(y)im - or Biblical Hebrew"

On Tuesday, July 08, 2003 6:48 PM, Peter Kirk <peter.r.kirk@ntlworld.com> wrote:

> On 08/07/2003 09:16, Philippe Verdy wrote:
>
> > Even if listed in the Canonical Composition Exclusion list, this
> > would not work: this list only refers to characters that are
> > canonically decomposable into a character pair, and that MUST be
> > decomposed
> > and MUST NOT be recomposed when creating *either* a NFC or
> > NFD form.
>
> I am not trying to block reordering here. I accept that if the input
> data is patah - hiriq, this will (barring unacceptable changes to
> combining classes etc) always be normalised to hiriq - patah in both
> NFC and NFD. But normalisation forms don't specify rendering, and
> there are already well known exceptions to the general rule that the
> order of rendering follows the order of encoding. So all I am trying
> to suggest here is a way of specifying that the sequence hiriq -
> patah should be rendered as if it were patah - hiriq. Is there a way
> of doing that, without spilling too much sacred cow blood?

You have to admit that your proposal to use a canonical decomposition
would still cause problems for all the Unicode algorithms, and for
XML processing.

Only an NFKD (compatibility) decomposition would make your proposed
"ligature" character workable for XML processing and for the Unicode
algorithms, including UCA, case mappings, UTF representations, etc.

But using an NFKD decomposition means that you create a new
character with its own identity, name, properties, set of glyphs,
UCA rules, mappings, etc.

Of course it would require a special keystroke sequence to enter
it, but that is not impossible. At least this proposal avoids the
use of CGJ, and it still allows efficient rendering in fonts, where
the character's glyph would be defined by combining the two vowel
glyphs in the correct order.
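
For comparison, here is a minimal Python sketch of the CGJ approach
that this proposal would avoid. It uses only the standard unicodedata
module; the choice of LAMED as the base letter and the helper name
order_survives are my assumptions for illustration, not part of any
proposal.

```python
import unicodedata

LAMED = "\u05dc"  # HEBREW LETTER LAMED, used here only as a sample base
PATAH = "\u05b7"  # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05b4"  # HEBREW POINT HIRIQ, combining class 14
CGJ = "\u034f"    # COMBINING GRAPHEME JOINER, combining class 0

plain = LAMED + PATAH + HIRIQ
with_cgj = LAMED + PATAH + CGJ + HIRIQ

def order_survives(text):
    """Return True if NFC and NFD both leave the sequence unchanged."""
    return all(unicodedata.normalize(form, text) == text
               for form in ("NFC", "NFD"))

# Without CGJ the two points are canonically reordered to hiriq - patah.
reordered = unicodedata.normalize("NFC", plain)

# With CGJ between them, the class-0 character blocks the reordering.
cgj_kept = order_survives(with_cgj)
```

The cost is the one already discussed: the CGJ becomes part of the
encoded text, so every process that compares or searches the string
has to account for it.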

Would the NFKD decomposition then be safe to define, given that it
would necessarily have to reverse the order of the vowels that are
inherently part of the new character? It could also create confusion:
the character would probably be named "HEBREW LETTERS PATAH HIRIQ"
(and be defined with which combining class value, the higher one of
PATAH or the lower one of HIRIQ?), but its compatibility
decomposition would be <compat> HIRIQ PATAH, to meet the NFC/NFD
stability requirements.
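
To make the ordering question concrete, here is a rough Python
sketch. The combining class lookups use the real unicodedata module,
but the new character itself is simulated: the private-use code point
U+E000 and the HYPOTHETICAL_COMPAT table are stand-ins I made up,
since no such character or mapping exists in Unicode.

```python
import unicodedata

LAMED = "\u05dc"  # sample base letter (an assumption for the demo)
PATAH = "\u05b7"  # HEBREW POINT PATAH
HIRIQ = "\u05b4"  # HEBREW POINT HIRIQ

# Real data: HIRIQ has the lower canonical combining class.
ccc_hiriq = unicodedata.combining(HIRIQ)
ccc_patah = unicodedata.combining(PATAH)

# Hypothetical: a private-use code point standing in for the proposed
# character, with a made-up compatibility mapping to HIRIQ PATAH.
PROPOSED = "\ue000"
HYPOTHETICAL_COMPAT = {PROPOSED: HIRIQ + PATAH}

def expand_compat(text):
    """Apply the made-up mapping, then the real NFKD normalization."""
    expanded = "".join(HYPOTHETICAL_COMPAT.get(ch, ch) for ch in text)
    return unicodedata.normalize("NFKD", expanded)

decomposed = expand_compat(LAMED + PROPOSED)

# A mapping to PATAH HIRIQ would not be stable: NFD reorders it anyway.
naive = unicodedata.normalize("NFD", LAMED + PATAH + HIRIQ)
```

Both paths end at the same hiriq - patah sequence, which is why the
decomposition has to list the vowels in the reverse of the order
suggested by the character's name.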

-- Philippe.


This archive was generated by hypermail 2.1.5 : Tue Jul 08 2003 - 14:56:52 EDT