From: Peter Kirk (peterkirk@qaya.org)
Date: Sun Apr 25 2004 - 16:03:29 EDT
On 24/04/2004 21:01, Ernest Cline wrote:
>... My point here was that adding a category of characters
>that was tightly bound to the preceding character without using the
>existing combining class mechanism would cause problems
>for normalization that could not be avoided, and as such, it is
>impossible to add variation selectors for combining marks
>unless the variation selector for a combining mark is of the
>same canonical combining class. That would cause any
>proposal for such variation selectors to have to add variation
>selectors for each canonical combining class, and thus
>increase the cost of implementing such a proposal.
>
>
Let us remember that problems arise with class 0 VSs only if preceded by
more than one combining mark. So it would be possible to specify that
VSs may be preceded by no more than one combining mark. Therefore, a
base character with two combining marks, one of which has a variant
glyph, must be encoded B CM2 VS CM1 - irrespective of the canonical
order. This is stable under normalisation as the VS is class 0. Even if
without the VS the canonical order is B CM1 CM2 (i.e. cc(CM2)>cc(CM1)),
the sequences B CM1 CM2 VS and its unnormalised canonical equivalent B
CM2 CM1 VS can be defined as illegal (just as at present any sequence of
CM VS is illegal), and the sequence with the variant glyph would have to
be B CM2 VS CM1. This avoids any problems with normalisation by defining
the sequences which can be reordered as illegal. (There is a small
problem when variants of BOTH combining marks are required, as B CM1 VS1
CM2 VS2 and B CM2 VS2 CM1 VS1 would be rendered identically but are not
canonically equivalent, because the class 0 VSs block reordering. This
could happen in Hebrew, e.g. if a VS is used for dagesh hazaq as well as
for qamats qatan, but should be rare enough to be a marginal problem.)
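
The reordering behaviour described above can be checked with a short sketch. The code points are stand-ins chosen only for illustration: bet as B, qamats (class 18) as CM1, dagesh (class 21) as CM2, and the existing VARIATION SELECTOR-1 and -2 as the class 0 VSs. No such variation sequences are actually defined, and the helper names `nfd` and `canonically_equivalent` are hypothetical.

```python
import unicodedata

# Stand-ins chosen for illustration; the proposal itself names no code points.
B = "\u05D1"    # HEBREW LETTER BET, combining class 0
CM1 = "\u05B8"  # HEBREW POINT QAMATS, combining class 18
CM2 = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ, combining class 21
VS1 = "\uFE00"  # VARIATION SELECTOR-1, combining class 0
VS2 = "\uFE01"  # VARIATION SELECTOR-2, combining class 0


def nfd(text: str) -> str:
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return nfd(a) == nfd(b)


# Without a VS, the marks are sorted by class: B CM2 CM1 becomes B CM1 CM2.
assert nfd(B + CM2 + CM1) == B + CM1 + CM2

# A class 0 VS between the marks blocks reordering, so B CM2 VS CM1 is stable.
assert nfd(B + CM2 + VS1 + CM1) == B + CM2 + VS1 + CM1

# A VS after two marks is not stable: the mark it follows changes.
assert nfd(B + CM2 + CM1 + VS1) == B + CM1 + CM2 + VS1

# With a VS on each mark, the two orders are not canonically equivalent.
assert not canonically_equivalent(
    B + CM1 + VS1 + CM2 + VS2,
    B + CM2 + VS2 + CM1 + VS1,
)
```

The third assertion is the case that would be declared illegal: after normalisation the VS follows CM2 rather than CM1, so it would apply to a different mark.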
>It might make sense to relax the restriction on allowable
>variation sequences to include combining marks of class 0,
>and maybe even to provide variation selectors for the two
>big classes of combining characters, 220 and 230, given
>that those two classes are far and away the largest non-0
>classes at present and are likely to remain so.
>
>
>
In principle this makes sense. In practice it fails to solve the
specific problem with Hebrew, because most of the combining marks which
have variants are not in classes 220 or 230.
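
As a rough check of this point, the sketch below looks up the canonical combining classes of a few Hebrew points. The selection of points is only a sample chosen for illustration; which marks actually need variants is the question under debate, and the dictionary name `POINTS` is hypothetical.

```python
import unicodedata

# A sample of Hebrew points; not a list of the marks proposed for variants.
POINTS = {
    "SHEVA": "\u05B0",
    "PATAH": "\u05B7",
    "QAMATS": "\u05B8",
    "HOLAM": "\u05B9",
    "DAGESH OR MAPIQ": "\u05BC",
    "METEG": "\u05BD",
}

# Map each point name to its canonical combining class.
classes = {name: unicodedata.combining(ch) for name, ch in POINTS.items()}

for name, ccc in classes.items():
    print(f"HEBREW POINT {name}: class {ccc}")

# Each of these points has its own fixed class, none of them 220 or 230.
assert all(ccc not in (220, 230) for ccc in classes.values())
```

Since each point has its own class, a VS of class 220 or 230 would not sort together with any of them.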
Earlier, Ernest wrote:
>Adding Variation Selectors with non-zero canonical
>combining classes is possible, but I fail to see the benefits
>from adding new Variation Selectors on the SSP outweighing
>the benefits of defining new vowel marks in the Hebrew
>block.
>
The benefits of using variation selectors rather than new code points in
this case are exactly the same as those for variation selectors for base
characters, as expressed in TUS section 15.3:
> Occasionally the need arises in text processing to restrict or change
> the set of glyphs that are to be used to represent a character. ... In
> special circumstances, such a variation from the normal range of
> appearance needs to be expressed side-by-side in the same document in
> plain text contexts, where it is impossible or inconvenient to
> exchange formatted text. ... The variation selectors are used when
> characters have essentially the same semantic.
> Variation selectors provide a mechanism for specifying a restriction
> on the set of glyphs that are used to represent a particular
> character. They also provide a mechanism for specifying variants ...
> that have essentially the same semantic but substantially different
> ranges of glyphs.
I accept that there is some continuing debate (for which the Hebrew list
is the proper place) over whether the particular variant characters I
have in mind do "have essentially the same semantic". But in principle
these conditions may be true of combining characters just as much as of
base characters. And so the reasons for which VSs are defined for base
characters are just as valid for combining characters.
As for the new variation selectors being in the SSP, is this actually
necessary, or could they be in the Hebrew block, space permitting? After
all, if we are talking about VSs with the fixed combining classes of
Hebrew points, they are useful only with Hebrew script.
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/