Re: N4106

From: <vanisaac_at_boil.afraid.org>
Date: Mon, 07 Nov 2011 00:34:19 -0800

From: Kent Karlsson <kent.karlsson14_at_telia.com>
On 2011-11-05 04:23, "António Martins-Tuválkin" <tuvalkin_at_gmail.com> wrote:

> > I'm going through N4106 ( http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4106.pdf ),
> ...
>
> I see the following characters being put forward for proposing to be
> encoded:
>
> 1ABB COMBINING PARENTHESES ABOVE
> 1ABC COMBINING DOUBLE PARENTHESES ABOVE
> 1ABD COMBINING PARENTHESES BELOW
> 1ABE COMBINING PARENTHESES OVERLAY
>
> Well, COMBINING DOUBLE PARENTHESES ABOVE seems to be the same as
> <COMBINING PARENTHESES ABOVE, COMBINING PARENTHESES ABOVE>. And COMBINING
> PARENTHESES OVERLAY seems to be just a tiny parenthesis before and a tiny
> parenthesis after; no need for a combining mark, especially one with a
> splitting behaviour.
>
> Otherwise, I think COMBINING ((DOUBLE)) PARENTHESES ABOVE/BELOW are an
> entirely new brand of characters in Unicode (if accepted as proposed).
> They are supposed to split (ok, we have split vowels in some Indic
> scripts, more on that below), but these split around *another combining
> mark*. So despite being given (as proposed) vanilla above/below mark
> properties, they do not "stack" the way such characters normally do, but
> are supposed to invoke an entirely new behaviour.

I agree, except that if we give them any combining class other than ccc=220
or ccc=230, canonical reordering will separate them from the modifier letters
that they are attached to. I think this is one of those cases where a
definition needs to expand in order to accommodate the architecture. We do
already have some non-stacking behaviour defined for these characters in order
to accommodate polytonic Greek, so we do have some experience with disparate
appearances of consecutive marks.
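
A small sketch of the reordering concern, using Python's standard unicodedata
module. It uses long-encoded marks (U+0301, U+0308, U+0323) as stand-ins, since
the characters proposed in N4106 may not be in a given Unicode database; the
variable names are illustrative only. Marks with the same combining class keep
their typed order under normalization, while marks with different non-zero
classes get sorted by class:

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, ccc=230 (above)
DIAERESIS = "\u0308"  # COMBINING DIAERESIS, ccc=230 (above)
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, ccc=220 (below)


def ccc(ch: str) -> int:
    """Canonical combining class of a single character."""
    return unicodedata.combining(ch)


def nfd(text: str) -> str:
    """Canonical decomposition, which includes canonical reordering."""
    return unicodedata.normalize("NFD", text)


# Two marks of the same class: the typed order survives normalization.
same_class = "a" + ACUTE + DIAERESIS
kept = nfd(same_class)

# Two marks of different classes: the lower class (220) is moved first,
# so a mark typed right after ACUTE no longer follows it.
mixed_class = "a" + ACUTE + DOT_BELOW
reordered = nfd(mixed_class)

print([f"U+{ord(c):04X}" for c in kept])
print([f"U+{ord(c):04X}" for c in reordered])
```

By the same mechanism, a parenthesis mark whose class differed from that of the
mark it encloses could end up moved away from it.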

> That supposedly stacking combining marks *sometimes* (more a font
> dependence than a character dependence) don't stack but instead are laid
> out linearly is not new. But to *require* non-stacking behaviour for
> certain characters is new.

Then think of it as the "non-spacing" version of stacking behaviour.

> So we have a combination of:
>
> 1. Splitting. (Normally only used for some Indic scripts).
>
> 2. Indeed splitting with no other characters to use for the
>    decomposition, thus requiring the use of PUA characters, to stay
>    compliant, for representing the result of the split at the character
>    level. (This is entirely new, as far as I can tell.)

I cannot see how this would require PUA characters.

> 3. The split is entirely *within* the sequence of combining characters
>    (except for COMBINING PARENTHESES OVERLAY, which behaves as split
>    vowels normally do, but still with issue 2), not around the combining
>    sequence including the base. (This is entirely new.)
>
> 4. Requiring (if at all supported) to use linear layout of combining
>    characters instead of stacking. (This is entirely new.)

If I were designing a font, I would simply make the in/out mark attachment
point near the top/middle of the parentheses, so that it drops down around the
"base" mark, and then attaches any subsequent marks as if the parentheses
weren't there. I think you're making this too complicated.

> This makes these proposed characters entirely unique in their display
> behaviour, IMO.

I do, however, agree totally with this assessment; I just believe it is more
manageable than you paint it.

[snip]
> /Kent K

I do, myself, have a couple of concerns regarding several other proposed
characters in N4106 as well. Namely, I believe that U+1DF2, U+1DF3, and U+1DF4
should require significant justification as to why they should not be encoded
as U+0363 + U+0308, U+0366 + U+0308, and U+0367 + U+0308. I have similar
concerns about U+A799, U+AB30, U+AB33, U+AB38, U+AB3E, U+AB3F, etc.
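
For reference, the alternative sequences can be inspected with Python's
standard unicodedata module. This is only a sketch: it looks at the
already-encoded components (U+0363, U+0366, U+0367, U+0308), not at the
proposed code points, and the helper names describe and is_stable are made up
for the illustration. It checks that each two-mark sequence, placed on an
arbitrary base letter, is left unchanged by NFC and NFD:

```python
import unicodedata

DIAERESIS = "\u0308"                       # COMBINING DIAERESIS
SMALL_LETTERS = ["\u0363", "\u0366", "\u0367"]  # combining small a, o, u


def describe(seq: str) -> list:
    """(code point, character name, combining class) for each character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.combining(c))
        for c in seq
    ]


def is_stable(text: str) -> bool:
    """True if NFC and NFD both leave the text unchanged."""
    return all(
        unicodedata.normalize(form, text) == text for form in ("NFC", "NFD")
    )


# The sequences suggested in place of the proposed precomposed marks.
sequences = [letter + DIAERESIS for letter in SMALL_LETTERS]

for seq in sequences:
    # "u" is an arbitrary base letter chosen for the example.
    print(describe(seq), is_stable("u" + seq))
```

Because both marks in each sequence share combining class 230, normalization
neither reorders them nor composes the diaeresis with the base.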

Van A
Received on Mon Nov 07 2011 - 02:37:12 CST
