Re: Fallback Display for COENG (was: Re: Combining latin small letters with diacritics)

From: Leo Broukhis <leob_at_mailcom.com>
Date: Tue, 6 Mar 2012 16:25:56 -0800

Thank you, Ken!

What about Grapheme_Extend class characters placed out of context? It
would be nice to see a dotted box in cases like AׁB
(U+0041, U+05C1 HEBREW POINT SHIN DOT, U+0042).
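
To make the request concrete, here is a rough Python sketch of the behavior I have in mind. It is only an illustration under loud assumptions: the standard library exposes neither Grapheme_Extend nor the Script property, so `is_mark` approximates the former with general categories Mn/Me, and `script_guess` is a hypothetical stand-in that takes the first word of the character name. A real renderer would use the actual property data.

```python
import unicodedata

DOTTED_CIRCLE = "\u25cc"  # U+25CC DOTTED CIRCLE, used here as the visible placeholder


def script_guess(ch):
    """Crude, hypothetical stand-in for the Script property:
    the first word of the character name (e.g. 'LATIN', 'HEBREW')."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def is_mark(ch):
    # Assumption: Mn and Me approximate Grapheme_Extend well enough for a demo.
    return unicodedata.category(ch) in ("Mn", "Me")


def mark_out_of_context(text):
    """Insert a dotted circle before any mark that has no base, or whose
    guessed script differs from that of the preceding base character.
    Marks named 'COMBINING ...' are treated as script-neutral."""
    out = []
    prev_base = None
    for ch in text:
        if is_mark(ch):
            mark_script = script_guess(ch)
            misplaced = prev_base is None or (
                mark_script != "COMBINING"
                and script_guess(prev_base) != mark_script
            )
            if misplaced:
                out.append(DOTTED_CIRCLE)
            out.append(ch)
        else:
            prev_base = ch
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # U+0041 U+05C1 U+0042: the shin dot follows a Latin base.
    result = mark_out_of_context("A\u05c1B")
    print([f"U+{ord(c):04X}" for c in result])
```

With this heuristic the shin dot after a Hebrew letter, or an acute accent after a Latin letter, is left alone; only the mismatched or baseless case gets the placeholder.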

Leo

On 3/6/12, Ken Whistler <kenw_at_sybase.com> wrote:

>> I see. I was under the impression that the renderer must avoid
>> rendering such characters visibly if at all possible.
>
> Ah, a teachable moment!
>
> There is a distinction in the Unicode Standard between default ignorable
> code points and other characters, regarding the recommendations of the
> standard for fallback rendering.
>
> For default ignorable code points, the recommendation is, indeed, to just
> display nothing when your renderer cannot otherwise handle proper rendering
> of the character's intended effect. That is what you do, for example, with
> a ZWJ that is otherwise out of place or not supported for rendering in a
> particular context. (The exception would be for a Show Hidden mode, when
> you want to see *everything*.)
>
> For other characters, *including* viramas as a class, the fallback
> recommendation is to display something visible. Don't be fooled by the fact
> that the Khmer COENG is shown in the code charts with a dotted box and has
> no visible display of its own as a separate mark -- unlike typical Indic
> viramas. It is still better, in general, to know that a virama is present
> (or in this case a COENG) in text, even if you cannot display its intended
> effect properly if you stick it in the wrong sequences.
>
> For background on this topic, see Section 5.21, Default Ignorable Code
> Points, in the standard:
>
> http://www.unicode.org/versions/Unicode6.0.0/ch05.pdf
>
> For a complete list of default ignorable code points (which do not
> include U+17D2), see:
>
> http://www.unicode.org/Public/UNIDATA/DerivedCoreProperties.txt
>
> Down towards the bottom of that data file, you will also find a list of
> all the Grapheme_Link characters, which is identical to ccc=Virama, and
> constitutes the list of all the characters that are *structural* viramas
> in the standard, whether they are specifically termed a virama in a
> particular script or not. That list *does* include U+17D2. And none of the
> viramas is a default ignorable code point.
>
> --Ken
>
>
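
For anyone who wants to check this against the data file, the following Python sketch shows one way the two lists and the fallback rule could be put together. It is an illustration, not a reference implementation: `SAMPLE` is a short hand-copied excerpt in the layout of DerivedCoreProperties.txt (the real file is far longer), and the names `parse_property`, `is_virama`, and `fallback` are hypothetical helpers invented for this example.

```python
import unicodedata

# Abridged, hand-copied excerpt in the layout of DerivedCoreProperties.txt.
# Assumption: illustration only; use the real file for the complete lists.
SAMPLE = """\
00AD          ; Default_Ignorable_Code_Point # Cf       SOFT HYPHEN
200B..200F    ; Default_Ignorable_Code_Point # Cf   [5] ZERO WIDTH SPACE..RIGHT-TO-LEFT MARK
094D          ; Grapheme_Link # Mn       DEVANAGARI SIGN VIRAMA
17D2          ; Grapheme_Link # Mn       KHMER SIGN COENG
"""


def parse_property(text, prop):
    """Collect the code points listed for `prop` in UCD-style
    'XXXX[..YYYY] ; Property # comment' lines."""
    cps = set()
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] != prop:
            continue
        first, _, last = fields[0].partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        cps.update(range(lo, hi + 1))
    return cps


def is_virama(ch):
    # ccc=Virama is canonical combining class 9.
    return unicodedata.combining(ch) == 9


def fallback(ch, default_ignorable, show_hidden=False):
    """Fallback when the intended effect cannot be rendered: default
    ignorables are hidden (unless a Show Hidden mode is on); everything
    else, viramas included, gets some visible display."""
    if ord(ch) in default_ignorable and not show_hidden:
        return "invisible"
    return "visible"


if __name__ == "__main__":
    ignorable = parse_property(SAMPLE, "Default_Ignorable_Code_Point")
    links = parse_property(SAMPLE, "Grapheme_Link")
    for cp in (0x200D, 0x17D2):
        ch = chr(cp)
        print(f"U+{cp:04X}", unicodedata.name(ch),
              "link" if cp in links else "-",
              "virama" if is_virama(ch) else "-",
              fallback(ch, ignorable))
```

Pointing `parse_property` at the contents of the real data file instead of `SAMPLE` should give the full lists; the `is_virama` check depends on the Unicode version bundled with the Python build in use.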
Received on Tue Mar 06 2012 - 18:28:04 CST
