Re: metric for block coverage

From: Philippe Verdy via Unicode <unicode_at_unicode.org>
Date: Mon, 19 Feb 2018 20:02:28 +0100

This pair of punctuation marks should long ago have been treated as common
punctuation (independently of their assigned names), i.e. assigned the
script property "Common" and not "Devanagari". I don't see why they could
not be used in non-Indic scripts (because they are not semantically
equivalent to Latin punctuation in their use).

I can easily imagine valid use cases even in Latin, Greek or Cyrillic to
properly translate poems, religious texts, or citations without
transforming them into inaccurate full stops, colons, semicolons, commas, or
even exclamation marks (such a transformation is an interpretation by the
translator), where they would typically be used along with surrounding
spaces and not glued to Latin/Greek/Cyrillic words. Such use in Latin would
be part of "extended Latin", but if these punctuation marks are "Common",
this is not much of an extension, and many fonts could include these two
simple punctuation marks (which do not need any "complex" OpenType feature).

Their presence in fonts designed for Indic scripts should be mandatory or
strongly recommended (just like the mapping of SPACE, NBSP, dotted circle
or blank square, and a few others listed in OpenType development
documentation). Given their "Common" script property, we would then not
need to test their presence to compute a script coverage. Any other
available font could also be used by renderers to supply a glyph if some
Indic font is defective and fails to map glyphs to them, just like a
renderer is allowed to substitute or synthesize a glyph for the dotted
circle or blank square, or any whitespace variant, if they are not mapped.
For that, the renderer only needs the generic font metrics (average width
and height, and the relative position of the baselines in the em-square)
to scale the glyph or infer a suitable advance width/height.
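
As a rough illustration of the coverage computation I have in mind, here is
a small Python sketch. Everything in it is hypothetical (the names SHARED
and script_coverage, the toy "required" set and the toy cmap); it only
shows the idea of removing the shared dandas before counting, and is not
how any existing tool computes coverage.

```python
# Hypothetical sketch: script coverage that skips shared punctuation.

# Characters treated as shared across scripts (an assumption made for
# this illustration): U+0964 DEVANAGARI DANDA, U+0965 DEVANAGARI DOUBLE DANDA.
SHARED = {0x0964, 0x0965}


def script_coverage(font_cmap, required):
    """Return the fraction of `required` code points mapped by the font.

    Shared punctuation is removed from `required` first, so the result
    does not depend on whether the font maps the dandas.
    """
    needed = set(required) - SHARED
    if not needed:
        # Nothing script-specific is required, so coverage is trivially full.
        return 1.0
    return len(needed & set(font_cmap)) / len(needed)


# Toy requirement: four Kannada letters (U+0C85..U+0C88) plus the two dandas.
required = {0x0C85, 0x0C86, 0x0C87, 0x0C88, 0x0964, 0x0965}
# Toy font: maps SPACE and three of the four letters, but no dandas.
font_cmap = {0x0020, 0x0C85, 0x0C86, 0x0C87}

coverage = script_coverage(font_cmap, required)
print(coverage)  # 0.75
```

With the dandas excluded, the toy font is judged on three of four letters
only; a renderer would be expected to take the dandas from any other font.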

2018-02-19 15:58 GMT+01:00 Bobby de Vos via Unicode <unicode_at_unicode.org>:

> On 2018-02-18 12:10, Richard Wordingham via Unicode wrote:
>
> It's only a single bit without a meaning beyond "range is considered
> functional". No "basic coverage" vs "good coverage" vs "full
> coverage".
>
> It's worse than that when a script uses characters primarily
> associated with another script. For example, to have any confidence
> that my Tai Tham font will be used for U+0E4A THAI CHARACTER MAI
> TRI or U+0E4B THAI CHARACTER MAI CHATTAWA placed on U+1A4B TAI THAM
> LETTER A, I have to set the Thai bit, even though I only have four Thai
> characters in my font. (The other two are punctuation.)
>
>
>
> Indic scripts (other than Devanagari) also use a few characters from
> another block. Specifically, two punctuation characters (from the
> Devanagari block)
>
> - U+0964 DEVANAGARI DANDA
> - U+0965 DEVANAGARI DOUBLE DANDA
>
> are expected to be used with the non-Devanagari Indic scripts. Looking at
> the fonts Noto Sans Kannada and Noto Sans Tamil, the expected Unicode range
> bit is set for Kannada or Tamil, but not Devanagari, even though those
> fonts contain U+0964 and U+0965.
>
> Bobby
>
> --
> Bobby de Vos
> bobby_devos_at_sil.org
>
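
The mismatch described in the quoted message (glyphs present for U+0964 and
U+0965, Devanagari range bit not set) can be sketched in Python as below.
The bit positions are those I recall from the OS/2 ulUnicodeRange1 field in
the OpenType documentation and should be double-checked; the function name,
the bit field value and the cmap are made up for the example and are not
read from the actual Noto fonts.

```python
# Block name -> (bit in OS/2 ulUnicodeRange1, first code point, last code point).
# Bit positions as recalled from the OpenType OS/2 table documentation;
# verify them before relying on this table.
BLOCKS = {
    "Devanagari": (15, 0x0900, 0x097F),
    "Tamil": (20, 0x0B80, 0x0BFF),
    "Kannada": (22, 0x0C80, 0x0CFF),
    "Thai": (24, 0x0E00, 0x0E7F),
}


def undeclared_blocks(ul_unicode_range1, cmap):
    """Return blocks with at least one mapped code point but no range bit."""
    result = set()
    for name, (bit, first, last) in BLOCKS.items():
        bit_set = bool(ul_unicode_range1 & (1 << bit))
        has_glyphs = any(first <= cp <= last for cp in cmap)
        if has_glyphs and not bit_set:
            result.add(name)
    return result


# Hypothetical Kannada font: only the Kannada bit is set, and the cmap
# holds two Kannada letters plus the two dandas from the Devanagari block.
kannada_font_bits = 1 << 22
kannada_font_cmap = {0x0C85, 0x0C86, 0x0964, 0x0965}

print(undeclared_blocks(kannada_font_bits, kannada_font_cmap))  # {'Devanagari'}
```

A single bit per block cannot distinguish "maps two shared punctuation
marks" from "supports the script", which is why the bit is left unset here.
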
Received on Mon Feb 19 2018 - 13:03:14 CST
