Re: How to Add Beams to Notes from Philippe Verdy via Unicode on 2017-05-03 (Unicode Mail List Archive)

From: Philippe Verdy via Unicode <unicode_at_unicode.org>
Date: Thu, 4 May 2017 05:01:17 +0200

2017-05-03 9:49 GMT+02:00 Richard Wordingham via Unicode <
unicode_at_unicode.org>:

> On Tue, 2 May 2017 05:08:27 +0200
> Philippe Verdy via Unicode <unicode_at_unicode.org> wrote:
>
> > Consider also that the BMP is almost full, the remaining few holes
> > are kept for isolated characters that may be added to existing
> > scripts, or permanently reserved to avoid clashes with legacy
> > software using simple code remappings between distinct blocks, or to
> > perform simple case conversions (e.g. in Greek) for internal purposes
> > (these positions are not interoperable and may clash with future
> > versions of the UCS and I18n tools/libraries like ICU)
> >
> > You should abstain from using any currently unassigned positions in the
> > existing Unicode blocks: use PUA if you have nothing else; there is
> > plenty of space available, in the BMP (most common usage in fonts
> > that need to map additional glyphs) or in the two last planes.
>
> It isn't codepoints that is the constraint; one must consider the
> number of glyphs without dedicated one-character codes.
>

Glyph processing requires internal glyph IDs in fonts. The limit is on
the total number of glyphs you can put in a font without exceeding the
maximum glyph ID. Traditionally this is solved by creating coherent (but
complete enough) subsets so that all glyphs within the same script can
fit. The other solution, notably for sinograms, is to use font linking.
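
To make the subsetting idea concrete, here is a minimal sketch in Python.
It assumes the usual OpenType constraint that the glyph count is stored in
a 16-bit field, so one font holds at most 65,535 glyphs. The function name
and the glyph counts are invented for illustration; they are not taken
from any real font or tool.

```python
# Assumption: a single OpenType font can hold at most 65,535 glyphs,
# because the glyph count is stored as an unsigned 16-bit value.
MAX_GLYPHS = 0xFFFF


def split_into_subsets(glyph_counts, limit=MAX_GLYPHS):
    """Greedily pack scripts into font subsets without splitting a script.

    glyph_counts: dict mapping a script name to its (hypothetical) glyph count.
    Returns a list of subsets, each a list of script names.
    """
    subsets = []
    current, used = [], 0
    for script, count in glyph_counts.items():
        if count > limit:
            # One script alone exceeds a font: subsetting is not enough,
            # this is the case where font linking is needed instead.
            raise ValueError(f"{script} needs more than {limit} glyphs")
        if used + count > limit:
            # The current font is full: close it and start a new one.
            subsets.append(current)
            current, used = [], 0
        current.append(script)
        used += count
    if current:
        subsets.append(current)
    return subsets


# Invented glyph counts, for illustration only.
counts = {"Latin": 3000, "Han": 60000, "Devanagari": 4000}
subsets = split_into_subsets(counts)
```

With these made-up counts, Latin and Han share one font and Devanagari
goes to a second one. A script that cannot fit in a single font raises an
error, which corresponds to the font linking case.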

> The Arabic script (and other cursively connected scripts) has similar
> expansions, even if one goes for a typewritten style.
>
> Devanagari explodes when one considers just the conjuncts prescribed for
> Hindi.
>

Rendering Devanagari with OpenType does not require any PUA assignment in
the font for variants. The sequences are mapped directly using subtables
and the rules defined in OpenType for that script. Fonts just use their own
internal glyph IDs, without having to assign them any Unicode mapping,
through glyph processing rules.

Same remark about Arabic (though some encoded compatibility characters will
map to some of these glyphs... without using any PUA).
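
As a toy illustration of the point about internal glyph IDs, the sketch
below models a single ligature-style substitution in Python. The glyph IDs
(10, 11, 12, 500) and the function name are hypothetical, and real
Devanagari shaping involves many more steps (reordering, half forms, and
so on); this only shows that the conjunct glyph needs no code point of its
own.

```python
# Hypothetical character-to-glyph map: code point -> internal glyph ID.
CMAP = {
    0x0915: 10,  # DEVANAGARI LETTER KA
    0x094D: 11,  # DEVANAGARI SIGN VIRAMA
    0x0937: 12,  # DEVANAGARI LETTER SSA
}

# Hypothetical substitution rule: a glyph ID sequence -> one conjunct glyph.
# Glyph 500 has no Unicode mapping at all, neither encoded nor PUA.
LIGATURES = {
    (10, 11, 12): 500,  # KA + VIRAMA + SSA -> conjunct "kssa"
}


def shape(text):
    """Map characters to glyph IDs, then apply ligature substitutions."""
    glyphs = [CMAP[ord(ch)] for ch in text]
    out = []
    i = 0
    while i < len(glyphs):
        for seq, ligature in LIGATURES.items():
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(ligature)
                i += len(seq)
                break
        else:
            # No rule matched at this position: keep the glyph as is.
            out.append(glyphs[i])
            i += 1
    return out


conjunct = shape("\u0915\u094D\u0937")  # KA, VIRAMA, SSA
plain = shape("\u0915\u0937")           # KA, SSA (no virama)
```

Here the three-character sequence comes out as the single unencoded
glyph 500, while KA followed directly by SSA keeps its two nominal
glyphs.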

>
> I think it's also necessary to avoid splitting likely grapheme
> clusters between fonts. Which of the three fonts will support U+1F3F4
> U+E0067 U+E0062 U+E0065 U+E006E U+E0067 U+E007F (English flag) and
> which U+261D U+1F3FF (index pointing up: dark skin tone)?
>
> Now, the BMP has headroom provided by the surrogate characters and the
> PUA, which will not have mappings, but I'm not sure that it's enough.
>
>
For your question, the solution is to create coherent subsets of symbols and
create a font from each subset. Country/region flags could all be separated
into a specific font. Likewise you can create separate fonts for
persons/animals/plants, and another one for inanimate objects (including
planets, game pieces...). Traditional punctuation-like symbols used in
typography, normally without any emoji style, can fit in a generic symbols
font (along with geometric shapes and line-drawing symbols).
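
One way to keep a cluster from being split between such fonts is to choose
the font from the base character only, so that tag characters and skin
tone modifiers follow their base. The following Python sketch assumes
that approach; the font names and the tiny lookup table are hypothetical,
and a real system would classify many more characters.

```python
# Hypothetical assignment of base characters to subset fonts.
FONT_BY_BASE = {
    0x1F3F4: "Flags",    # WAVING BLACK FLAG, base of tag-sequence flags
    0x261D: "People",    # WHITE UP POINTING INDEX
    0x2654: "Objects",   # WHITE CHESS KING (a game piece)
}


def pick_font(cluster, default="Symbols"):
    """Choose one font for a whole cluster, based on its first code point."""
    return FONT_BY_BASE.get(ord(cluster[0]), default)


# U+1F3F4 U+E0067 U+E0062 U+E0065 U+E006E U+E0067 U+E007F (English flag)
england = "\U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F"
# U+261D U+1F3FF (index pointing up: dark skin tone)
index_dark = "\u261D\U0001F3FF"

england_font = pick_font(england)
index_font = pick_font(index_dark)
bullet_font = pick_font("\u2022")  # BULLET, not in the table
```

Under these assumptions the flag sequence goes entirely to the flags
font and the pointing index with its modifier goes entirely to the people
font, so each font must also carry the tag or modifier glyphs that its
base characters can combine with.
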
Received on Wed May 03 2017 - 22:03:02 CDT