Re: Generic base characters - From Phetsarath Lao font

From: Asmus Freytag (
Date: Mon Jul 16 2007 - 14:14:51 CDT

    On 7/16/2007 11:06 AM, John Hudson wrote:
    > Brian Wilson wrote:
    >> Why not have a section of 48 characters for generic bases. Encode
    >> the 10 characters that John Hudson recommends. All of the generic
    >> bases would be in one section of Unicode and there would be plenty of
    >> room for expansion. That saves us ignorant people from wondering,
    >> "now which 'x-like' symbol do I use for Lao again"?
    > This is quite a good idea. It would enable layout engine developers to
    > make a safe assumption about the reserved codepoints, so that generic
    > base characters added to Unicode at a later date could be
    > automatically supported without needing updates to the layout engines.
    > And it would also remove any ambiguity as to what character codes
    > should be used to encode generic bases.

    Duplicating characters for which there are no visible differences is a
    bad idea. As many of the characters advocated would have duplicates, I
    would be surprised if the UTC went for such an approach. Having said
    that, I've argued here (and elsewhere) before that U+25CC is only an
    approximation for the correct generic dotted circle base character, and
    one has to push the glyph design fairly far away from the representative
    glyph to make it work. Coding an explicit 'dotted circle general base'
    would in principle not be problematic from the duplication point of view.
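
    To illustrate the existing U+25CC convention mentioned above, here is a
    minimal Python sketch that places a combining mark on the dotted circle so
    that it can be shown in isolation. The helper name show_mark and the choice
    of U+0EB4 LAO VOWEL SIGN I as the sample mark are assumptions made for
    illustration only; how well the result renders still depends on the font
    and layout engine.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"  # U+25CC, the de facto generic base discussed above


def show_mark(mark: str) -> str:
    """Return `mark` placed on U+25CC (hypothetical helper, for illustration)."""
    # Only combining marks (general categories Mn, Mc, Me) need a base.
    if not unicodedata.category(mark).startswith("M"):
        raise ValueError(f"U+{ord(mark):04X} is not a combining mark")
    return DOTTED_CIRCLE + mark


sample = show_mark("\u0EB4")  # U+0EB4 LAO VOWEL SIGN I
for ch in sample:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```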

    However, it would be problematic because there is such a large installed
    base that uses U+25CC. As for some of the other
    suggestions, the + sign is a really bad choice for duplication (unless
    one intends a 'plus-like' character that can / will be of different size
    and alignment than the standard PLUS SIGN). The reason is that + is on
    every keyboard, so unless the new character is a 'small plus' or
    otherwise differentiated in an obvious way, half the data will contain
    the wrong character.

    Reserving a range for future base characters might be useful, if there
    are a sufficient number of non-duplicated ones to start populating such
    a range now. But I still think this would be an uphill climb.
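
    The "safe assumption" that a reserved range would give layout engines can
    be sketched as a simple range test. The range below is purely hypothetical:
    no such block exists in Unicode, and 48 Private Use Area code points
    starting at U+E000 stand in for it here. The name is_generic_base is
    likewise an assumption, not part of any real engine.

```python
# HYPOTHETICAL range: Unicode defines no "generic base" block. These 48
# Private Use Area code points are a stand-in for the proposed section.
GENERIC_BASE_RANGE = range(0xE000, 0xE000 + 48)

DOTTED_CIRCLE = "\u25CC"  # the existing convention, kept for installed data


def is_generic_base(ch: str) -> bool:
    """True if `ch` is U+25CC or falls in the hypothetical reserved range."""
    return ch == DOTTED_CIRCLE or ord(ch) in GENERIC_BASE_RANGE
```

    With a test like this, an engine would accept a base assigned later
    anywhere in the range without a code change, while an ordinary U+002B
    PLUS SIGN typed from the keyboard would not be treated as a generic base.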

    A technical note that *lists* the known generic base characters,
    describes the issue, and gives guidance would be the most practical

