Re: Rare extinct latin letters

From: Philippe Verdy (
Date: Tue Jun 03 2003 - 13:12:10 EDT

    From: "Kent Karlsson" <>
    > > Sorry, may be I was chosing the wrong diacritic (I was
    > > confused by its name, and I should have verified in the charts).
    > > Isn't U+0316 "COMBINING HORN" (combining class 216) what I
    > > wanted to use?
    > Let me cut my reply short: no.
    > ...
    > > script which already has a lot of them and creates
    > > difficulties for their correct placement in the combined
    > > glyph-cluster layout, notably because Unicode allows any
    > > combination of multiple diacritics on base characters, even
    > > though such combinations were never used in any past
    > > language, and will probably never be used for the case of
    > > multiple diacritics
    > There are several cases where two diacritics are applied to the
    > same instance of a base letter. Several of which are also encoded
    > in precomposed form. There are also cases, for Lithuanian at least,
    > but maybe also for other languages, where three diacritics are
    > applied to an instance of a base letter, none of these combinations
    > have a precomposed version as a single character.
    > ...
    > > combining classes, and not even with Korean which uses its
    > > own L+V*T* model,
    > L+V+T*. However, technically one *could* have used a base+
    > combining characters model also for Hangul.
    > ...
    > > So it seems legitimate to reuse existing diacritics allowing
    > > them to create new ligated forms that could be documented as
    > > specific to a rare language, and implemented most accurately
    > I would strongly recommend against that.

    Probably so as far as the Unicode standard is concerned, but in practice, many users who need to write a language will prefer such an encoding, which they think and agree to be better than nothing (PUA means nothing). Extinct forms of old languages need only be encoded by scholars, and they are the reference. If this use allows them to cooperate more easily, they will use it instead of the PUA, with all the risks associated with it.

    All that is required is that someone in that group of scholars designs a font with slightly altered glyphs for use in that language. (I'm not speaking here about Lithuanian, which is a separate language and not Old French, the language for which the request was made.)

    As far as I know, for Old French, there is absolutely no risk of collision with the use of the "same" diacritic in other languages, and whatever you think, using such an encoding still requires defining an agreement.

    Who will regulate that: Unicode? Unicode has absolutely no authority over private agreements that scholars may adopt together to encode a particular language. If they privately use some fallback mechanisms to ease their own interchanges for this language, and to avoid the constant maintenance cost of PUA re-encodings, they can do it.

    If Unicode ever consults these scholars later (it should), it should just ask them which abstraction they use, and propose an encoding in the BMP or in the SMP for the supplementary characters. The need for it will come if these scholarly works need a broader diffusion. In the meantime, the use of specific glyphs is simple to implement as a font designed specifically to display a modified form of the modern glyph currently represented in the Unicode charts. Those charts are NOT exhaustive for all glyph variants, notably for Brahmic scripts and Arabic, and this should be true also for diacritics, which are only a modern technical abstraction of the decomposition of base glyphs that languages consider as one character. They are just used to match legacy encoding needs, and do not reflect the actual language usage.
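
    To make the decomposition point concrete, here is a small Python sketch using the standard unicodedata module. The helper names (code_points, decompose, compose) are mine, purely for illustration, and the results depend on the Unicode data bundled with the Python interpreter, although the characters used here have been stable for a long time. It shows a precomposed letter splitting into a base letter plus diacritics, and the Lithuanian-style case mentioned above where three diacritics on one base letter have no single precomposed character.

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def decompose(text: str) -> list[str]:
    """Code points after canonical decomposition (NFD)."""
    return code_points(unicodedata.normalize("NFD", text))


def compose(text: str) -> list[str]:
    """Code points after canonical composition (NFC)."""
    return code_points(unicodedata.normalize("NFC", text))


if __name__ == "__main__":
    # One precomposed character carrying two diacritics (circumflex + acute).
    print(decompose("\u1ebf"))
    # The same letter typed as base + two combining marks composes back.
    print(compose("e\u0302\u0301"))
    # i + ogonek + dot above + tilde: only the ogonek composes with the base,
    # so the sequence cannot collapse into a single character.
    print(compose("i\u0328\u0307\u0303"))
```

    The last call leaves three code points after NFC, which is the situation described for Lithuanian: the combination is perfectly legal, but it only exists as a sequence.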

    If Unicode takes some "liberties" with typical language usage, by unifying characters quite aggressively for technical reasons even when this does not match the semantics, why would the users of unencoded languages not be allowed to take such liberties for their own needs? After all, an A-WITH-HORN character does not exist in Old French, so it does not cause any conflict to encode it like this, considering that Unicode is just a technical constraint, and that such constraints require some accommodation.
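
    As a rough check of what such a sequence looks like to software, the following Python sketch (again with the standard unicodedata module) looks at the combining horn itself. The constant HORN and the helper has_precomposed are hypothetical names I chose for this illustration, and the outcome depends on the Unicode data shipped with the interpreter. It only tests whether a base letter plus the horn collapses into a single code point under NFC; it says nothing about whether the usage is a good idea.

```python
import unicodedata

# Assumption for this sketch: the mark under discussion is U+031B.
HORN = "\u031b"


def has_precomposed(base: str, mark: str) -> bool:
    """True if base + mark composes to a single code point under NFC."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


if __name__ == "__main__":
    # Name and canonical combining class of the mark.
    print(unicodedata.name(HORN), unicodedata.combining(HORN))
    # Letters to probe: o and u are used with a horn in Vietnamese.
    for base in "oua":
        print(base, has_precomposed(base, HORN))
```

    With current Unicode data, o and u with a horn have precomposed forms, while a with a horn remains a two-code-point sequence, so a scholars' font can give that sequence its own glyph without clashing with an existing precomposed character.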

    As long as such an accommodation is widely accepted by its users, as a good way to limit the impact of a strict modern technical specification, I see no problem in such usage, which best fits the semantics of the characters to represent. Only if there is public interest in reviving these glyphs will it be time to consider a normative representation in Unicode. But such modern usage would be completely unrelated to the usage in Old French, as it is clearly not the same language! Scholars would then need to consider whether it is worth re-encoding their legacy usage of diacritics in the context of Old French, according to the public need to reuse the same glyphs (not characters!) for a modern use.

    Show me a modern use of the glyphs shown, such as L-MOLL, and of course I will argue in favor of a distinct encoding in Unicode (maybe by creating a new MOLL diacritic, or as precombined and undecomposable characters that represent the modern use of that glyph as an abstract character).

    Without such use, leave scholars some freedom, as their mutual agreements (and the fact that they are the only authorities for that language) are perfectly valid. Unicode prohibitions should only concern cases that create interoperability problems, but the PUA will cause many more problems than an approximate mapping to Unicode characters whose semantics best match the character to represent.
