Re: Draft Proposal to add Variation Sequences for Latin and Cyrillic letters

From: verdy_p (
Date: Sat Aug 07 2010 - 15:32:45 CDT

    "Michael Everson"
    > On 6 Aug 2010, at 22:20, Karl Pentzlin wrote:
    > > Am Dienstag, 3. August 2010 um 09:45 schrieb Michael Everson:
    > >
    > > ME> ... In particular the implications
    > > ME> for Serbian orthography would be most unwelcome.
    > >
    > > As I have outlined in the revised introduction of my proposal,
    > > there are *no* implications for Serbian orthography.
    > > Admittedly, this was a little bit implicit in my first draft.
    > Yeah, well, I am not convinced of the merits of your proposal. Sorry.

    I am not convinced either. All that this proposal is supposed to solve is to allow an automated change of
    orthography, so that SOME long s in old documents set in Fraktur will become round s in some intermediate
    style (like Antiqua), and then all of them will become round s later.

    It's a matter of orthographic adaptation, i.e. modernization of old texts. But any modernization of old
    orthographies implies more than just changing some glyphs. For example, the modernization of medieval French
    texts requires knowing when they were written (to correctly infer their semantics), then knowing for which
    period the modernized version was created, and then knowing what other orthographic changes were necessary,
    such as replacing an "s" (long or round) with a circumflex, changing tildes into circumflexes or newer
    (distinguished) modern accents, or dropping some other letters.

    Unicode is not made to adapt to orthographic changes. My opinion is that it just has to encode the orthography AS
    IT IS, ignoring all other possible adaptations due to modernization (and to the evolution of the written language).

    In other words, the existing "long s" and the common "round s" are enough to preserve the original orthography
    and its semantics, as they were in the original text (even if it was ambiguous or incoherent). The variation
    selectors are not intended to convey the additional semantics needed for adaptations to newer orthographies, but
    ONLY the additional semantics that existed in a written language at the time when the text was actually written.

    Text modernizers will really need something else, notably lexical and grammatical analysis (under human
    supervision), and these are completely out of scope for Unicode and ISO 10646. They will work by effectively
    correcting the text, i.e. changing its original orthography and semantics. This process will be much like many
    transliteration schemes or like any translation process: the resulting text is obviously different and intended
    for different readers.
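
    To illustrate the point, here is a minimal Python sketch. It assumes the commonly cited German pair
    "Wachſtube" (Wach-Stube) and "Wachstube" (Wachs-Tube) as sample data, and the function name modernize is
    only a hypothetical helper. It shows that the two existing code points already keep the original spellings
    apart, while a blind long-s replacement changes the text and merges them.

```python
import unicodedata

LONG_S = "\u017f"   # LATIN SMALL LETTER LONG S
ROUND_S = "s"       # LATIN SMALL LETTER S (U+0073)


def modernize(text: str) -> str:
    """Naive modernization (hypothetical helper): every long s becomes a round s."""
    return text.replace(LONG_S, ROUND_S)


# Two distinct historical spellings, used here only as sample data.
guard_room = "Wach" + LONG_S + "tube"   # Wach-Stube
wax_tube = "Wachs" + "tube"             # Wachs-Tube

# The encoded text keeps the distinction, whatever font renders it.
print(unicodedata.name(LONG_S))
print(guard_room == wax_tube)

# After modernization the two words are the same string: the result is a
# different text, not the same text with different glyphs.
print(modernize(guard_room) == modernize(wax_tube))
```

    The reverse direction (deciding where a round s should have been a long s) cannot be done by such a
    substitution at all; that is where the lexical analysis comes in.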

    The only case where we really need variation selectors is when we can demonstrate that there are contrasting
    pairs, where a glyph variant (within a unified abstract character) in the SAME text by the SAME author conveys a
    distinct meaning. For everything else, variation selectors should not be used at all, and an encoded "round s"
    will still mean the same, even if it's rendered with a Fraktur font or a Bodoni- or Antiqua-like font.
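
    For readers less familiar with the mechanism, a variation sequence is just a base character followed by a
    variation selector code point. The sketch below is only illustrative: the sequence "s" + VS1 is a
    hypothetical example, not a claim about any registered sequence, and strip_variation_selectors is a made-up
    helper name.

```python
import unicodedata

VS1 = "\ufe00"  # VARIATION SELECTOR-1


def strip_variation_selectors(text: str) -> str:
    """Drop U+FE00..U+FE0F and keep the base characters (hypothetical helper)."""
    return "".join(ch for ch in text if not ("\ufe00" <= ch <= "\ufe0f"))


plain = "s"
marked = "s" + VS1  # hypothetical variation sequence, for illustration only

print(unicodedata.name(VS1))
print(len(plain), len(marked))

# The selector adds a code point but no new base character: once it is
# removed, the text is the plain round s again.
print(strip_variation_selectors(marked) == plain)
```

    The plain "s" needs no selector to remain a round s in any font, which is why a selector is only
    justified where the same text really contrasts two forms.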


    This archive was generated by hypermail 2.1.5 : Sat Aug 07 2010 - 15:37:12 CDT