A definition for orthograph ?

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Dec 25 2003 - 06:44:55 EST

Next message: John Jenkins: "Re: [hebrew] Re: Aramaic unification and information retrieval"

Previous message: Philippe Verdy: "Re: Aramaic unification and information retrieval"

Since one answer showed me that the term "orthograph" is apparently not
defined in Unicode, and that this may create confusion with the term
"character", I'll try to define what I mean by it, using carefully selected
terms (each one is important):

    An orthograph is
    an agreed convention
        between writers of a selected language
    to use a common set of glyphs
        recognized as equivalent in that language and
        creating classes of glyphs
            commonly referred to as "characters"
                by users of this convention,
    and to order these classes
    according to accepted "orthographic" rules
        (that try to match the language's lexical
        and grammatical rules)
    in order to write words, sentences or whole texts
        that will be correctly understood by readers.

The term "recognized" is important here, as is the restriction implied by
the term "equivalent". Both presuppose education and reading skills as they
are taught. This is a good justification for keeping Fraktur and modern
Latin letters separate: although they represent the same letters, they are
not recognized as such by users of the most common form of the language.

The term "class" above refers to wider sets of glyphs than those that are
acceptable for representing a single Unicode character. This is where a
"character" defined in an orthograph can span several distinct abstract
characters in Unicode. In that case, Unicode creates distinctions between
characters that do not exist in the original orthograph, so that any of
these Unicode characters is equally acceptable for correctly representing
the word. Which abstract Unicode character is used is not relevant, so
deciding which form is better is not an option; but this creates a need for
"folding" rules or "decompositions" that restore the distinctions and
equivalences relevant to an orthograph (the written language).
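
As a rough illustration of such "folding" rules, here is a minimal Python
sketch. The FOLDING table and the fold() and same_spelling() helpers are
hypothetical names invented for this example, and the two table entries are
only assumptions about what one orthograph might treat as the same class;
they are not taken from Unicode or from any standard folding.

```python
import unicodedata

# Hypothetical folding table for one orthograph: each key is a Unicode
# character that this orthograph is assumed to place in the same class
# as its value. The entries are illustrative only.
FOLDING = {
    "\u017F": "s",       # LATIN SMALL LETTER LONG S -> s
    "\u03C2": "\u03C3",  # GREEK SMALL LETTER FINAL SIGMA -> SIGMA
}


def fold(text: str) -> str:
    """Map each character to the representative of its orthographic class."""
    # Apply canonical decomposition first, so that precomposed and
    # decomposed spellings of the same letter end up identical.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(FOLDING.get(ch, ch) for ch in decomposed)


def same_spelling(a: str, b: str) -> bool:
    """True if both strings are the same word under this orthograph."""
    return fold(a) == fold(b)
```

With this table, same_spelling("Wach\u017Ftube", "Wachstube") holds, because
the long s and the round s fall into one class, while strings that differ in
any other letter still compare as different. A real orthograph would need its
own table, agreed on by its users.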


This archive was generated by hypermail 2.1.5 : Thu Dec 25 2003 - 07:39:47 EST