Re: U+0140

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Apr 15 2004 - 14:16:33 EDT

    From: "Patrick Andries" <Patrick.Andries@xcential.com>
    > Anto'nio Martins-Tuva'lkin wrote:
    > >>However I advise removal of the note "Catalan" under U+0140 and
    > >>U+013F, and perhaps replacement of the whole note with «for Catalan
    > >>use U+006C U+00B7» (resp. U+004C).
    > >>
    > Did you get an answer on this? Why is there no decomposition associated
    > with this character?
    >
    > Also, did someone mention why U+0140 is even in Unicode, since it could
    > be considered (by ignorami like me) as a precomposed character (l +
    > middle dot)? Is it due to the polysemy of the middle dot?

    I thought this had already been answered on this list by a Catalan-speaking
    contributor: the sequence L + middle dot in Catalan is NOT a combining sequence.
    The middle dot in Catalan plays a role similar to a hyphen between syllables:
    it marks a distinction from words where, for example, a double L would have an
    alternate reading. The dot indicates that each L must be read distinctly (or
    read as a long or emphatic L).

    In French, for example, we have words like "maille", read as /maj/, and the
    same written "-ill-" diphthongs after another vowel occur in Catalan. But French
    will not write "-ill-" between two vowels where the two L's must be pronounced
    as /l/ (where this occurs in French, only one L is written, and the
    emphatic/long sound is not marked). Catalan does have this orthography, and
    writes the emphatic/long L distinctly, so it needs a symbol for that. The middle
    dot is therefore treated in Catalan as a letter that occurs in the middle of words.
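
    The practical effect of treating the middle dot as a word-internal letter can
    be sketched in Python. This is only an illustration: the function names and the
    sample phrase are hypothetical, and the rule "a middle dot between two runs of
    word characters joins them" is an assumption made for the example, not a
    description of any particular library's behaviour.

```python
import re

MIDDLE_DOT = "\u00b7"  # U+00B7 MIDDLE DOT, as used in Catalan "l·l"

# Hypothetical rule: a word is a run of word characters, optionally continued
# across a middle dot that sits between two such runs.
CATALAN_WORD = re.compile(rf"\w+(?:{MIDDLE_DOT}\w+)*")


def naive_words(text):
    """Split on \\w+ only; U+00B7 is not a word character, so it breaks words."""
    return re.findall(r"\w+", text)


def catalan_words(text):
    """Keep a word-internal middle dot inside the word."""
    return CATALAN_WORD.findall(text)


sample = "la col\u00b7lecci\u00f3 paral\u00b7lela"
print(naive_words(sample))    # the dot splits each l·l word in two
print(catalan_words(sample))  # the l·l words stay whole
```

    A generic word pattern cuts "col·lecció" at the dot, which is the kind of
    behaviour a letter-like middle dot would avoid.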

    I don't know whether the middle dot can serve in Catalan as a candidate position
    for a line break with hyphenation: if so, is it kept before the hyphen, is the
    middle dot used alone, or is it replaced by a regular hyphen? But if the
    middle dot must be replaced by a hyphen, then it is a punctuation mark
    (similar to the hyphens used in compound words).

    But in Catalan, the middle dot should not be kerned into the preceding uppercase
    L, as it would appear if the sequence were considered equivalent to
    <L-middle-dot>. Catalan has no use for such a decomposition, and if such a
    decomposition had existed, it would have been into L + a combining left middle
    dot, which is not the same character as the middle dot.
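
    The decomposition question can be checked against the Unicode Character
    Database. The sketch below assumes Python's standard unicodedata module (a
    choice made for this example, not something from the thread), and the helper
    name is hypothetical. With the database shipped in current Python releases,
    U+0140 carries only a compatibility mapping to l + U+00B7, so canonical
    normalization leaves it untouched.

```python
import unicodedata

L_MIDDLE_DOT = "\u0140"   # LATIN SMALL LETTER L WITH MIDDLE DOT
CATALAN_PAIR = "l\u00b7"  # l followed by U+00B7 MIDDLE DOT


def decomposition_report(ch):
    """Return (raw decomposition field, NFD form, NFKD form) for one character."""
    return (
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFD", ch),
        unicodedata.normalize("NFKD", ch),
    )


raw, nfd, nfkd = decomposition_report(L_MIDDLE_DOT)
print(raw)                   # decomposition field; "<compat>" marks a compatibility mapping
print(nfd == L_MIDDLE_DOT)   # canonical decomposition leaves U+0140 unchanged
print(nfkd == CATALAN_PAIR)  # compatibility decomposition yields l + U+00B7
```

    Neither form produces a combining mark: the compatibility mapping points at
    the spacing middle dot itself.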

    If there's something really missing for Catalan, it's a middle-dot letter with
    general category "Lo" and combining class 0 (i.e. NOT combining). It's
    unfortunate that almost all legacy Catalan text transcoded to Unicode is based
    on the middle-dot symbol (the one mapped from ISO-8859-1 and ISO-8859-15), which
    Unicode treats not as a letter (Lo) but only as a punctuation mark (general
    category Po).
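
    The properties mentioned above can be inspected directly. This is a minimal
    sketch assuming Python's standard unicodedata module; the helper name props is
    hypothetical, and the values printed depend on the Unicode database bundled
    with the interpreter.

```python
import unicodedata


def props(ch):
    """Return (character name, general category, canonical combining class)."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


# U+00B7 is the ISO-8859-1 middle dot; U+013F and U+0140 are the L-with-dot letters.
for ch in ("\u00b7", "\u013f", "\u0140"):
    name, gc, ccc = props(ch)
    print(f"U+{ord(ch):04X} {name}: gc={gc} ccc={ccc}")
```

    In current databases U+00B7 is not combining (class 0) but is not a letter
    either, which is the gap described above.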
