Re: U+0140

From: Antoine Leca (Antoine10646@leca-marti.org)
Date: Fri Apr 16 2004 - 05:02:49 EDT

    On Friday, April 16, 2004 12:31 AM, Peter Kirk wrote:

    >> Peter Kirk wrote:
    >>
    >>> What is U+2027 intended for? The name suggests that it might be what
    >>> is needed for Catalan.
    >>
    >> Hyphenation point is primarily used to visibly indicate
    >> syllabification of words. Syllable breaks are potential line breaking
    >> opportunities in the middle of words. The hyphenation point is
    >> mainly used in dictionaries and similar works. When an actual line
    >> break falls inside a word containing hyphenation point characters,
    >> the hyphenation point is rendered as a regular hyphen at the end of
    >> the line.
    >
    > Well, this sounds just like the required behaviour for Catalan, as
    > described by António Martins-Tuválkin on 28th March. He wrote:
    >
    >> Something happens when the "L·L" coincides with a soft line end. I'm
    >> no expert in Catalan typesetting but IIRC the dot becomes a hyphen,
    >> while regular "LL"s cannot be broken.

    António is correct.
    But that is not the main purpose of ·. Its main purpose is to disambiguate
    spellings (l·l versus ll); the hyphenation behaviour is only a secondary
    role.
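
    As an illustration of the line-end behaviour António describes, here is a
    minimal Python sketch. The function name is hypothetical, and it assumes
    the dot is encoded as U+00B7 and is simply replaced by a hyphen at the
    break; real hyphenation engines do considerably more.

```python
import re

# Assumption: the Catalan dot is encoded as U+00B7 MIDDLE DOT.
GEMINATED_L = re.compile("[lL]\u00b7[lL]")

def break_at_geminated_l(word):
    """Split a word at l·l for a line break.

    Returns (head, tail), where the dot has been replaced by a hyphen
    at the end of head, or None if the word contains no l·l.
    """
    m = GEMINATED_L.search(word)
    if m is None:
        return None
    first_l = m.start()              # index of the first l
    head = word[: first_l + 1] + "-" # keep the first l, add the hyphen
    tail = word[first_l + 2 :]       # skip the dot, resume at the second l
    return head, tail
```

    For a word such as "col·lega" this yields the pieces "col-" and "lega",
    while a plain "ll" word is left unbroken (the function returns None).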

    Besides, it is vastly easier to keep the obvious unification than to
    distort it with a conditional mapping: if Mathematics, · => U+00B7; if
    Catalan, · => U+2027; if NoSeQue, · => some_other_random_middle_dot; etc.
    Unlike hyphenation rules (where the mapping might very well be
    · => U+2027, by the way), which are pretty easy to pinpoint, tagging
    Catalan in bulk text is clearly not an easy task, even when considering
    the fairly restrictive rules for · to occur (requiring NFC):
        /[aAàÀeEéÉèÈiIíÍïÏoOóÓòÒuUúÚ]l·l[aàeéèiíoóòuú]/
        /[AÀEÉÈIÍÏOÓÒUÚ]L·L[AÀEÉÈIÍOÓÒUÚ]/
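
    To make the two patterns above concrete, here is a rough Python sketch
    that applies them after normalizing to NFC. The function and constant
    names are made up for this example, and the dot is assumed to be U+00B7;
    the vowel classes are copied from the patterns as written.

```python
import re
import unicodedata

MIDDLE_DOT = "\u00b7"  # assumption: the dot in the patterns is U+00B7

# Vowel classes copied from the two patterns above; they expect NFC input.
LOWER = re.compile(f"[aAàÀeEéÉèÈiIíÍïÏoOóÓòÒuUúÚ]l{MIDDLE_DOT}l[aàeéèiíoóòuú]")
UPPER = re.compile(f"[AÀEÉÈIÍÏOÓÒUÚ]L{MIDDLE_DOT}L[AÀEÉÈIÍOÓÒUÚ]")

def find_geminated_l(text):
    """Return sorted (start, end) spans of vowel + l·l + vowel matches.

    Offsets refer to the NFC-normalized form of text.
    """
    nfc = unicodedata.normalize("NFC", text)
    spans = [m.span() for pat in (LOWER, UPPER) for m in pat.finditer(nfc)]
    return sorted(spans)
```

    Note that a match only says the dot sits in a Catalan-looking context; it
    does not tell you that the surrounding text is Catalan, which is the hard
    part.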

    Antoine


