Re: U+0140

From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Apr 15 2004 - 20:20:51 EDT


    > >00B7;MIDDLE DOT;Po;0;ON;;;;;N;;;;;
    > >10101;AEGEAN WORD SEPARATOR DOT;Po;0;ON;;;;;N;;;;;
    > >16EB;RUNIC SINGLE PUNCTUATION;Po;0;L;;;;;N;;;;;

    > I was meaning to ask about this. I'm all over not encoding Yet Another
    > middle dot, but I was wondering. In my research on Samaritan, I've
    > found that they frequently write (you guessed it) a middle dot to
    > separate words (they like to use space to enable them to do this cool
    > columnar writing thing). I was assuming that this could be conflated
    > with someone else's middle-dot-word-separator; would that be U+10101?

    As far as I am concerned, U+00B7 should be sufficient for that.

    But if you were looking for a punctuation mark distinguished from
    U+00B7, specifically for archaic textual practice, my choice
    would be U+16EB (and the Runic double dot, U+16EC) as an
    alternative. Scripts.txt treats these as common punctuation:

    16EB..16ED ; Common # Po [3] RUNIC SINGLE PUNCTUATION..RUNIC CROSS PUNCTUATION

    Unfortunately, software may be making over-aggressive assumptions
    about script identity in some cases, which can throw off
    implementations that pick up punctuation out of another script
    block.
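
    For anyone who wants to check the UnicodeData.txt properties quoted
    at the top of this message, here is a minimal Python sketch. It
    assumes only the standard-library unicodedata module; the helper
    name describe() is made up for illustration. That module does not
    expose the Script property, so the Scripts.txt assignment cited
    above is not checked here.

```python
import unicodedata

# Candidate word-separator dots discussed in this thread.
CANDIDATES = ["\u00B7", "\U00010101", "\u16EB"]

def describe(ch):
    """Return (code point, name, general category, bidi class) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )

# Print one semicolon-separated record per candidate.
for ch in CANDIDATES:
    print(*describe(ch), sep=";")
```

    All three share general category Po, but U+16EB carries bidi class
    L rather than ON, which may matter when it is borrowed into
    right-to-left text.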

    Note that as part of the ongoing work to cover Greek paleographic
    needs, a large number of multiple dot punctuation characters are
    currently under ballot for addition to ISO/IEC 10646 (and Unicode). See
    U+2056 and U+2058..U+205E at:

    http://www.unicode.org/alloc/Pipeline.html

    These are (proposed to be) encoded in the General Punctuation block to
    ensure that *everyone* is clear that their intended use is general, so we
    don't have to keep cloning more and more such dot combinations
    to handle the dot punctuation for each different paleographic
    tradition.

    --Ken


