Re: Characters consisting of vertical lines; Possible attempts to encode tally marks

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Wed Feb 28 2007 - 10:33:18 CST

    On 2/27/2007 4:32 PM, Karl Pentzlin wrote:
    > 1.) There are at least three series of characters consisting of one or
    > more vertical line sequences in Unicode (besides script specific
    > characters like Latin click letters or dandas):
    > a.) U+007C VERTICAL LINE (1 line)
    > U+2016 DOUBLE VERTICAL LINE (2 lines)
    > U+2AFC LARGE TRIPLE VERTICAL BAR OPERATOR (3 lines)
    > b.) U+2223 DIVIDES (1 line)
    > U+2225 PARALLEL TO (2 lines)
    > U+2AF4 TRIPLE VERTICAL BAR DELIMITER (3 lines)
    > c.) U+1D369...1D36D COUNTING ROD TENS DIGIT ONE ... FIVE (1 ... 5
    > lines)
    >
    > Of these, the characters of series c. are clearly intended to have
    > the same height, line thickness and distance between the lines.
    >
    barbara beeton from the American Mathematical Society (AMS) pointed out
    offline that item 1 has got the wrong distribution for a and b.
    > here's what it ought to be:
    >
    > a.) U+007C VERTICAL LINE (1 line)
    > U+2016 DOUBLE VERTICAL LINE (2 lines)
    > U+2AF4 TRIPLE VERTICAL BAR DELIMITER (3 lines)
    > b.) U+2223 DIVIDES (1 line)
    > U+2225 PARALLEL TO (2 lines)
    > U+2AFC LARGE TRIPLE VERTICAL BAR OPERATOR (3 lines)
    >
    > the items in a are used for delimiters or
    > fenceposts, as in
    > |x|, ||x||, |||x||| or < a | b >
    > in mathematical layout they increase in size as the
    > expression gets taller.
    >
    > the items in b are operators, always between
    > two elements, as in
    > x | y, x || y, or x ||| y
    > actually, they too should be able to get
    > taller if the elements they're between happen
    > to be something like fractions, but the meaning
    > and spacing are quite different from the others
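    The three series can be inspected with a short script. This is only a
    sketch: it assumes Python's standard `unicodedata` module, and the
    grouping labels and the helper `describe` are made up for illustration.
    Names are looked up from the interpreter's Unicode database rather than
    copied from the thread, so any difference from the names quoted above
    (U+2AF4 is worth checking) will show up in the output.

```python
import unicodedata

# Code points grouped as in the corrected list quoted above.
# The labels are illustrative only, not official Unicode terminology.
SERIES = {
    "a (delimiters)": [0x007C, 0x2016, 0x2AF4],
    "b (operators)": [0x2223, 0x2225, 0x2AFC],
    "c (counting rods)": list(range(0x1D369, 0x1D36E)),
}


def describe(cp):
    """Return (U+XXXX label, character name, general category) for a code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))


if __name__ == "__main__":
    for label, cps in SERIES.items():
        print(label)
        for cp in cps:
            print("   ", *describe(cp))
```

    The general category alone does not separate delimiters from operators;
    that distinction comes from how the characters are used in mathematical
    layout, as described above.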
    Jon Hanna replied to the list:
    > f. "divide divide divide" does not mean 3.
    >
    Couldn't agree more. He continues:
    > tallies are either a dynamic way to keep a tally and as such cannot be
    > meaningfully used in static text, or are used stylistically as glyph
    > variants of the numbers 1 through 5 which are already encoded at
    > U+0031 through U+0035. Hence we don't need to encode them.
    The model Unicode uses for the character-glyph relation of digits is
    not a good precedent here, and it is somewhat complicated in its own
    right. Digit shapes that differ between scripts are encoded separately,
    while variations within a script (e.g. regional ones) are not always
    encoded separately. I don't find the argument that these are digits
    particularly appealing, actually.
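    One way to see why a run of DIVIDES characters "does not mean 3" is to
    look at the numeric properties Unicode assigns. The following is a
    hedged sketch using Python's `unicodedata`; the helper name
    `numeric_value` is hypothetical, and the results depend on the Unicode
    version bundled with the interpreter.

```python
import unicodedata


def numeric_value(ch):
    """Return the Unicode numeric value of ch, or None if it has none."""
    return unicodedata.numeric(ch, None)


if __name__ == "__main__":
    samples = {
        "DIGIT THREE": "3",
        "VERTICAL LINE": "|",
        "DIVIDES": "\u2223",
        "COUNTING ROD TENS DIGIT ONE": "\U0001D369",
    }
    for label, ch in samples.items():
        print(f"{label}: {numeric_value(ch)}")

    # Three DIVIDES characters in a row carry no numeric value at all.
    print([numeric_value(ch) for ch in "\u2223\u2223\u2223"])
```

    The vertical-line characters from series a and b have no numeric value,
    whereas the counting rod characters do, though as tens rather than as
    tally counts.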

    However, I would need to see more evidence that tallies are used in
    plain text, before determining that there is a need to address the
    question of how they are best encoded (with existing or new characters,
    or not as characters of any type).

    If there were a danger of people usurping existing characters to mock up
    tally marks (for example, existing vertical lines of some system
    together with the combining long solidus overlay), then it might also be
    appropriate to defensively encode specific characters - but the case for
    such a decision has not been made.

    A./


