Question about properties of some Code Points

From: Elisha Berns (e.berns@computer.org)
Date: Mon Jul 21 2003 - 21:33:29 EDT


    Hi,

    I have a few questions about the properties and categories of some
    punctuation characters. A few things seem counter-intuitive so
    hopefully there is a clear explanation.

    The Bidi_Mirrored property includes paired parentheses, which have
    left and right glyphs, because their meaning changes depending on the
    direction of the text. However, Bidi_Mirrored does not include
    quotation marks that come in left and right pairs, even though the
    meaning of a quotation mark (whether it is opening or closing)
    apparently also changes depending on the direction of the text, just
    as it does for the mirrored parentheses.

    So why are quotation marks that have paired glyphs not included in
    Bidi_Mirrored?
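
    For illustration, the Bidi_Mirrored value of individual characters can
    be inspected programmatically. The following is a minimal sketch using
    Python's standard unicodedata module, which is an assumption here (it is
    not one of the tools mentioned in this message, and it reports the
    Unicode version bundled with the interpreter, not necessarily 4.0). The
    helper name is_bidi_mirrored is hypothetical.

```python
import unicodedata

def is_bidi_mirrored(ch: str) -> bool:
    """Return True if ch has Bidi_Mirrored=Yes in the interpreter's bundled UCD."""
    # unicodedata.mirrored() returns 1 for mirrored characters, 0 otherwise.
    return unicodedata.mirrored(ch) == 1

# Compare the ASCII parentheses with the curly double quotation marks.
for ch in "()\u201C\u201D":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {is_bidi_mirrored(ch)}")
```

    Running the loop over a wider set of quotation marks would show which
    of them, if any, carry the property in a given Unicode version.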

    On the other hand, Quotation_Mark is a property set exclusively for all
    the various types of quotation marks. But to discern whether a given
    character is an opening or closing quotation mark, would you need to
    check the categories Open_Punctuation and Close_Punctuation?

    But Open_Punctuation and Close_Punctuation only include a basic quotation
    mark, U+0022, and not the ones that are "Bidi_Mirrored". And the
    categories Initial_Punctuation and Final_Punctuation only include U+00AB
    and U+00BB, which are left- and right-facing quotation marks, but not all
    the variants in the Quotation_Mark property set.

    So is the membership of these properties and categories not complete?
    Or what is it I didn't get here about property/category membership?
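
    One way to double-check category membership is to list the General
    Category of each quotation mark directly. The sketch below again assumes
    Python's unicodedata module rather than the UCD files or the ICU
    browser; the QUOTES list is a hand-picked sample, not the full
    Quotation_Mark set, and quote_categories is a hypothetical helper name.

```python
import unicodedata

# Hand-picked sample of quotation marks (an assumption, not the complete
# Quotation_Mark property set).
QUOTES = ['"', "\u00AB", "\u00BB", "\u2018", "\u2019", "\u201C", "\u201D"]

def quote_categories(chars):
    """Map each character's code point label to its General Category code."""
    # Category codes are two letters, e.g. Ps, Pe, Pi, Pf, Po.
    return {f"U+{ord(c):04X}": unicodedata.category(c) for c in chars}

for code_point, category in quote_categories(QUOTES).items():
    print(code_point, category)
```

    Comparing this listing against the tables makes it easier to see
    whether a membership question comes from the data or from a misreading.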

    In case you wondered where I got this from, I checked it using both the
    UCD tables/files and the ICU Unicode Property Browser (which claims it's
    based on Unicode 4.0), both online. Barring any mistakes I have made,
    something would seem amiss.

    Where am I going with this? Basically, what I'm after is a clean, clear
    way to tell whether quotation marks and parentheses (plus the other
    bracketing characters such as '[' or '{') are opening or closing
    punctuation. That's the real question here! How would you do that
    using properties and categories?
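
    As a rough sketch of what such a test might look like, the General
    Category alone can be used to sort characters into opening and closing
    roles. This assumes Python's unicodedata module, and it assumes that
    Initial_Punctuation (Pi) can be treated as opening and Final_Punctuation
    (Pf) as closing, which does not hold in every language convention
    (German, for example, closes a quotation with U+201C). The name
    bracket_role is hypothetical.

```python
import unicodedata

# Assumption: Pi is grouped with Ps as "open" and Pf with Pe as "close".
OPENING = {"Ps", "Pi"}
CLOSING = {"Pe", "Pf"}

def bracket_role(ch: str) -> str:
    """Classify ch as 'open', 'close', or 'neither' by General Category."""
    category = unicodedata.category(ch)
    if category in OPENING:
        return "open"
    if category in CLOSING:
        return "close"
    # Anything else (e.g. Other_Punctuation) carries no direction here.
    return "neither"

for ch in "([{)]}\"\u201C\u201D":
    print(f"U+{ord(ch):04X}: {bracket_role(ch)}")
```

    Characters whose category carries no open or close distinction fall
    into "neither", so direction-neutral quotation marks would still need
    context to resolve.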

    Thanks for any replies,

    Yours,

    Elisha Berns
    e.berns@computer.org
    tel. (310) 556 - 8332
    fax (310) 556 - 2839


