Re: Is it safe to dig into comment contents of PropList.txt?

From: Markus Scherer <markus.icu_at_gmail.com>
Date: Wed, 6 Nov 2013 08:26:14 -0800

On Wed, Nov 6, 2013 at 2:43 AM, Steffen Daode <sdaoden_at_gmail.com> wrote:

> |TAB is "printable" (for the isprint() macro in standard C libraries)
> because
> |it has a whitespace property, even if its general category is very weakly
>
> Nope, according to POSIX, Vol. 1: Base Definitions, 7.3.1 LC_CTYPE ([1]):
>
> print
> Define characters to be classified as printable characters,
> including the <space>.
>
> In the POSIX locale, all characters in class graph shall be
> included; no characters in class cntrl shall be included.
>
> In a locale definition file, characters specified for the
> keywords upper, lower, alpha, digit, xdigit, punct, graph, and
> the <space> are automatically included in this class. No
> character specified for the keyword cntrl shall be specified.
>

There is a Unicode spec for these properties:
http://www.unicode.org/reports/tr18/#Compatibility_Properties

ICU should be implementing that, for example
[:print:]<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3Aprint%3A%5D&g=>
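To make the TR18 definitions concrete, here is a minimal Python sketch of the
compatibility properties as I read them in that report: print is graph plus
blank, minus cntrl. The helper names (is_cntrl, is_blank, is_graph, is_print)
and the hard-coded WHITE_SPACE set are assumptions made for illustration; this
is not ICU's implementation, and results depend on the Unicode version bundled
with the Python interpreter.

```python
import unicodedata

# White_Space code points as listed in PropList.txt (Unicode 6.3 and later).
# Hard-coded here because the unicodedata module does not expose this property.
WHITE_SPACE = frozenset(
    [*range(0x0009, 0x000E), 0x0020, 0x0085, 0x00A0, 0x1680,
     *range(0x2000, 0x200B), 0x2028, 0x2029, 0x202F, 0x205F, 0x3000]
)


def is_cntrl(ch):
    """TR18 cntrl: General_Category=Control (Cc)."""
    return unicodedata.category(ch) == "Cc"


def is_blank(ch):
    """TR18 blank: General_Category=Space_Separator (Zs) plus TAB."""
    return unicodedata.category(ch) == "Zs" or ch == "\t"


def is_graph(ch):
    """TR18 graph: everything except White_Space, Cc, Cs and Cn."""
    if ord(ch) in WHITE_SPACE:
        return False
    return unicodedata.category(ch) not in ("Cc", "Cs", "Cn")


def is_print(ch):
    """TR18 print: graph plus blank, with cntrl removed."""
    return (is_graph(ch) or is_blank(ch)) and not is_cntrl(ch)
```

Under these definitions is_blank("\t") is true but is_print("\t") is false,
because TAB has General_Category=Cc and cntrl is subtracted from print. That
matches the POSIX text quoted above.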

markus
Received on Wed Nov 06 2013 - 10:29:21 CST