Re: Parsers for the UnicodeSet notation?

From: Steven R. Loomis <srl_at_icu-project.org>
Date: Wed, 23 Jul 2014 16:18:20 -0700

On 07/23/2014 03:28 PM, Roozbeh Pournader wrote:
> On Wed, Jul 23, 2014 at 3:23 PM, Eric Muller <emuller_at_adobe.com> wrote:
>
> > I would like to work with the exemplarCharacters data in the CLDR.
> > That uses the UnicodeSet notation. Is there somewhere a parser for
> > that notation, that would return me just the list of characters in
> > the set?
>
> Note that it's a set of strings, not characters.
>
> > I suspect that the exemplarCharacters use a restricted form of the
> > UnicodeSet notation (e.g. do not use property values). Is that
> > correct, and if so, what's the subset?
>
> I have an Apache-licensed parser in Python here:
> https://code.google.com/p/noto/source/browse/nototools/generate_website_data.py#180
>
Nice, you should get those CLDR folks to add a link! I'm cross-posting
this to cldr-users, which may be a more appropriate list.

Eric, to answer your second question: the TR35 spec does not say that
exemplars use a restricted form of the notation; see
http://unicode.org/repos/cldr/trunk/specs/ldml/tr35-general.html#ExemplarSyntax
In practice a restricted form is used, with ranges expanded, but the
spec does not guarantee this.
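
For illustration only, here is a rough sketch of what a parser for such a
restricted form might look like. It is not the parser linked above and not
ICU's UnicodeSet; the function name parse_simple_unicode_set and the exact
subset it accepts (literal characters, a-b ranges, {strings}, \uXXXX and
\UXXXXXXXX escapes, whitespace ignored) are assumptions. Anything else,
such as property values or nested sets, is rejected rather than guessed at.

```python
def parse_simple_unicode_set(pattern):
    """Parse a restricted UnicodeSet-like pattern into a set of strings.

    Hypothetical helper: supports only literals, a-b ranges, {strings},
    and backslash escapes. Raises ValueError on any other syntax.
    """
    pattern = pattern.strip()
    if not (pattern.startswith("[") and pattern.endswith("]")):
        raise ValueError("pattern must be enclosed in [ ]")
    body = pattern[1:-1]
    hexdigits = set("0123456789abcdefABCDEF")

    def read_char(i):
        """Return (character, next index) for one possibly escaped character."""
        ch = body[i]
        if ch != "\\":
            return ch, i + 1
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        esc = body[i + 1]
        if esc in "uU":
            width = 4 if esc == "u" else 8
            digits = body[i + 2:i + 2 + width]
            if len(digits) != width or not set(digits) <= hexdigits:
                raise ValueError("bad \\%s escape" % esc)
            return chr(int(digits, 16)), i + 2 + width
        if esc in "pPN":
            raise ValueError("property and name escapes are not supported")
        return esc, i + 2  # any other escaped character is taken literally

    result = set()
    prev = None            # last single character, usable as a range start
    pending_range = False  # True right after "x-"
    i = 0
    while i < len(body):
        ch = body[i]
        if ch.isspace():   # assumption: unescaped whitespace is ignored
            i += 1
            continue
        if ch in "[]^&$:":
            raise ValueError("unsupported syntax: %r" % ch)
        if ch == "{":      # multi-character string such as {ch}
            if pending_range:
                raise ValueError("a range cannot end in a string")
            i += 1
            chars = []
            while i < len(body) and body[i] != "}":
                if body[i].isspace():
                    i += 1
                    continue
                c, i = read_char(i)
                chars.append(c)
            if i >= len(body):
                raise ValueError("unterminated { }")
            i += 1         # skip the closing brace
            result.add("".join(chars))
            prev = None
            continue
        if ch == "-" and prev is not None and not pending_range:
            pending_range = True
            i += 1
            continue
        # A "-" that does not follow a single character is taken literally.
        c, i = read_char(i)
        if pending_range:
            if ord(c) < ord(prev):
                raise ValueError("range end precedes range start")
            result.update(chr(cp) for cp in range(ord(prev), ord(c) + 1))
            pending_range = False
            prev = None
        else:
            result.add(c)
            prev = c
    if pending_range:
        raise ValueError("range has no end character")
    return result
```

The result is a Python set of strings rather than of characters, which
matches Roozbeh's note above. Since the spec does not promise the
restricted form, failing loudly on unsupported syntax seems safer than
silently skipping it.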

-s

-- 
IBMer but all opinions are mine.
https://www.ohloh.net/accounts/srl295 // fingerprint @ https://ssl.icu-project.org/trac/wiki/Srl
_______________________________________________
Unicode mailing list
Unicode_at_unicode.org
http://unicode.org/mailman/listinfo/unicode

Received on Wed Jul 23 2014 - 18:20:12 CDT

