Re: Comparing Raw Values of the Age Property from Anshuman Pandey via Unicode on 2017-05-22 (Unicode Mail List Archive)

From: Anshuman Pandey via Unicode <unicode_at_unicode.org>
Date: Mon, 22 May 2017 17:19:08 -0500

I performed several operations on DerivedAge.txt a few months ago. One basic example is here:

https://pandey.github.io/posts/unicode-growth-UCD-python.html

If you provide some more insight into your objective, I might be able to help.

I would recommend against relying on the order of the data. Instead, parse the individual entries to obtain the 'Age' property and compare the parsed values.
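
A minimal sketch of that approach in Python, under some assumptions that UAX #44 may not guarantee: short values look like "major.minor" with FULL STOP as the divider, long aliases look like "V10_0", and "NA"/"Unassigned" is the only non-numeric value. The helper names (age_key, parse_ages) and the abbreviated SAMPLE text are hypothetical and only for illustration. In practice you would read the real DerivedAge.txt.

```python
import re

# Abbreviated lines in the DerivedAge.txt layout (illustration only;
# read the real file from the UCD in practice).
SAMPLE = """\
# code point or range ; Age value # comment
0000..001F    ; 1.1 # controls
0591..05A1    ; 2.0
20AC          ; 2.1 # EURO SIGN
0860..086A    ; 10.0
"""

# Assumption: values are "major.minor" (short) or "Vmajor_minor" (long alias).
_AGE = re.compile(r"V?(\d+)[._](\d+)")


def age_key(value):
    """Turn a raw Age value into a tuple of ints that compares numerically."""
    if value in ("NA", "Unassigned"):
        # Design choice for this sketch: unassigned sorts before any version.
        return ()
    match = _AGE.fullmatch(value)
    if match is None:
        raise ValueError(f"unrecognized Age value: {value!r}")
    return tuple(int(group) for group in match.groups())


def parse_ages(text):
    """Yield (first, last, age) for each data line, ignoring line order."""
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        code_points, age = (field.strip() for field in data.split(";"))
        first, _, last = code_points.partition("..")
        # A single code point has no ".." part, so reuse the first value.
        yield int(first, 16), int(last or first, 16), age


entries = list(parse_ages(SAMPLE))
ages_in_order = sorted({age for _, _, age in entries}, key=age_key)
```

Here ages_in_order is sorted by numeric version rather than by string comparison, so "10.0" lands after "2.1". Raising an error on an unrecognized value is deliberate: if a future release changes the format, the script fails loudly instead of mis-sorting silently.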

All my best,
Anshu

> On May 22, 2017, at 4:44 PM, Richard Wordingham via Unicode <unicode_at_unicode.org> wrote:
>
> Given two raw values of the Age property, defined in UCD file
> DerivedAge.txt, how is a computer program supposed to compare them?
> Apart from special handling for the value "Unassigned" and its short
> alias "NA", one used to be able to compare short values against short
> values and long values against long values by simple string
> comparison. However, now we are coming to Version 10.0 of Unicode,
> this no longer works - "1.1" < "10.0" < "2.0".
>
> There are some possibilities - the values appear in order in
> PropertyValueAliases.txt and in DerivedAge.txt. However, I can find no
> relevant guarantees in UAX#44. I am looking for a solution that can be
> driven by the data files, rather than requiring human thought at every
> version release. Can one rely on the FULL STOP being the field
> divider, and can one rely on there never being any grouping characters
> in the short values? Again, I could find no guarantees.
>
> Richard.
Received on Mon May 22 2017 - 17:19:27 CDT