Re: NamesList.txt as data source

From: Philippe Verdy <>
Date: Sun, 27 Mar 2016 23:04:55 +0200

On 27 March 2016 at 20:47, "Doug Ewell" <> wrote:
> Asmus Freytag wrote:
>> Nobody disputes that subheaders are informative. However, subheaders
>> do not define a character property.
> Janusz was making a point that the CLDR data sometimes treats them as
> such, or at least as a kind of supplementary property.

I'm very curious about where CLDR data depends on these subheaders or other
annotations in NamesList.txt...

Subheaders can at most serve as named anchors that split a normative block
into several subparts (sometimes with several parts under the same heading),
but these subblocks are not normative, notably because they are not
correlated with subblocks in other blocks. There is not even any guarantee
that the characters in these subblocks share some basic property, not even
a script or a general category. They are just anchors for talking about
subblocks, and relate to the discussions that occurred before these
characters were encoded.
If new characters are added later, these existing subblocks won't be
sufficient. The new characters will be added in any convenient available
range or in a new block. If needed, even these subblocks may be
subdivided and thus renamed. None of them are stable.

For CLDR algorithms and data, these headings are not necessary and not
used. Instead, character ranges or sets are used, specifying the characters
directly, or one or more of their properties in combination, but not this
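
To illustrate what "ranges or sets, or properties in combination" can look
like in practice, here is a minimal Python sketch. It is not CLDR code; the
helper names (chars_in_ranges, select) and the BASIC_LATIN constant are
made up for this example, and only the standard unicodedata module is
assumed.

```python
import unicodedata

def chars_in_ranges(ranges):
    """Yield every character in the given inclusive code point ranges."""
    for first, last in ranges:
        for cp in range(first, last + 1):
            yield chr(cp)

def select(ranges, category_prefix=None):
    """Return the characters in `ranges`, optionally filtered by a
    General_Category prefix (e.g. "Lu", or "L" for all letters)."""
    return {
        ch
        for ch in chars_in_ranges(ranges)
        if category_prefix is None
        or unicodedata.category(ch).startswith(category_prefix)
    }

# Hypothetical example: a range combined with a property, with no
# reference to any NamesList.txt subheader.
BASIC_LATIN = [(0x0000, 0x007F)]
upper = select(BASIC_LATIN, "Lu")
```

The selection depends only on code points and a character property, so it
stays valid even if a subheader in the code charts is renamed or split.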

I just hope that there's no algorithm depending on them and treating them
as properties (for example in regular expressions with a custom property).
If an algorithm must be created, it should define its own named subsets to
define its own properties (many UAX algorithms do that constantly, e.g.
for text breaking, Bidi, or text transforms).
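
As a rough sketch of an algorithm defining its own named subsets, the
Python below assigns each character to a class owned by the algorithm
itself. The class names (DIGIT, SPACE, OPEN, CLOSE, OTHER) and the
classify function are hypothetical and not taken from any UAX; the rules
rely only on General_Category as exposed by the standard unicodedata
module.

```python
import unicodedata

# Hypothetical classes for an illustrative algorithm. Each class is
# defined by the algorithm itself from a real character property,
# not from a code chart subheader.
CLASS_RULES = [
    ("DIGIT", lambda ch: unicodedata.category(ch) == "Nd"),
    ("SPACE", lambda ch: unicodedata.category(ch) == "Zs"),
    ("OPEN", lambda ch: unicodedata.category(ch) == "Ps"),
    ("CLOSE", lambda ch: unicodedata.category(ch) == "Pe"),
]

def classify(ch):
    """Return the name of the first matching class, or "OTHER"."""
    for name, matches in CLASS_RULES:
        if matches(ch):
            return name
    return "OTHER"

classes = [classify(ch) for ch in "(4 a)"]
```

Because the classes belong to the algorithm, they can be documented and
versioned with it, independently of how the code charts are laid out.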
Received on Sun Mar 27 2016 - 16:06:11 CDT
