Re: Canonical block names: spaces vs. underscores

From: Doug Ewell <doug_at_ewellic.org>
Date: Sat, 28 May 2016 09:51:55 -0600

Philippe Verdy wrote:

> However it must be clear that these aliases are case-sensitive by
> default ("Arabic_Presentation_Forms_A" is not the same as
> "Arabic_presentation_forms_A" but is the same as "Arabic
> Presentation_Forms A"), unless the block names property is normatively
> said to be case-insensitive (in that case the following are also
> aliases: "arabic_pf_a", "arabic pf a"). But adding case insensitivity
> has a cost, which is much higher than *only* allowing basic
> replacements of spaces and underscores [...]

UAX #44 says:

> 5.9.2 Matching Character Names
>
> UAX44-LM2. Ignore case, whitespace, underscore ('_'), and all medial
> hyphens except the hyphen in U+1180 HANGUL JUNGSEONG O-E.
>
> 5.9.3 Matching Symbolic Values
>
> UAX44-LM3. Ignore case, whitespace, underscore ('_'), hyphens, and any
> initial prefix string "is".

I read the words "ignore case" in these two rules to mean that case
should be ignored.
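
For what it's worth, here is a rough Python sketch of how I read
UAX44-LM3 when applied to block names. The function names loose_key and
loose_match are my own inventions for illustration and do not come from
UAX #44, and the code assumes ASCII-only values such as block names.

```python
import re

def loose_key(value: str) -> str:
    """Reduce a symbolic value to a comparison key (my reading of UAX44-LM3)."""
    # Ignore whitespace, underscores, and hyphens, then ignore case.
    key = re.sub(r"[\s_-]+", "", value).lower()
    # Ignore an initial "is" prefix.
    if key.startswith("is"):
        key = key[2:]
    return key

def loose_match(a: str, b: str) -> bool:
    # Apply the same reduction to both sides before comparing.
    return loose_key(a) == loose_key(b)

# Differs only in case: a match under LM3.
print(loose_match("Arabic_Presentation_Forms_A", "Arabic_presentation_forms_A"))
# Differs in case, spaces, and a hyphen: still a match.
print(loose_match("Arabic_Presentation_Forms_A", "arabic presentation-forms a"))
# A different block name: no match.
print(loose_match("Arabic_Presentation_Forms_A", "Arabic_Presentation_Forms_B"))
```

Under this sketch the case folding is a single lower() call on top of
the space and underscore stripping that is needed anyway.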

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸 
Received on Sat May 28 2016 - 10:52:35 CDT
