Re: UAX44: loose matching of symbolic values and the `is` prefix

From: Asmus Freytag (c) <asmusf_at_ix.netcom.com>
Date: Mon, 6 Jun 2016 09:48:27 -0700

On 6/6/2016 9:09 AM, Markus Scherer wrote:
Interesting discussion!

ICU supports neither "is" nor "in" prefixes. I wasn't even aware that UAX #44 loose matching prescribes "is". ICU just implements what Property[Value]Aliases.txt say:

# Loose matching should be applied to all property names and property values, with
# the exception of String Property values. With loose matching of property names and
# values, the case distinctions, whitespace, hyphens, and '_' are ignored.

The prefixes seem gratuitous and confusing. For example, if I read UAX44-LM3 right, it would allow [:isscript=isgreek:].
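To make the difference concrete, here is a minimal sketch of the two matching rules side by side. It assumes that UAX44-LM3 strips an initial "is" after the other characters have been ignored; the function names are made up for illustration and are not ICU API.

```python
import re

def loose_key_aliases(name: str) -> str:
    """Loose key per the comment in the aliases files:
    ignore case, whitespace, hyphens, and '_'."""
    return re.sub(r"[\s_\-]+", "", name).lower()

def loose_key_lm3(name: str) -> str:
    """Same as above, plus dropping an initial "is" prefix
    (assumed reading of UAX44-LM3, not taken from ICU)."""
    key = loose_key_aliases(name)
    return key[2:] if key.startswith("is") else key

# Under the LM3 reading, both halves of "isscript=isgreek" lose the prefix.
print(loose_key_lm3("isscript"), loose_key_lm3("isgreek"))
# Under the aliases-file rule alone, the prefix stays in the key.
print(loose_key_aliases("isgreek"))
```

With the first rule only, "isgreek" would not compare equal to "Greek"; with the second, it would.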

We do support just [:Greek:] for scripts and [:L:] for general categories.

I would rather not add support for the prefixes in ICU.

markus

There is a difference between guaranteeing that "is" is not the leading part of a property value alias and supporting a match on it. I agree that requiring (or suggesting) such support is questionable, especially in light of what ICU does.

However, making sure that those who follow that convention can continue to do so with future aliases *is* reasonable.
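A guarantee of that kind could be checked mechanically whenever aliases are added. The following is only a sketch: the helper names are hypothetical, and the sample list is illustrative ("Is_Example" is an invented alias, not one from PropertyValueAliases.txt).

```python
import re

def starts_with_is(alias: str) -> bool:
    """True if the alias begins with "is" once case, whitespace,
    hyphens, and '_' are ignored."""
    key = re.sub(r"[\s_\-]+", "", alias).lower()
    return key.startswith("is")

def conflicting_aliases(aliases):
    """Return the aliases that would collide with an "is" prefix convention."""
    return [a for a in aliases if starts_with_is(a)]

# Illustrative input only; "Is_Example" is a made-up alias.
sample = ["Greek", "Grek", "L", "Letter", "Is_Example"]
print(conflicting_aliases(sample))
```

A check like this would only flag candidates for review; it says nothing about whether an implementation has to accept the prefix.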

A./

Received on Mon Jun 06 2016 - 11:48:46 CDT
