Re: General Category of Latin subscript small letters

From: Asmus Freytag (
Date: Mon Jan 31 2011 - 21:40:08 CST

On 1/31/2011 6:39 PM, Ken Whistler wrote:
> On 1/31/2011 5:22 AM, Karl Pentzlin wrote:
>> 1. Is there a specific reason that Ll was retained for the "older"
>> characters?
> Yes and no. Yes, if you consider that "Nobody asked for them to be
> changed" is a specific reason. No, if you consider that "Nobody had a
> substantial reason that they evoked as requiring that the existing
> values be retained" is not a specific reason to *retain* them.

"If it ain't broke don't fix it" is usually fine. Here the situation has
progressed to a point where the whole system of interrelated properties
is in danger of becoming impenetrable for anyone who has not grown into
it while it was being developed. (If you doubt me, just read your other
note :) ).
>> 2. Is it a good idea to propose to change Ll to Lm for the "older"
>> characters, just for uniformity?
> ... mucking with character properties involving casing is dangerous,
> because the tentacles extend beyond what is immediately obvious, so it
> is very easy to run afoul of unintended consequences. (I'll elaborate
> in the next note.)
> So the issue I see is what actual *problem* is being addressed by such a
> change? Is there an implementation issue, with something to be fixed? Or
> is this just a tidiness concern?

I can't speak for Karl, but what strikes me is that this is an area that
is not just a little "untidy" but one that has collected "traps".

There are two potential solutions to that problem.

    One would be to find a place to document the ins and outs of the
    property assignments and their "rationale", including all the
    unintended results if anyone were foolish enough to touch anything,
    whether now or ten or fifteen years from now, when a fresh set of
    minds might be maintaining the standard.

    The other would consist of taking a hard look at the actual
    assignments and simplifying the whole system by removing historical
    accidents for which there is no enduring technical rationale. In the
    process, it is better to collect exceptions in "exception buckets"
    like "Other_XXX" properties than to hide them among "regular"
    property values.

Other_XXX properties are designed to bridge the gap between first-class
properties and derived properties by collecting all the exceptions. For
example, Other_Math contains all the characters that should have the
derived "Math" property but can't have gc=Sm for whatever reason, so
that, from the context of mathematical usage, they constitute an exception.
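
As a rough illustration of that mechanism, here is a minimal Python sketch
of a derived property built from a General Category value plus an exception
bucket. The set OTHER_MATH_SAMPLE is only a small hand-picked subset of the
Other_Math entries in PropList.txt, and the function name is_math_sketch is
made up for this example; a real implementation would load the full list
from the data file.

```python
import unicodedata

# Assumption: a small illustrative subset of Other_Math from PropList.txt,
# not the complete list.
OTHER_MATH_SAMPLE = {
    0x005E,  # CIRCUMFLEX ACCENT, gc=Sk
    0x2032,  # PRIME, gc=Po
    0x2033,  # DOUBLE PRIME, gc=Po
    0x2034,  # TRIPLE PRIME, gc=Po
}


def is_math_sketch(ch: str) -> bool:
    """Derived Math = (gc == Sm) or listed in the Other_Math bucket."""
    return unicodedata.category(ch) == "Sm" or ord(ch) in OTHER_MATH_SAMPLE
```

The point of the design is visible in the last line: the regular case is a
single General Category test, and every exception sits in one named list
instead of being hidden among the regular values.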

Finally, even after simplification, the documentation of the properties
needs to be strengthened to make sure that future maintainers understand
the way they - and their interdependencies - are constructed.
>> 3. If additional Latin subscript small letters are proposed, is
>> Lm the preferred General Category value?
> O.k. on that one there is a clear answer: Yes.
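
For anyone who wants to check what a given implementation actually reports
for these letters, a small Python sketch follows. It assumes the characters
under discussion are those in the range U+2090..U+209C; that range and the
function name are my assumptions, not something stated in this thread. The
values printed depend on the version of the Unicode Character Database that
the Python build ships with.

```python
import unicodedata


def subscript_letter_categories(first=0x2090, last=0x209C):
    """Return (code point label, General Category) pairs for a range.

    The default range is an assumption made for this illustration.
    """
    return [
        (f"U+{cp:04X}", unicodedata.category(chr(cp)))
        for cp in range(first, last + 1)
    ]


if __name__ == "__main__":
    # The result reflects the Unicode version bundled with this Python.
    print("Unicode data version:", unicodedata.unidata_version)
    for label, gc in subscript_letter_categories():
        print(label, gc)
```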


To summarize: I believe that maintainability is in itself a feature.

I'll provide some replies to some of the details in the context of your
other note.

> --Ken
