From: Doug Ewell
Date: Sat, 26 Jul 2014 09:26:21 -0600

fantasai <fantasai dot lists at inkedblade dot net> wrote:

> I think when you have no further context, it is better to have
> a guess informed by the character properties than one completely
> ignorant of them.

Some of the responses on this list already demonstrate a real risk of
Unicode adding a property like this. When Unicode publishes this sort of
data, even if it is meant to be informative, people tend to treat it as
normative and rigid, and as applying to all imaginable scenarios.

So even for a script like Latin, where the customary method of
justification is usually straightforward, you can have reasonable
counterexamples like Fraktur as described by Asmus. And then someone
might bring up a case where the rules might be different for different
languages (Philippe sort of alluded to this with Arabic). And then there
will be a historic example from the dawn of printing, and one from a
highly styled advertising sign, and so forth, and it will be hard to
tell when the "normal usage" line has been crossed. If necessary,
someone will trot out Latin letters on a neon sign, oriented normally
but written vertically down the sign. Meanwhile Unicode will be
criticized for not taking all the special cases into account.

It's a bit like the locale collections (CLDR is not alone here) that
specify a single date format for an entire country, as if all Americans
only ever write a short date as "m/dd/yy" and anyone who uses a
different format is employing some sort of weird hybrid system. The
presence of "m/dd/yy" in the locale collection appears normative and
rigid, and is often implemented in software as though that were the
intent, even if the data is meant to be descriptive and a first

