From: Asmus Freytag (firstname.lastname@example.org)
Date: Fri Apr 17 2009 - 18:56:13 CDT
On 4/17/2009 4:30 PM, Doug Ewell wrote:
> Asmus Freytag <asmusf at ix dot netcom dot com> wrote:
>> Example: most traffic symbols like DEER CROSSING or SPEED LIMIT 30
>> should probably not be encoded as characters. The STOP sign or the
>> European CAUTION sign, however, are examples of common symbols, that
>> deserve status as characters. You find them as part of texts where
>> they retain their customary shape, but don't refer to traffic, but
>> are used in a generalized sense. Hence, they have become _common_
>> symbols.
> The stop sign, like "pictures of cows," is another canonical example
> presented in the WG2 "Principles and Procedures" document (updated
> less than a year ago) of what should *not* be encoded. It's
> interesting to see further evidence of how loosely the principles are
> applied, in spite of all the protests that UTC is following the same
> principles in encoding emoji that it followed two decades ago.
If that kind of thing amuses you, try reading the introduction to the
Unicode Standard. The early versions boldly proclaim many things off
limits that later happened, from 32-bit character codes to musical symbols.
I don't see this as problematic. Many of these changes are the direct
consequence of Unicode's success. Rather than shoehorn everybody and
everything mercilessly into the 1988 view of what a global, universal
character set should be, the developers of the standard have wisely
adapted to critical needs and allowed the standard to reflect the
experience gained in developing and implementing it. That's an
unquestioned strength of the Unicode Standard.
In that process, the principles have acted and continue to act as
valuable guide posts. Ideally, all coding problems and needs can be
covered within the boundaries demarcated by them. When that's not
possible, a critical and thorough evaluation is performed that looks at
whether a problem is important enough to address at all, but also
whether it should give rise to an exception, or to a reformulation of
the principles.
For those areas where users and implementers MUST be able to rely on
enforceable restrictions, you have the Unicode Stability Guarantees.
There you have critical rules that MUST NOT be violated by changes in
the standard. But there's a reason that principles and stability
guarantees are not one and the same thing.
>> It's not sufficient to just point at sets of symbols for that - you
>> also need to isolate which ones are _common_ symbols in each set,
>> according to the definition of this concept that I've proposed here.
> Unless they can be defined as "compatibility characters," in which
> case all of them must be encoded without question.
Unless the set has been approved as a compatibility character set, in
which case, the goal is, indeed, to cover it in full.
(That decision does not rest with the proposer, no matter how much you
would like to insinuate it.)
The "sets of symbols" I was addressing in that part of my message,
however, did not include compatibility character sets, but sets organized
by category or type of symbol, like ISO safety symbols, UI symbols, etc.
PS: I've removed the Emoji list from the cc, since this discussion did
not get started there, nor is it specific to the emoji proposals.