Re: Unicode encoding policy

From: Asmus Freytag
Date: Tue, 23 Dec 2014 22:08:32 -0800

On 12/23/2014 1:51 PM, Doug Ewell wrote:
> William_J_G Overington <wjgo underscore 10009 at btinternet dot com>
> wrote:
>> 5. Are the proposed characters in current use by the user community?
>> No
>> ----
>> This appears to be a major change in encoding policy.
>> This, in my opinion, is a welcome, progressive change in policy that
>> allows new characters for use in a pure electronic technology to be
>> added into regular Unicode without a requirement to first establish
>> widespread use by using an encoding within a Unicode Private Use Area.
> It is exactly the change I was worried about, the precedent I was afraid
> would be set.

Requiring long-term use of characters at an alternate code location has
always struck me as counterproductive, because the switch to the final
code point becomes disruptive at the very point where a character has
finally become established. That is in contrast to true "experimental"
use.

Therefore, it is useful to recognize that for some code points there can
be a critical mass of implementation support right from the moment of
publication.

This is definitely not the same as saying that any idea, however
half-baked, of a new symbol should be encoded 'on-spec' to see whether
it garners usage.

The "critical mass" of support is now assumed for currency symbols and
for some special symbols like emoji; it should also be granted to
additional types of symbols, punctuation and letters whenever there is
an "authority" that controls normative orthography or notation.

Whether this is for an orthography reform in some country or an addition
to the standard math symbols supported by AMS journals, such external
adoption can signify immediate "critical need" and "critical mass of
adoption" for the relevant characters.

In these cases, to require years of PUA code usage is, to repeat,
counterproductive. It doesn't alter the fact that the codes will
eventually be needed (unless one were to confidently expect failure of
some reform) and only leads to the creation of data in the meantime that
have to be converted or cannot be accessed reliably.
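
To make the conversion burden concrete, here is a minimal Python sketch
of the clean-up that interim PUA data would need once a character gets
its real code point. The mapping is purely an assumption for
illustration: U+E000 is a made-up interim slot, and the function names
are hypothetical. Only U+20B9 INDIAN RUPEE SIGN is a real assignment.

```python
import unicodedata

# Hypothetical interim mapping (an assumption, not a real convention):
# a private agreement used U+E000 until the character was encoded.
PUA_TO_STANDARD = {
    0xE000: 0x20B9,  # assumed PUA slot -> U+20B9 INDIAN RUPEE SIGN
}


def is_private_use(ch: str) -> bool:
    """True if ch has General_Category Co (private use)."""
    return unicodedata.category(ch) == "Co"


def migrate(text: str) -> tuple[str, list[str]]:
    """Convert known PUA code points; report any that remain.

    The leftovers are exactly the data that "cannot be accessed
    reliably": without the private agreement their meaning is unknown.
    """
    converted = text.translate(PUA_TO_STANDARD)
    leftovers = sorted({ch for ch in converted if is_private_use(ch)})
    return converted, leftovers


if __name__ == "__main__":
    converted, leftovers = migrate("Price: \ue000500 \ue001")
    print(converted.encode("unicode_escape").decode("ascii"))
    print([f"U+{ord(ch):04X}" for ch in leftovers])
```

Every store of text written during the interim period would need a pass
like this, and any PUA code point missing from the table stays opaque.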

A clear-cut recognition by the UTC (and WG2) of this particular dynamic
(beyond currency codes) would be helpful -- particularly as Unicode has
matured to the point of being the only game in town. The current
methodology of researching typeset data is well suited to the encoding
of existing or historic practice, but ill-suited to dealing with ongoing
development of scripts and symbol sets.

Taking this new stance makes it easier to draw a contrast with the
attempts of hobbyists, enthusiasts and individual tinkerers to invent a
better world through symbols or new letters. These latter cases lack
both "critical need" and "critical mass" unless they are first adopted
by much larger (and/or more authoritative) groups of users.

There is an inherent risk that large groups of users can follow "fads"
that require certain symbols that see huge usage for a while and then
get abandoned. While this is hard to predict, it is not that different
from historical changes in writing systems - even if the trends there
played out over longer time frames.

>> I feel that it is now therefore possible to seek encoding of symbols,
>> perhaps in abstract emoji format and semi-abstract emoji format, so as
>> to implement a system for communication through the language barrier
>> by whole localizable sentences, with that system designed by
>> interested people without the need to produce any legacy data that is
>> encoded using an encoding within a Unicode Private Use Area.
> Sadly, I can no longer state with any confidence that such a proposal is
> out of scope for Unicode, as I tried to do for a decade or more.
> --
> Doug Ewell | Thornton, CO, USA |

Received on Wed Dec 24 2014 - 00:10:01 CST
