Re: more flexible pipeline for new scripts and characters

From: Asmus Freytag
Date: Wed, 16 Nov 2011 07:07:30 -0800

On 11/16/2011 6:37 AM, Peter Cyrus wrote:
> I guess what I'm proposing is that the proposed allocations be
> implemented, so that problems may be unearthed, even as the users
> accept that the standard is still only provisional.
Where "users" are programmers, such as is the case with certain
properties, such niceties are more or less understood by all parties
involved. Where users are the "public", as would be the case with
provisional implementations, you run into more issues.

First, not many users are in the business of creating "test" data that can be
thrown away. Most expect any implementation to be faithful (forever) to
their data. Second, absent a firm timeline in standardization (which
prevents "bad" proposals from being held back indefinitely) implementers
would not know when they can move their "provisional" implementations to
final status for a given script.

Most implementations support more than a single script, which would mix
provisional and non-provisional data.

Test implementations can be built any time, and whether you base them on
draft documents under ballot or provisional allocations under some more
formal scheme really makes no difference. (There's been a long-standing
suggestion that people "test" characters or scripts using the private
use area. This seems not to be favored, again, because all data created
under such a scheme become obsolete once a final encoding comes out.)

What would make a difference would be the ability to have some scripts
exist in a provisional state for really extended periods, to allow all
sorts of issues to be discovered in realistic use. That, however, runs
into the problem that "users" really tend to be impatient. Once
functional implementations exist, they want to create real data.

So far, for the vast majority of characters, the existing system has
proven workable. There are a small number of mistakes that are
discovered too late to be fixed invisibly, leaving a trail of
"deprecated" characters or "formal aliases" for character names.

Overall, the number of these is rather small, given the sheer size of
Unicode, even if one or another recent example appears to warrant more
systematic action.

Received on Wed Nov 16 2011 - 09:11:01 CST
