Re: What is the principle?

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Mar 31 2004 - 16:39:00 EST

Next message: jcowan@reutershealth.com: "Doing Markup in Plain Text: A Modest Proposal for Planes 4-B of Unicode"

Previous message: Kenneth Whistler: "Re: What is the principle?"
In reply to: Ernest Cline: "Re: What is the principle?"
Next in thread: Mark E. Shoulson: "Re: What is the principle?"
Reply: Mark E. Shoulson: "Re: What is the principle?"

From: "Ernest Cline" <ernestcline@mindspring.com>
> I'd have to take the time to list them, but a quick glance convinces
> me that there are at most several hundred combinations that would
> need to be supported if we limit things to just those combinations
> already in use. (it might take more, if for example all 256 potential
> combining classes were supported instead of the 26 listed in
> UCD.html), At 128 characters per combination plus more for a
> few that might need them, it should prove possible to handle this
> in 1 or 2 planes.

This seems highly excessive. We already have plenty of PUA space. All we
need is a standard way (file format? protocol?) to transport PUA character
properties, and possibly to encode a reference (URI?) to the definition file or
service. If Unicode does not want to do this job, it could at least participate
in such independent development by commenting on the protocol/format used to
encode these properties (notably to make sure that the system remains extensible
and can encode new properties that may be added later).
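
To make the idea more concrete, here is a minimal sketch in Python of what such
a definition file could look like and how an application might load it. The
namespace URI, the element and attribute names, and the function names are all
invented for illustration; no such format exists today. Unknown attributes are
kept rather than rejected, so that properties added later survive a round trip.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and vocabulary, invented for this sketch.
NS = "http://example.org/pua-properties/1.0"

SAMPLE = """
<puaProperties xmlns="http://example.org/pua-properties/1.0" unicodeVersion="4.0">
  <char cp="E000" name="EXAMPLE LETTER A" gc="Lo" ccc="0" bidi="L"/>
  <char cp="E001" name="EXAMPLE COMBINING MARK" gc="Mn" ccc="230" bidi="NSM"
        futureProperty="kept-as-is"/>
</puaProperties>
"""


def is_private_use(cp):
    """True if cp lies in one of the three private use ranges of Unicode."""
    return (0xE000 <= cp <= 0xF8FF
            or 0xF0000 <= cp <= 0xFFFFD
            or 0x100000 <= cp <= 0x10FFFD)


def load_pua_properties(xml_text):
    """Return {code point: {property name: value}} from a definition document."""
    root = ET.fromstring(xml_text)
    table = {}
    for elem in root.findall(f"{{{NS}}}char"):
        props = dict(elem.attrib)      # keep every attribute, known or not
        cp = int(props.pop("cp"), 16)  # code point is written in hexadecimal
        if not is_private_use(cp):
            raise ValueError(f"U+{cp:04X} is not a private use code point")
        table[cp] = props
    return table


PUA_TABLE = load_pua_properties(SAMPLE)
```

The property values are kept as strings here; a real format would have to say
how each property is typed and which Unicode version the defaults refer to.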

This would work in relation with the evolution of the Unicode standard itself
(versioning), which could be handled correctly (though less efficiently) through
a sort of emulation layer that would "mimic" the behavior of newly standardized
characters and properties. I don't expect every application to be able to
interpret this protocol or implement the emulation layer, but at least it
becomes possible to create less ambiguous, interoperable solutions based on
other existing standards (that's why I think that, if such a separate
development is created, it should be based on the most advanced
interoperability technologies of today, notably XML and its schemas and
namespaces).
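
As a rough illustration of what I mean by an emulation layer, here is a small
Python sketch. It assumes the application has some table of externally supplied
properties (the PUA_OVERRIDES dictionary below is hypothetical sample data, as
are the two function names) and consults it before falling back to whatever
Unicode data the platform ships with, here Python's unicodedata module.

```python
import unicodedata

# Hypothetical sample data, standing in for externally supplied definitions.
PUA_OVERRIDES = {
    0xE000: {"gc": "Lo", "ccc": 0},
    0xE001: {"gc": "Mn", "ccc": 230},
}


def general_category(ch, overrides=PUA_OVERRIDES):
    """General category of ch, preferring the supplied definitions."""
    entry = overrides.get(ord(ch))
    if entry is not None and "gc" in entry:
        return entry["gc"]
    # Fall back to the platform's built-in Unicode data.
    return unicodedata.category(ch)


def combining_class(ch, overrides=PUA_OVERRIDES):
    """Canonical combining class of ch, preferring the supplied definitions."""
    entry = overrides.get(ord(ch))
    if entry is not None and "ccc" in entry:
        return entry["ccc"]
    return unicodedata.combining(ch)
```

With this, general_category("\ue001") gives "Mn" instead of the default "Co",
while characters with no entry in the table behave exactly as before. The extra
lookup on every call is the efficiency cost mentioned above.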

You think this is overkill? Well, in the near future, I think it will be
difficult for applications to follow the evolution of the Unicode standard, and
version differences will soon cause a nightmare if there is no more formal way
to specify what is implicitly part of a Unicode version (which needs no complex
protocol negotiation, and is clearly identified by an identifier resolvable by
online services), and what can be supported as completely as possible by an
emulation layer. XML schemas, because they are versionable, can really help
here (notably because modern XML parsers can use local caches for definition
data, including local prebuilt implementations, which are the most efficient).

So I don't like the idea of adding more PUAs with other defaults. I much prefer
more freedom in the use of PUAs, and a way to turn what looks like a deviation
from the standard today into a conforming solution.

It will become more important with the remaining scripts to encode, simply
because we really lack the resources to produce any standard for them. What
this means is that the evolution of Unicode will soon become impossible without
experimentation and gradual integration with some interoperable services. With
the current standard stability policy, this need is even greater, because
further corrections of past errors will become nearly impossible (and so this
will stop any attempt to make significant changes to the standard itself).

It's clear that there are needs for PUAs today, just because Unicode is becoming
a universal standard for more and more applications. If this universal standard
blocks evolution, then others will want to develop independent standards, and
there will be a risk of splits caused by OS vendors themselves.

(See what happened 15 years ago to Unix, and how difficult it is today to
reunify what was initially a single standard. Thankfully, GNU and Linux have
been the motors of such reunification, because the other proprietary *nix
versions are now converging toward interoperability with Linux; but this
unification is probably still 15 to 20 years away, unless *nix vendors decide
to preemptively abandon some "dead" branches and keep only those that users
want and are ready to learn and support themselves.)

This archive was generated by hypermail 2.1.5 : Wed Mar 31 2004 - 17:27:55 EST