Re: Unicode certification - was RE: Dublin Conference:

From: Tex Texin
Date: Thu Jul 25 2002 - 01:14:35 EDT


Why couldn't a checklist be established for each of the functionalities
that you mention, which a product could score itself against for
conformance, over a stated range of supported characters?

Recently, I did a search for a product, and it was difficult to know
which scripts were supported and whether it had the Unicode capabilities
I was concerned with. It would have been nice if there had been a statement
of self-compliance that indicated whether or not they supported:

Character ranges-
 broken into reasonable subgroups:
Preservation of Unicode characters
Combining characters:
Normalization forms:
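
As a rough illustration only, a self-scoring checklist along these lines
might be sketched as below. Everything here is an assumption made for the
example: the product is modeled as a plain str -> str function, the sample
strings and subgroup names are hypothetical, and the three checks are
simplified stand-ins for real conformance requirements.

```python
import unicodedata

# Hypothetical sample strings, one per character-range subgroup.
SAMPLES = {
    "Latin": "caf\u00e9",
    "Greek": "\u03b1\u03b2\u03b3",
    "CJK": "\u6f22\u5b57",
}

# 'e' followed by U+0301 COMBINING ACUTE ACCENT (a decomposed e-acute).
DECOMPOSED = "e\u0301"


def preserves(process, text):
    """True if the product returns exactly the code points it was given."""
    return process(text) == text


def self_score(process):
    """Score a product (assumed to be a str -> str function) against the checklist."""
    report = {}
    # Character ranges, broken into subgroups.
    for name, sample in SAMPLES.items():
        report[f"range:{name}"] = preserves(process, sample)
    # Combining characters: the base + mark sequence must survive unchanged.
    report["combining"] = preserves(process, DECOMPOSED)
    # Normalization: output must stay canonically equivalent to the input.
    report["normalization"] = (
        unicodedata.normalize("NFC", process(DECOMPOSED))
        == unicodedata.normalize("NFC", DECOMPOSED)
    )
    return report


def latin1_only(text):
    """A stand-in for a product that silently squeezes text through Latin-1."""
    return text.encode("latin-1", "replace").decode("latin-1")


if __name__ == "__main__":
    for item, passed in self_score(latin1_only).items():
        print(f"{item}: {'yes' if passed else 'no'}")
```

A product that really is Latin-1 underneath would be expected to pass only
the Latin line of such a report, which is the kind of statement a buyer
could check before purchasing.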

I think if there were such a checklist with suitable definitions and/or
conformance requirements, vendors that had done the work to support
Unicode properly would be glad to declare it in their product specs or

And there are probably many product developers who think they support
Unicode but in fact don't, and such a checklist would help make them
aware of what else they need to do.

And if they misadvertised or reported incorrectly, I am sure their
customers would be glad to inform them of their oversight through their
support lines or by announcement to the appropriate user group lists.

Sure, there will be some grey areas based on particular product
functionality, but it would still be a far better situation than we have

David Starner wrote:
> At 11:24 AM 7/24/02 -0700, David Possin wrote:
> >It would be interesting and helpful to be able to find out if a product
> >is Unicode-compliant before purchasing it.
> The problem is too broad to be neatly solved. It's not like compliance
> to the Ada standard, where you can just write a bunch of test code for
> all compilers. You'd have to adapt the tests for each program including
> writing code customized for each interpreter or compiler.
> And after you've done this, you know that it can round-trip arbitrary
> Unicode and that it treats the characters as Unicode characters and not,
> say, Latin-1 or SJIS. You don't know whether it can handle combining
> characters or not, or whether or not it can handle any particular characters
> beyond just not messing with them. A program could pass with debilitating
> flaws for any real Unicode use, and still be Unicode compliant. Seems like a
> lot of work for little gain.
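
The gap described above, between round-tripping text and actually handling
combining characters, can be shown with a small sketch. The function names
are hypothetical, and the cluster logic is a deliberate simplification
(it attaches characters with a nonzero combining class to the preceding
base character), not full grapheme-cluster segmentation.

```python
import unicodedata


def round_trips(text):
    """True if the text survives a UTF-8 encode/decode unchanged."""
    return text.encode("utf-8").decode("utf-8") == text


def naive_reverse(text):
    """Reverse by code point, ignoring combining marks."""
    return text[::-1]


def cluster_reverse(text):
    """Reverse while keeping each combining mark attached to its base.

    Simplified: any character with a nonzero canonical combining class
    is glued to the character before it.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return "".join(reversed(clusters))


# 'a', then 'e' followed by U+0301 COMBINING ACUTE ACCENT.
sample = "ae\u0301"

if __name__ == "__main__":
    print(round_trips(sample))                      # storage is lossless
    print(naive_reverse(sample) == "\u0301ea")      # accent detached from 'e'
    print(cluster_reverse(sample) == "e\u0301a")    # accent stays on 'e'
```

Both reversal functions sit on top of storage that round-trips perfectly,
yet only one of them keeps the accent on its base letter, which is why a
round-trip test alone says little about real Unicode support.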

Tex Texin   cell: +1 781 789 1898
Xen Master                
Making e-Business Work Around the World
