Re: The Unicode Standard and ISO from Marcel Schneider via Unicode on 2018-06-07 (Unicode Mail List Archive)

From: Marcel Schneider via Unicode <unicode_at_unicode.org>
Date: Thu, 7 Jun 2018 13:25:22 +0200 (CEST)

On Thu, 17 May 2018 09:43:28 -0700, Asmus Freytag via Unicode wrote:
>
> On 5/17/2018 8:08 AM, Martinho Fernandes via Unicode wrote:
> > Hello,
> >
> > There are several mentions of synchronization with related standards in
> > unicode.org, e.g. in https://www.unicode.org/versions/index.html, and
> > https://www.unicode.org/faq/unicode_iso.html. However, all such mentions
> > never mention anything other than ISO 10646.
>
> Because that is the standard for which there is an explicit understanding by all involved
> relating to synchronization. There have been occasionally some challenging differences
> in the process and procedures, but generally the synchronization is being maintained,
> something that's helped by the fact that so many people are active in both arenas.

Perhaps cause and effect run the other way here. I think so many people are active
in both arenas because there is a strong will to keep the two standards in sync.

If similar policies existed for other standards, notably ISO/IEC 14651 (collation) and
ISO/IEC 15897 (locale data), ISO/IEC 10646 would be far from alone in the field of
Unicode-ISO/IEC cooperation.

>
> There are really no other standards where the same is true to the same extent.
> >
> > I was wondering which ISO standards other than ISO 10646 specify the
> > same things as the Unicode Standard, and of those, which ones are
> > actively kept in sync. This would be of importance for standardization
> > of Unicode facilities in the C++ language (ISO 14882), as reference to
> > ISO standards is generally preferred in ISO standards.
> >
> One of the areas the Unicode Standard differs from ISO 10646 is that its conception
> of a character's identity implicitly contains that character's properties - and those are
> standardized as well and alongside of just name and serial number.

This is probably why, to date, ISO/IEC 10646 features character properties by including
normative references to the Unicode Standard, Standard Annexes, and the UCD.
Bidi mirroring, for example, is covered by ISO/IEC 10646, which specifies in clause 15.1:

“[…] The list of these characters is determined by having the ‘Bidi_Mirrored’ property
set to ‘Y’ in the Unicode Standard. These values shall be determined according to
the Unicode Standard Bidi Mirrored property (see Clause 2).”
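
To illustrate what deferring to that property means in practice, here is a minimal
sketch in Python. It assumes the standard library's unicodedata module, which reports
Bidi_Mirrored from whichever UCD version the interpreter bundles; the helper name
is_bidi_mirrored is made up for this example and appears in neither standard.

```python
import unicodedata


def is_bidi_mirrored(ch: str) -> bool:
    """Return True if ch has Bidi_Mirrored=Y in the bundled UCD (hypothetical helper)."""
    # unicodedata.mirrored() returns 1 for mirrored characters and 0 otherwise.
    return unicodedata.mirrored(ch) == 1


if __name__ == "__main__":
    # Parentheses and brackets are mirrored in right-to-left text; letters are not.
    for ch in "([a":
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {is_bidi_mirrored(ch)}")
```

The result depends on the Unicode version shipped with the interpreter
(unicodedata.unidata_version), not on any particular edition of ISO/IEC 10646.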

>
> Many of these properties have associated with them algorithms, e.g. the bidi algorithm,
> that are an essential element of data interchange: if you don't know which order in
> the backing store is expected by the recipient to produce a certain display order, you
> cannot correctly prepare your data.
>
> There is one area where standardization in ISO relates to work in Unicode that I can
> think of, and that is sorting.

Yet UCA conforms to ISO/IEC 14651 (where UCA is cited as entry #28 in the bibliography).
The reverse relationship is irrelevant and would be unfair, given that the Consortium
has so far refused to synchronize UCA with ISO/IEC 14651.

There is a need for action here.

> However, sorting, beyond the underlying framework,
> ultimately relates to languages, and language-specific data is now housed in CLDR.
>
> Early attempts by ISO to standardize a similar framework for locale data failed, in
> part because the framework alone isn't the interesting challenge for a repository,
> instead it is the collection, vetting and management of the data.

For another part, it failed because the Consortium refused to cooperate, despite
repeated proposals for a merger of both instances.

>
> The reality is that the ISO model and its organizational structures are not well suited
> to the needs of many important areas where some form of standardization is needed.
> That's why we have organizations like IETF, W3C, Unicode etc.
>
> Duplicating all or even part of their effort inside ISO really serves nobody's purpose.

An undesirable side effect of not merging Unicode with ISO/IEC 15897 (locale data) is
that many competent contributors are diverted from monitoring CLDR data, especially
for French.

Here too there is a huge need for action.

Thanks in advance.

Marcel
Received on Thu Jun 07 2018 - 06:33:23 CDT
