Re: Does Unicode 4.1 change NFC?

From: Peter Kirk ([email protected])
Date: Tue Apr 05 2005 - 03:33:26 CST

Next message: Raymond Mercier: "Re: Macrons"

Previous message: Marcin 'Qrczak' Kowalczyk: "Re: Does Unicode 4.1 change NFC?"
In reply to: Kenneth Whistler: "Re: Does Unicode 4.1 change NFC?"
Next in thread: Arcane Jill: "Re: Does Unicode 4.1 change NFC?"

On 05/04/2005 00:22, Kenneth Whistler wrote:

>John Burger asked:
>
>
>
>>>>The problem will of course come when new UCD data is fed into an old
>>>>normaliser.
>>>>
>>>>
>>>Actually, it will not. If a Unicode normalizer was a Unicode 4.0
>>>normalizer, it will *stay* a Unicode 4.0 normalizer.
>>>
>>>
>>Even if it is fed new UCD data?
>>
>>
>
>It depends on what Peter Kirk meant by a "normaliser" and
>by "UCD data".
>
>If by "normaliser" he means a normalizer generator that takes
>UCD data files as input and generates a normalizer process that
>corresponds to the version of UCD data files, then of course
>what you input matters.
>
>If by "normaliser" he means an already implemented normalizer
>process and by "new UCD data" he means text data corresponding
>to the new version of Unicode, then the behavior of the
>normalizer should not change.
>
>

What I mean is a program which makes a proper separation between program
and data, which implements the Unicode normalisation *algorithm* (for a
particular version of Unicode) but uses the Unicode character *data*, as
well as the text data to be normalised, as part of its input. I don't
know of any normalisation program which works in this way, and in this
case efficiency may override good programming practice - although it
should be possible to compile the UCD normalisation data in a way which
can be used efficiently. But I do know of other programs which
effectively update themselves automatically with the latest version of
the UCD.
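
A minimal sketch of the kind of program described above, in which the algorithm is fixed in code and the character data arrives as input. It is an illustration only, not a program anyone in this thread has written. The names load_ucd and nfd are hypothetical, the input is assumed to be lines in UnicodeData.txt format, and only canonical decomposition (NFD) is covered; Hangul syllables and the composition step needed for NFC are left out.

```python
def load_ucd(lines):
    """Parse UnicodeData.txt-style lines into decomposition and combining-class tables."""
    decomp, ccc = {}, {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 6:
            continue
        cp = int(fields[0], 16)
        if fields[3] != "0":
            ccc[cp] = int(fields[3])          # field 3: canonical combining class
        mapping = fields[5]                   # field 5: decomposition mapping
        # Mappings that start with "<" are compatibility mappings, not used for NFD.
        if mapping and not mapping.startswith("<"):
            decomp[cp] = [int(h, 16) for h in mapping.split()]
    return decomp, ccc


def nfd(text, decomp, ccc):
    """Canonical decomposition driven entirely by the supplied tables."""
    out = []

    def expand(cp):
        # Apply canonical mappings recursively until none remain.
        if cp in decomp:
            for part in decomp[cp]:
                expand(part)
        else:
            out.append(cp)

    for ch in text:
        expand(ord(ch))

    # Canonical ordering: stable-sort each run of non-starters by combining class.
    result, run = [], []
    for cp in out:
        if ccc.get(cp, 0) == 0:
            result.extend(sorted(run, key=lambda c: ccc[c]))
            run = []
            result.append(cp)
        else:
            run.append(cp)
    result.extend(sorted(run, key=lambda c: ccc[c]))
    return "".join(map(chr, result))


# Four entries in UnicodeData.txt format, standing in for the real file.
SAMPLE_UCD = """\
0044;LATIN CAPITAL LETTER D;Lu;0;L;;;;;N;;;;0064;
0307;COMBINING DOT ABOVE;Mn;230;NSM;;;;;N;NON-SPACING DOT ABOVE;;;;
0323;COMBINING DOT BELOW;Mn;220;NSM;;;;;N;NON-SPACING DOT BELOW;;;;
1E0A;LATIN CAPITAL LETTER D WITH DOT ABOVE;Lu;0;L;0044 0307;;;;N;;;;1E0B;
""".splitlines()

decomp, ccc = load_ucd(SAMPLE_UCD)
decomposed = nfd("\u1E0A\u0323", decomp, ccc)
```

With these tables the input U+1E0A U+0323 comes out as U+0044 U+0323 U+0307. With empty tables the same code returns its input unchanged, which is the point at issue: in a program built this way the output depends on whichever data files it is given.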

Of course if the algorithm is changed from one version of Unicode to
another, as it was when NormalizationCorrections.txt was added to the
standard, then the program needs to be updated, and the results of using
the new UCD data with the old algorithm are unlikely to be correct. But
from 4.0.0 to 4.1.0 there has not, I think, been an advertised change to
the algorithm, and so people might expect the normalisation program to
continue to work. I agree that they should test it before use with a new
version of Unicode, but I don't believe that all programmers are as
careful as Doug and Jill in such matters.
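
One way to do that testing is to run the normaliser against the NormalizationTest.txt file published with each version of the UCD. The sketch below assumes that file's usual layout (five semicolon-separated columns of hex code points, "#" comments, "@Part" headers) and checks only the NFC invariants. It uses Python's built-in unicodedata module as the normaliser under test; the names parse_test_line and nfc_failures are hypothetical, and the three sample lines stand in for the full file.

```python
import unicodedata


def parse_test_line(line):
    """Return the five columns of a test line as strings, or None for non-data lines."""
    data = line.split("#", 1)[0].strip()
    if not data or data.startswith("@"):
        return None
    fields = data.split(";")[:5]
    return ["".join(chr(int(h, 16)) for h in f.split()) for f in fields]


def nfc_failures(lines, normalise=lambda s: unicodedata.normalize("NFC", s)):
    """Collect the test lines for which the NFC invariants do not hold."""
    failures = []
    for line in lines:
        cols = parse_test_line(line)
        if cols is None:
            continue
        c1, c2, c3, c4, c5 = cols
        ok = (c2 == normalise(c1) == normalise(c2) == normalise(c3)
              and c4 == normalise(c4) == normalise(c5))
        if not ok:
            failures.append(line)
    return failures


# Sample lines in NormalizationTest.txt format: source;NFC;NFD;NFKC;NFKD;
SAMPLE_TESTS = [
    "@Part1 # Character by character test",
    "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE",
    "00C5;00C5;0041 030A;00C5;0041 030A; # LATIN CAPITAL LETTER A WITH RING ABOVE",
    "212B;00C5;0041 030A;00C5;0041 030A; # ANGSTROM SIGN",
]

failures = nfc_failures(SAMPLE_TESTS)
```

An empty failure list says only that the normaliser agrees with the test file it was given. The version of the data compiled into Python's own normaliser can be read from unicodedata.unidata_version, so a mismatch between that and the version of the test file is at least visible before the results are trusted.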

There is a particular danger with the new fashion of programs
automatically updating themselves over the Internet - and sometimes
breaking themselves in the process, as I have discovered to my cost.

-- 
Peter Kirk
[email protected] (personal)
[email protected] (work)
http://www.qaya.org/


This archive was generated by hypermail 2.1.5 : Tue Apr 05 2005 - 03:40:12 CST