On Wed, 13 Mar 2013 21:07:06 +0000
"Whistler, Ken" <ken.whistler_at_sap.com> wrote:
> Richard Wordingham wrote:
>
> > One of the changes from Version 6.1.0 to 6.2.0 of the UCA
> > (UTS#10) was to change weights from being 16 bits to just being
> > general non-negative integers. Was this just to accommodate the
> > 4th weight in DUCET (scheduled for deletion in Version 6.3.0), or
> > is it intended to do away with the inconvenient concept of 'large
> > weights'?
> It has nothing to do with any putatively inconvenient concept of
> large weights.
'Large weights' make it difficult (I don't say impossible) to check
UCETs for well-formedness.
> It loosened up the spec, so that the spec itself didn't seem to be
> requiring that each of the first 3 levels had to be expressed with a
> full 16 bits in any collation element table.
I don't read it that way. But it did allow the 4th weight to go up to
10FFFF! (Last explicit weight in DUCET 6.2.0 is 2A600.)
> As a matter of convenience in generation and display, the DUCET has
> always been generated using a 4 digit hex notation for the first 3
> levels. So each could be conceived as a 16-bit number, as the
> original description of collation elements implied.
>
> But in practice (and by design), the ranges of secondary and tertiary
> weights were constrained. You only need 9 bits to express the
> secondary weights in the table and only 5 bits to express the
> tertiary weights.
DUCET and the CLDR root are not the only UCETs. I recall nothing that
stops a tailoring needing more bits for the secondary and tertiary
weights.
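
To make the bit counts above concrete, here is a minimal sketch in Python of how one might measure the width each level actually needs in a given table. The function names and the sample elements are hypothetical, made up for illustration; they are not taken from DUCET or from any UCA implementation.

```python
def bits_needed(weight: int) -> int:
    """Number of bits needed to store a non-negative integer weight."""
    if weight < 0:
        raise ValueError("collation weights are non-negative")
    # Treat a zero weight as occupying one bit so the result is never 0.
    return max(1, weight.bit_length())


def level_widths(elements):
    """Return the bit width required at each level.

    `elements` is a list of tuples, one weight per level, all the same length.
    """
    return [bits_needed(max(level)) for level in zip(*elements)]


# Hypothetical (primary, secondary, tertiary) elements, not real DUCET entries.
sample = [
    (0x15EF, 0x0020, 0x0002),
    (0x0000, 0x01FF, 0x001F),  # largest 9-bit secondary, largest 5-bit tertiary
    (0xFFFF, 0x0020, 0x0008),
]

print(level_widths(sample))   # widths for primary, secondary, tertiary
print(bits_needed(0x2A600))   # the last explicit 4th weight mentioned above
print(bits_needed(0x10FFFF))  # the upper limit mentioned for the 4th weight
```

Running the same check over a tailored table would show whether its secondary or tertiary weights still fit in 9 and 5 bits, or whether any weight has outgrown 16 bits.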
> And no, nobody is "threatening" you or anybody else with "having to
> accommodate 36 bit weights".
But I can no longer turn round and say that a 36 bit weight is illegal.
> It might make sense to include a note somewhere to indicate that some
> aspects of the algorithm do implicitly assume that weights cannot
> exceed 16-bit values without requiring other adjustments to the
> algorithm.
I'm listing them at the moment.
> Section 6.2 Large Weight Values already addresses the
> approach one would take if one needs to deal with more than 64K
> primary weight values, in a way which does not break the rest of the
> algorithm.
You've just reminded me that 'escape hatch' is broken for secondary
weights. It seems a shame to me that one can't parametrically tailor
DUCET to give a rhyming dictionary sort.
Richard.
Received on Wed Mar 13 2013 - 17:37:37 CDT