When a standard-conforming SQL-implementation concatenates two normalized
UCS strings, the result is required to be normalized as well (see
Unicode Standard Annex #15, Unicode Normalization Forms, Concatenation).
My question is: supposing the normalization forms (NF) of the two operands
differ, what should the NF of the result be?
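To illustrate why the concatenation rule matters at all, here is a minimal
sketch in Python. The language and its standard unicodedata module are my
choice for illustration only, not anything the proposal specifies, and the
helper name is_nf is hypothetical. It shows two strings that are each in
NFC whose plain concatenation is not.

```python
import unicodedata

def is_nf(form: str, s: str) -> bool:
    """Return True if s is already in the given normalization form."""
    return unicodedata.normalize(form, s) == s

left = "a"        # U+0061, in NFC
right = "\u0301"  # U+0301 COMBINING ACUTE ACCENT, in NFC on its own
joined = left + right

# Each operand is normalized, but the raw concatenation is not:
# NFC composes U+0061 U+0301 into the single code point U+00E1.
print(is_nf("NFC", left), is_nf("NFC", right), is_nf("NFC", joined))
print(unicodedata.normalize("NFC", joined) == "\u00e1")
```

So whatever NF is chosen for the result, the implementation has to
renormalize (at least around the join) rather than just append.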
In its present state, our proposal specifies the result by referring to the
following table:
Table A
=======
                 |Operand 2
Operand 1        |NFKD  NFKC  NFD   NFC
-----------------+------------------------
NFKD             |NFKD  NFKC  NFD   NFC
NFKC             |NFKC  NFKC  NFD   NFC
NFD              |NFD   NFD   NFD   NFC
NFC              |NFC   NFC   NFC   NFC
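As a rough sketch of how Table A might be applied, the Python below
transcribes it as a lookup and renormalizes the concatenation to the form
it gives. The names TABLE_A, FORMS and concat_table_a are hypothetical, and
using Python's unicodedata module is an assumption made for illustration.
The final check records an observation, not a rule stated in the proposal:
every cell equals whichever operand form comes later in the order
NFKD, NFKC, NFD, NFC.

```python
import unicodedata

# Order in which the forms appear in the table headings.
FORMS = ["NFKD", "NFKC", "NFD", "NFC"]

# Table A transcribed row by row: TABLE_A[operand1][operand2] -> result form.
TABLE_A = {
    "NFKD": {"NFKD": "NFKD", "NFKC": "NFKC", "NFD": "NFD", "NFC": "NFC"},
    "NFKC": {"NFKD": "NFKC", "NFKC": "NFKC", "NFD": "NFD", "NFC": "NFC"},
    "NFD":  {"NFKD": "NFD",  "NFKC": "NFD",  "NFD": "NFD", "NFC": "NFC"},
    "NFC":  {"NFKD": "NFC",  "NFKC": "NFC",  "NFD": "NFC", "NFC": "NFC"},
}

def concat_table_a(s1, form1, s2, form2):
    """Concatenate, then renormalize to the form Table A gives."""
    result_form = TABLE_A[form1][form2]
    return result_form, unicodedata.normalize(result_form, s1 + s2)

# Observation only: each cell is the operand form that ranks later in FORMS.
same_as_ranking = all(
    TABLE_A[a][b] == max(a, b, key=FORMS.index)
    for a in FORMS for b in FORMS
)
print(same_as_ranking)
```

For example, concat_table_a("\ufb01", "NFC", "x", "NFKC") gives NFC as the
result form, so the ligature U+FB01 in the first operand is left alone.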
It has been suggested that the following would be preferable:
Table B
=======
                 |Operand 2
Operand 1        |NFKD  NFKC  NFD   NFC
-----------------+------------------------
NFKD             |NFKD  NFKC  NFKD  NFKC
NFKC             |NFKC  NFKC  NFKD  NFKC
NFD              |NFKD  NFKD  NFD   NFC
NFC              |NFKC  NFKC  NFC   NFC
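A similar sketch for Table B, again in Python with hypothetical names
(TABLE_B, FORMS, concat_table_b) and with unicodedata assumed purely for
illustration. The check records an observation of mine about the table,
not a stated rule: the result is a compatibility (K) form exactly when at
least one operand is.

```python
import unicodedata

FORMS = ["NFKD", "NFKC", "NFD", "NFC"]

# Table B transcribed row by row: TABLE_B[operand1][operand2] -> result form.
TABLE_B = {
    "NFKD": {"NFKD": "NFKD", "NFKC": "NFKC", "NFD": "NFKD", "NFC": "NFKC"},
    "NFKC": {"NFKD": "NFKC", "NFKC": "NFKC", "NFD": "NFKD", "NFC": "NFKC"},
    "NFD":  {"NFKD": "NFKD", "NFKC": "NFKD", "NFD": "NFD",  "NFC": "NFC"},
    "NFC":  {"NFKD": "NFKC", "NFKC": "NFKC", "NFD": "NFC",  "NFC": "NFC"},
}

def concat_table_b(s1, form1, s2, form2):
    """Concatenate, then renormalize to the form Table B gives."""
    result_form = TABLE_B[form1][form2]
    return result_form, unicodedata.normalize(result_form, s1 + s2)

# Observation only: a K form results exactly when some operand is a K form.
k_rule_holds = all(
    ("K" in TABLE_B[a][b]) == ("K" in a or "K" in b)
    for a in FORMS for b in FORMS
)

# An NFC operand containing the ligature U+FB01, joined to an NFKC operand.
form, text = concat_table_b("\ufb01", "NFC", "x", "NFKC")
print(k_rule_holds, form, text == "fix")
```

The practical consequence is visible in the last lines: under Table B the
ligature in the NFC operand is folded to "fi" because the other operand
happened to be in NFKC.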
I have no firm opinion on this, and don't believe I could form one
without more practical experience than I'm ever likely to have. My very
tentative view, for what it's worth, rests on a preference for NFC
over NFKC.
Any offers?
Mike.
***********************************************************
J M Sykes Email: Mike.Sykes@acm.org
97 Oakdale Drive
Heald Green
CHEADLE
Cheshire SK8 3SN
UK Tel: (44) 161 437 5413
***********************************************************