On Mon, Mar 5, 2012 at 7:29 PM, Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:
> You can do that if you wish. This is part of the standard. Look at the
> existing canonical decomposition mappings in the UCD (or just look at
> them in the charts which display them). Note that this will not make
> any difference for all conforming Unicode processes.
>
> For example you can freely normalize texts to the NFD form (even if
> this form is not recommended in many interchange protocols like HTML).
>
> On 5 March 2012 at 18:33, Denis Jacquerye <moyogo_at_gmail.com> wrote:
>> Hi,
>>
>> Could the following be decomposed instead of being encoded as single characters?
>> COMBINING LATIN SMALL LETTER A WITH DIAERESIS
>> COMBINING LATIN SMALL LETTER O WITH DIAERESIS
>> COMBINING LATIN SMALL LETTER U WITH DIAERESIS
Philippe, I'm talking about the combining diacritics to be added in
the Combining Diacritical Marks Supplement block.
These are <comb>ä</comb>, etc.
I should have been clearer about which characters I meant, and
pointed out that, as currently proposed, they are not
decomposable. Besides, if they were decomposable, would they be
encoded as single characters?
See the proposal http://std.dkuug.dk/JTC1/SC2/WG2/docs/n4081.pdf or
the draft http://std.dkuug.dk/JTC1/SC2/WG2/docs/n4244.pdf where they
are encoded as single characters. Neither mentions decomposition.
My question really is whether they could instead be seen as
<comb>a</comb><comb>diaeresis</comb>, etc., where the shape of
<comb>diaeresis</comb> is contextual.
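
To make the distinction concrete, here is a minimal Python sketch using the
standard unicodedata module. It only illustrates what "has a canonical
decomposition" means in the UCD; it says nothing about how the proposed
characters will end up being encoded. The helper name
canonical_decomposition is hypothetical, and since the proposed characters
are not assumed to have code points here, the already-encoded U+0363
COMBINING LATIN SMALL LETTER A stands in as an example of a combining
letter with no decomposition mapping.

```python
import unicodedata


def canonical_decomposition(ch):
    """Return the canonical decomposition mapping of ch, or '' if it has none.

    Hypothetical helper for illustration only.
    """
    mapping = unicodedata.decomposition(ch)
    # Compatibility mappings carry a tag such as '<compat>'; they are not
    # canonical, so treat them as "no canonical decomposition".
    return "" if mapping.startswith("<") else mapping


a_diaeresis = "\u00E4"   # LATIN SMALL LETTER A WITH DIAERESIS
comb_small_a = "\u0363"  # COMBINING LATIN SMALL LETTER A (stand-in, see above)

# The precomposed base letter has a canonical mapping to a + U+0308.
print(repr(canonical_decomposition(a_diaeresis)))

# The combining letter has no mapping in the UCD.
print(repr(canonical_decomposition(comb_small_a)))

# NFD therefore splits the first and leaves the second untouched.
print(unicodedata.normalize("NFD", a_diaeresis) == "a\u0308")
print(unicodedata.normalize("NFD", comb_small_a) == comb_small_a)
```

A character encoded without a decomposition mapping stays atomic under every
normalization form, which is why the question has to be settled before
encoding rather than after.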
---- Denis Moyogo Jacquerye
Received on Mon Mar 05 2012 - 12:51:28 CST