From: Peter Constable (firstname.lastname@example.org)
Date: Fri Apr 14 2006 - 17:17:23 CST
> From: Philippe Verdy [mailto:email@example.com]
> You have excluded this one in your list:
> * U+1B0E = <U+1B0D ; U+1B35>
While you might suppose that to be the case, I did not exclude anything from my list. My list was exactly the decompositions UTC had approved: nothing more, nothing less.
> My opinion is that this is part of the set
That may be your opinion, but it is not what is approved for standardization.
> One could still want the non ligatured form by explicitly coding
Based on the information provided from the user community, there is no "non ligatured form". There is the atomic character shown in the chart. There is no equivalence with sequences involving 1B0D and 1B35.
> And the Balinese name of U+1B0E is clear: it gives the interpretation for
> native Balinese users and they may be confused by the fact that, without this
> canonical decomposition, the "LETTER LA LENGA TEDUNG" (vocalic long l)
> will be considered different from "LETTER LA LENGA" (vocalic l) followed
> by a "VOWEL SIGN TEDUNG" (long vowel mark).
That is precisely the understanding that would be intended. I wouldn't refer to that as confusion.
> Are there reasons to keep
> these two sequences distinct?
Yes: they are not canonically equivalent.
Your entire argument here seems to me to be built on the assumption that the average user wants to represent their characters in multiple ways: as a single character here and as a character sequence there. I've yet to meet a user like that. This argument simply isn't valid.
What *would* have been a valid argument supporting a decomposition mapping would have been evidence that the text element in question is sometimes written as two discontiguous graphic elements resembling the combination of 1B0D and 1B35, but apparently that does not happen.
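For anyone who wants to check the distinction mechanically: canonical equivalence between two strings can be tested by comparing their normalized forms, and a character's decomposition mapping can be read from the character database. The following is only a minimal sketch using Python's standard unicodedata module. The helper names canonical_decomposition and canonically_equivalent are made up for illustration, and whatever the script reports for U+1B0E depends entirely on the Unicode Character Database version bundled with the interpreter, which may not match the set of decompositions discussed in this message.

```python
import unicodedata


def canonical_decomposition(ch):
    """Return the canonical decomposition mapping of one character as a
    list of 'U+XXXX' strings, or [] if it has no canonical mapping.
    (Hypothetical helper, named for illustration only.)"""
    fields = unicodedata.decomposition(ch).split()
    # Compatibility mappings begin with a tag such as <compat> or <noBreak>;
    # only untagged mappings are canonical.
    if not fields or fields[0].startswith("<"):
        return []
    return ["U+" + cp for cp in fields]


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent exactly when their NFD
    normalizations are identical. (Hypothetical helper.)"""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    atomic = "\u1B0E"          # BALINESE LETTER LA LENGA TEDUNG
    sequence = "\u1B0D\u1B35"  # LA LENGA followed by VOWEL SIGN TEDUNG

    # The output reflects the interpreter's bundled Unicode data, so no
    # expected result is asserted here.
    print("Unicode data version:", unicodedata.unidata_version)
    print("Decomposition of U+1B0E:", canonical_decomposition(atomic))
    print("Equivalent to <1B0D, 1B35>:", canonically_equivalent(atomic, sequence))
```

If the decomposition list comes back empty and the comparison is false, the atomic character and the two-character sequence are distinct for searching, sorting and comparison in that version of the data, which is the behaviour described above.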