Re: Normalization Form KC for Linux

From: peter_constable@sil.org
Date: Sun Aug 29 1999 - 16:23:57 EDT


>Well the Swedish alphabet includes ä but not . The last is an
       "a with an accent above", while ä is a letter in itself. Just
       because the glyph looks like an "a with two dots above", it is
       not. It is wrong to decompose a character just because its
       glyph looks as if it were composed of an accent and another
       character.

       While I can appreciate that a Swede wants to experience certain
       behaviour for a-umlaut, or any other character (and I agree
       that it should be this way), there is a fallacy in the argument
       here: there is an invalid assumption that the behaviour
       experienced by a user is necessarily determined by the way data
       is encoded. Certainly the encoding of the data *may* impose
       limitations on what behaviour it is possible to provide to the
       user, but in the case of whether to encode a-umlaut as a single
       character code or as a sequence of two character codes, no
       limitations of consequence are involved. With either approach
       to encoding, it remains for implementers to provide the correct
       behaviour with regard to presentation, input and editing, case
       mappings, searching, sorting, etc. With either approach to
       encoding, there is no reason at all why an implementer can't
       provide the kind of user experience that a Swede would expect.
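
       A minimal sketch of this point, assuming Python's standard
       unicodedata module. The helper names same_text and swedish_key,
       and the simplified 29-letter alphabet string, are illustrative
       assumptions and not a full Swedish collation. It shows that the
       precomposed and decomposed encodings of a-umlaut can be made to
       compare and sort identically, so the behaviour comes from the
       implementation and not from the encoding:

```python
import unicodedata

# Two encodings of the same letter (illustrative values).
precomposed = "\u00e4"   # a-umlaut as a single character code
decomposed = "a\u0308"   # "a" followed by COMBINING DIAERESIS


def same_text(x, y):
    """Compare two strings after bringing both to the same normalization form."""
    return unicodedata.normalize("NFC", x) == unicodedata.normalize("NFC", y)


def swedish_key(word):
    """Hypothetical, simplified sort key: a-ring, a-umlaut, o-umlaut sort after z.

    Characters outside the alphabet string all receive the same trailing rank.
    """
    alphabet = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"
    nfc = unicodedata.normalize("NFC", word.lower())
    return [alphabet.index(c) if c in alphabet else len(alphabet) for c in nfc]


# Mixed input: one word uses the decomposed form, one the precomposed form.
words = ["zebra", "a\u0308pple", "\u00e4rta", "apa"]
sorted_words = sorted(words, key=swedish_key)
```

       The raw strings differ code for code, yet same_text treats them
       as equal, and both words beginning with a-umlaut sort after
       "zebra" regardless of which encoding the data happened to use.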

       Again, this situation is no different from, e.g., Spanish ch.

       Peter



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:51 EDT