At 1999-10-21 20:50, G. Adam Stanislav wrote:
>Yes, it is possible to encode the CH as the C followed by the H, and the N
>caron by the N followed by some connection code followed by a caron.
Actually, N-caron can be encoded as a codepoint for N followed by a codepoint
for 'combining caron'. These 'combining codepoints' modify the character
suggested by the previous codepoint, more or less.
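
To make that concrete, here is a minimal sketch in Python (the language and the variable names are only my choice for illustration; it assumes the standard `unicodedata` module). It builds N-caron from the two codepoints described above and prints what each one is.

```python
import unicodedata

# Base letter N followed by U+030C COMBINING CARON.
n_caron = "N\u030C"

# List each codepoint in the sequence and its Unicode character name.
codepoints = [f"U+{ord(c):04X}" for c in n_caron]
names = [unicodedata.name(c) for c in n_caron]

print(codepoints)  # ['U+004E', 'U+030C']
print(names)       # ['LATIN CAPITAL LETTER N', 'COMBINING CARON']

# A nonzero combining class marks the second codepoint as a combining mark.
caron_class = unicodedata.combining(n_caron[1])
print(caron_class != 0)  # True
```

The string has length two as far as the codepoint count goes, even though a reader sees a single letter.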
>And it
>is perfectly possible for software to handle it. But that would not be
>CHARACTER encoding.
Unicode is intended to encode text as character-streams, rather than
glyphs, but it certainly does not in general encode one character per
codepoint.
N-caron is a character.
Unicode encodes characters.
Unicode can encode N-caron as a sequence of two codepoints, U+004E U+030C
(a precomposed form, U+0147, also exists).
Now some characters, such as 'M', can be encoded using one codepoint.
Some, such as 'à' (a-grave), can be encoded in several ways.
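
As a rough sketch of the 'several ways' point (again in Python with the standard `unicodedata` module; the variable names are hypothetical), a-grave can be written as one precomposed codepoint or as a base letter plus a combining mark. The two strings differ codepoint by codepoint, but normalization maps one onto the other.

```python
import unicodedata

precomposed = "\u00E0"   # LATIN SMALL LETTER A WITH GRAVE, one codepoint
decomposed = "a\u0300"   # 'a' followed by COMBINING GRAVE ACCENT, two codepoints

# Compared codepoint by codepoint, the two encodings are different strings.
different_codepoints = precomposed != decomposed

# NFC composes the sequence; NFD decomposes the precomposed form.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_after_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

print(different_codepoints, same_after_nfc, same_after_nfd)  # True True True
```

Software that compares text therefore usually normalizes both sides first, which is one reason 'one character' and 'one codepoint' cannot be treated as the same thing.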
>Unicode clearly states its goal to be the encoding of
>characters of all languages, existing and defunct.
Correct.
>CH is a character in Slovak.
Correct. Unicode encodes that Slovak character as U+0043 U+0048.
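
For completeness, a small sketch of the same idea for CH (Python, with `slovak_ch` as a purely illustrative name): the Slovak letter is just the two ordinary codepoints, and NFC normalization leaves the sequence unchanged.

```python
import unicodedata

slovak_ch = "CH"

# The Slovak letter CH as a sequence of two codepoints.
ch_codepoints = [f"U+{ord(c):04X}" for c in slovak_ch]
print(ch_codepoints)  # ['U+0043', 'U+0048']

# Normalizing to NFC does not change the two-codepoint sequence.
unchanged_by_nfc = unicodedata.normalize("NFC", slovak_ch) == slovak_ch
print(unchanged_by_nfc)  # True
```

Treating the pair as one letter, for sorting for instance, is left to language-aware software working on top of the codepoints.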
-- Ashley Yakeley, Seattle WA