U+015F (ş) vs. U+0219 (ș)
From: Eric Muller (emuller@adobe.com)
Date: Fri Feb 22 2002 - 13:30:50 EST
U+015F LATIN SMALL LETTER S WITH CEDILLA (ş) has the following annotations:
- Turkish, Azerbaijani, Romanian, ...
- this character is used in both Turkish and Romanian data
- a glyph variant with comma below is preferred for Romanian
and there is a cross-reference to U+0219 LATIN SMALL LETTER S WITH COMMA
BELOW (ș), which has the following annotation:
- Romanian, when distinct comma below form is required
These characters have the expected canonical decompositions, with COMBINING
CEDILLA (U+0327) and COMBINING COMMA BELOW (U+0326) respectively, so they are
entirely distinct characters as far as Unicode is concerned. However, the last
annotation on U+015F suggests they are the same. Which is correct?
- Is a glyph with a comma below a correct representation of U+015F, as
the annotation suggests? Of course, a font with such a glyph would not be usable
for languages other than Romanian.
- Should the annotations be interpreted (and perhaps changed) to say something
like: "U+015F is not used in Romanian, you are probably looking for U+0219;
however, data encoded prior to Unicode 3.0 may have incorrectly used U+015F
instead of U+0073 U+0326"?
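
For reference, the canonical mappings mentioned above can be checked with a
short script. This is only a sketch: it assumes Python 3 and its standard
unicodedata module, and the helper name canonical_parts is made up for
illustration. It shows that the two characters decompose to different
combining marks and never compare equal under normalization; it says nothing
about which glyph a font should draw.

```python
import unicodedata

def canonical_parts(ch):
    """Return the code points of the canonical (NFD) decomposition of ch.

    Hypothetical helper, named here only for illustration.
    """
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]

S_CEDILLA = "\u015F"  # LATIN SMALL LETTER S WITH CEDILLA
S_COMMA = "\u0219"    # LATIN SMALL LETTER S WITH COMMA BELOW

for ch in (S_CEDILLA, S_COMMA):
    # Print the character name followed by its decomposition.
    print(unicodedata.name(ch), canonical_parts(ch))

# The decompositions differ in the combining mark, so NFC and NFD both
# keep the two characters distinct.
distinct = all(
    unicodedata.normalize(form, S_CEDILLA) != unicodedata.normalize(form, S_COMMA)
    for form in ("NFC", "NFD")
)
print("distinct under normalization:", distinct)
```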
Thanks,
Eric.