From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Aug 09 2003 - 19:30:23 EDT
On Sunday, August 10, 2003 12:32 AM, John Cowan <cowan@mercury.ccil.org> wrote:
> Peter Kirk scripsit:
>
> > This is a clear demonstration that Microsoft also has problems with
> > the mechanism which has been defined in the standard for ten years,
>
> This is a clear demonstration that Uniscribe fails to implement a
> standard correctly, a property unique neither to Microsoft nor to the
> Unicode Standard.
Except that in this case, we are not speaking about something that has
already been standardized, but only about something used as a legacy
means to achieve some results, with more or less success. Whatever you
think, SPACE+diacritic is still a hack, and certainly not a canonical
equivalent (including in its properties) of the existing spacing
diacritics, which themselves do not fit all usages because they are symbols.
The fact that these spacing diacritics have compatibility decompositions
is only there to match those legacy uses; it is not a solution.
It merely resembles the way many keyboard drivers let users enter
those spacing diacritics. But input methods and keyboard drivers prove
nothing as far as Unicode is concerned, since the keyboard driver will
still return only a "combined" spacing diacritic, not the sequence
SPACE+diacritic (whose real usage in text seems to occur only in
old texts, where non-spacing combining diacritics were not
encodable or renderable, or, more rarely, where the text discusses
the individual diacritics themselves).
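To make the distinction concrete, here is a minimal sketch using Python's
standard unicodedata module. The list SPACING_DIACRITICS and the helper
describe() are names invented for this illustration. It only shows what the
character database records for a few ISO-8859-1 spacing diacritics: a
<compat> decomposition to SPACE plus a combining mark, which canonical
normalization (NFD) leaves alone and only compatibility normalization (NFKD)
applies.

```python
import unicodedata

# A few spacing diacritics inherited from ISO-8859-1 (illustrative selection):
# DIAERESIS, MACRON, ACUTE ACCENT, CEDILLA.
SPACING_DIACRITICS = ["\u00A8", "\u00AF", "\u00B4", "\u00B8"]


def describe(ch):
    """Collect the properties relevant to the equivalence question."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        # A "<compat>" prefix marks a compatibility (non-canonical) mapping.
        "decomposition": unicodedata.decomposition(ch),
        "nfd": unicodedata.normalize("NFD", ch),    # canonical only
        "nfkd": unicodedata.normalize("NFKD", ch),  # canonical + compatibility
    }


if __name__ == "__main__":
    for ch in SPACING_DIACRITICS:
        info = describe(ch)
        nfkd_points = " ".join(f"U+{ord(c):04X}" for c in info["nfkd"])
        print(f"U+{ord(ch):04X} {info['name']}")
        print(f"  category:      {info['category']}")
        print(f"  decomposition: {info['decomposition']}")
        print(f"  NFD unchanged: {info['nfd'] == ch}")
        print(f"  NFKD:          {nfkd_points}")
```

The general categories differ as well: the spacing diacritic is a modifier
symbol, while the decomposed sequence is a space separator followed by a
nonspacing mark, which is the difference in properties referred to above.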
Maybe I'm wrong in this assertion, but this is my feeling and experience
with these characters, which were merely symbols or hacks used to
represent non-English text approximately with a restricted ASCII
alphabet. (The inclusion of other spacing diacritics in the high range
of the 8-bit ISO-8859-1 encoding seemed very strange to me, as if they
were there only to allow approximating the missing precomposed characters
that could not fit in the table; but they produced poor results, so most
texts were never encoded with this charset but with other, more
appropriate charsets when needed.) *
* [OT] It was a shame that ISO, when adapting the DEC VT charset to
create ISO-8859-1, forgot important characters needed for the
languages that this charset was supposed to cover (like the French
oe and OE ligatures, and a few characters missing for Baltic languages,
Icelandic, and Catalan). ISO-8859-15 is certainly better now than ISO-8859-1
for the same languages, and for even more than initially intended; in
practice it was Microsoft that filled the gap, with Windows-1252, by
dropping the unnecessary C1 controls (forgetting the legacy roundtrip
compatibility of controls with the dying EBCDIC).
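As a rough illustration of the footnote, the following sketch uses Python's
built-in codecs to check where the French ligatures can be encoded. The
helper can_encode() is a name invented for this example, and the codec
labels are the ones Python accepts, not necessarily the registered charset
names.

```python
import unicodedata


def can_encode(text, codec):
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


LIGATURES = "\u0152\u0153"  # OE and oe ligatures

if __name__ == "__main__":
    for codec in ("iso-8859-1", "iso-8859-15", "cp1252"):
        print(codec, can_encode(LIGATURES, codec))

    # Windows-1252 places the ligatures in the 0x80-0x9F range, which
    # ISO-8859-1 reserves for the C1 control characters.
    for byte in LIGATURES.encode("cp1252"):
        as_latin1 = bytes([byte]).decode("iso-8859-1")
        print(f"0x{byte:02X} read as ISO-8859-1:",
              unicodedata.category(as_latin1))
```

The last loop shows why the two charsets conflict: the same byte is a
letter under Windows-1252 but a control character under ISO-8859-1.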
This archive was generated by hypermail 2.1.5 : Sat Aug 09 2003 - 20:02:48 EDT