From: Jukka K. Korpela (firstname.lastname@example.org)
Date: Wed Jun 01 2005 - 06:17:36 CDT
(I took the liberty of changing the Subject, since this isn't really about
"Glagolitic in Unicode 4.1" any more.)
On Tue, 31 May 2005, Philippe Verdy wrote:
> From: "Страхиња Радић" <email@example.com>
> > By using this kind of reasoning, we would end up asking why the heck
> > was ``fi'' or ``ffi'' encoded when these two can be expressed with their
> > corresponding atoms
> Today, they would not be encoded.
I think they would be encoded even today, due to their presence in other
character codes. But they would not be encoded, and would not have been
encoded, without such background.
> - ligature processing is a required feature to support
> even legacy ISO 8859 charsets like Arabic, or Indian standard charsets
Pardon? In which sense is ligature processing _required_? Do you mean that
it is forbidden now to render "f" followed by "i" as two letters, without
using a ligature? I don't see how an application would even be required to
be _capable_ of using a ligature.
> Unicode however cannot remove those characters.
That's certainly true, due to the policy of never removing any characters.
> They remain there for
> compatibility, they are not recommended,
Is there any explicit statement in the Unicode standard that says that the
ligatures should not be used?
> they are considered compatibility
> characters with canonical decompositions,
No, characters like U+FB01 LATIN SMALL LIGATURE FI have _compatibility_
decompositions. This means that replacing a ligature with the
decomposition may remove formatting information - as it surely does.
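The decomposition type can be checked against the Unicode Character Database. The following is only a minimal sketch, assuming Python's standard unicodedata module; the helper name decomposition_kind is made up for this illustration. A compatibility mapping carries a tag in angle brackets, whereas a canonical mapping has no tag.

```python
import unicodedata


def decomposition_kind(ch):
    """Classify the decomposition mapping of a single character.

    Returns "none" if the character has no decomposition, "canonical"
    if the mapping is untagged, and otherwise the compatibility tag
    without its angle brackets (for example "compat").
    decomposition_kind is a hypothetical helper, not a library function.
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    first = mapping.split()[0]
    if first.startswith("<"):
        # Tagged mappings such as "<compat> 0066 0069" are compatibility ones.
        return first.strip("<>")
    return "canonical"


print(decomposition_kind("\ufb01"))  # LATIN SMALL LIGATURE FI
print(decomposition_kind("\u00e9"))  # LATIN SMALL LETTER E WITH ACUTE
print(decomposition_kind("a"))       # plain ASCII letter
```

For U+FB01 the raw mapping is "<compat> 0066 0069", so the helper reports a compatibility decomposition, while U+00E9 has an untagged (canonical) mapping.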
> and not part of normalized forms,
They are part of normalization forms C and D, which involve canonical
decomposition but not compatibility decomposition.
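To illustrate, here is a small sketch, again assuming Python's standard unicodedata module (the names FI and normalize_all are invented for the example). It applies all four normalization forms to U+FB01 and lists the resulting code points.

```python
import unicodedata

FI = "\ufb01"  # LATIN SMALL LIGATURE FI


def normalize_all(text):
    """Return the text in each of the four Unicode normalization forms."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


forms = normalize_all(FI)
for form, result in forms.items():
    # Show code points rather than glyphs, since fonts may hide the difference.
    print(form, " ".join(f"U+{ord(c):04X}" for c in result))
```

NFC and NFD leave U+FB01 unchanged, since they apply canonical mappings only. NFKC and NFKD replace it with U+0066 U+0069, which is where the ligature information is lost.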
> because their plain-text semantic is strictly equal to the semantic of their
> decomposition in any human languages that use them.
There is a difference in meaning as regards rendering: U+FB01 very
clearly says it's a ligature, whereas "fi" may or may not be rendered as a
ligature.
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/