Ligatures fi and ffi

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Wed Jun 01 2005 - 06:17:36 CDT

Next message: Dominikus Scherkl: "AW: Ligatures fi and ffi"

Previous message: Michael Everson: "Re: browser encoding settings"
Next in thread: Dominikus Scherkl: "AW: Ligatures fi and ffi"
Reply: Dominikus Scherkl: "AW: Ligatures fi and ffi"
Maybe reply: James Kass: "Re: Ligatures fi and ffi"
Reply: Philippe Verdy: "Re: Ligatures fi and ffi"

(I took the liberty of changing the Subject, since this isn't really about
"Glagolitic in Unicode 4.1" any more.)

On Tue, 31 May 2005, Philippe Verdy wrote:

> From: "Страхиња Радић" <vilinkamen@mail.ru>
> > By using this kind of reasoning, we would end up asking why the heck
> > was ``fi'' or ``ffi'' encoded when these two can be expressed with their
> > corresponding atoms
>
> Today, they would not be encoded.

I think they would be encoded even today, due to their presence in other
character codes. But they would not be encoded, and would not have been
encoded, without such background.

> - - ligature processing is a required feature to support
> even legacy ISO 8859 charsets like Arabic, or Indian standard charsets
> (ISCII).

Pardon? In which sense is ligature processing _required_? Do you mean that
it is forbidden now to render "f" followed by "i" as two letters, without
using a ligature? I don't see how an application would even be required to
be _capable_ of using a ligature.

> Unicode however cannot remove those characters.

That's certainly true, due to the policy of never removing any characters.

> They remain there for
> compatibility, they are not recommended,

Is there any explicit statement in the Unicode standard that says that the
ligatures should not be used?

> they are considered compatibility
> characters with canonical decompositions,

No, characters like U+FB01 LATIN SMALL LIGATURE FI have _compatibility_
decompositions. This means that replacing a ligature with the
decomposition may remove formatting information - as it surely does.
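
The kind of decomposition can be checked directly in the Unicode Character
Database. A minimal sketch, assuming Python's standard unicodedata module;
the helper name decomposition_kind is made up for illustration. A mapping
that starts with a tag in angle brackets, such as <compat>, is a
compatibility decomposition, and an untagged mapping is a canonical one.

```python
import unicodedata


def decomposition_kind(ch: str) -> str:
    """Classify a single character's decomposition mapping.

    Hypothetical helper: returns 'compatibility', 'canonical', or 'none'.
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    # Compatibility mappings carry a leading tag such as <compat> or <font>.
    return "compatibility" if mapping.startswith("<") else "canonical"


# U+FB01 and U+FB03 are the fi and ffi ligatures; U+00E9 (e with acute)
# is included only as a contrasting example of a canonical decomposition.
for ch in ("\ufb01", "\ufb03", "\u00e9"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"{unicodedata.decomposition(ch)!r} -> {decomposition_kind(ch)}")
```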

> and not part of normalized forms,

They are part of normalization forms C and D, which involve canonical
decomposition but not compatibility decomposition.
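
This is easy to verify. A minimal sketch, again assuming Python's standard
unicodedata module (the names FI_LIGATURE and normalize_all are
illustrative only): forms C and D leave U+FB01 untouched, while the
compatibility forms KC and KD replace it with the two letters "f" and "i".

```python
import unicodedata

FI_LIGATURE = "\ufb01"  # LATIN SMALL LIGATURE FI


def normalize_all(text: str) -> dict:
    """Return the text in each of the four Unicode normalization forms."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return {form: unicodedata.normalize(form, text) for form in forms}


results = normalize_all(FI_LIGATURE)
for form, value in results.items():
    # Print code points rather than glyphs so the difference is visible
    # regardless of how the font renders "fi".
    print(form, " ".join(f"U+{ord(c):04X}" for c in value))
```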

> because their plain-text semantic is strictly equal to the semantic of their
> decomposition in any human languages that use them.

There is a difference in meaning as regards rendering: U+FB01 very
clearly says it's a ligature, whereas "fi" may or may not be rendered as a
ligature.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

This archive was generated by hypermail 2.1.5 : Wed Jun 01 2005 - 06:21:14 CDT