From: Philippe Verdy (email@example.com)
Date: Thu Jul 10 2003 - 15:34:38 EDT
On Thursday, July 10, 2003 8:37 PM, Kenneth Whistler <firstname.lastname@example.org> wrote:
> Peter Kirk asked:
> > > In Turkish and Azeri the sequences f - i and f - dotless i both
> > > occur, and are fairly frequent. So it is inappropriate in these
> > > languages to use fi ligatures in which the dot on the i is lost
> > > or invisible, at least where the second character is a dotted i.
> > > Has any thought been given to this issue? Is it possible to block
> > > such ligation on a language-dependent basis?
> and Philippe Verdy responded with another question:
> > Isn't there a "Grapheme Disjoiner" format control character to
> > force the absence of a ligature like <fi>, i.e. <f, GDJ, i>?
> The answer to Philippe's rejoinder question is no, there is not
> a "Grapheme Disjoiner" format control character.
I did not refer to a specific Unicode character; I knew that there
is one already dedicated, but I did not want to comment about
There's no contradiction. The Grapheme Disjoiner, for you is
And I did not want to promote any change in any legally and
legacy encoded text, only to suggest ways to solve the
apparent rendering problem in Turkish, when the <f, i>
encoded character pair may be badly rendered. For the actual
rendering, selecting a <fi> ligature is not appropriate for
Turkish, and in fact the decomposed <f, i> sequence (the
compatibility decomposition of the <fi> codepoint) has no
linguistic ambiguity in Turkish.
So whatever the <fi> encoded codepoint designates, it is not
the <fi> ligature glyph but really two characters, whose ligation
may still be performed according to language context.
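
The relationship between the <fi> codepoint and the <f, i> pair can
be checked directly. A minimal sketch, assuming Python's standard
unicodedata module: it shows that U+FB01 carries a compatibility
mapping (tagged <compat>), so NFKD yields two characters while NFD
leaves the codepoint unchanged.

```python
import unicodedata

FI_LIGATURE = "\ufb01"  # U+FB01 LATIN SMALL LIGATURE FI

# The mapping is tagged <compat>: a compatibility mapping, not a canonical one.
mapping = unicodedata.decomposition(FI_LIGATURE)

# Canonical decomposition leaves the ligature codepoint untouched.
nfd = unicodedata.normalize("NFD", FI_LIGATURE)

# Compatibility decomposition yields the two plain characters f and i.
nfkd = unicodedata.normalize("NFKD", FI_LIGATURE)

print(mapping, [hex(ord(c)) for c in nfd], [hex(ord(c)) for c in nfkd])
```

Either way, what comes out of the decomposition is a pair of
characters, and whether they are drawn ligated remains a rendering
decision.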
A font that would automatically select a <fi> ligature to represent
a sequence of <f, i> codepoints, merely because the <fi>
codepoint is compatibility equivalent to it, is probably defective
and not conforming. Such selection of ligature must be put under the
control of the renderer with additional markup, which can in fact
select among three ligatures in Turkish: the <fi> ligature glyph
where the f is ligated with the dot above i (the normal ligature for
languages other than Turkish/Azeri), and the <f-dotted-i> and
<f-dotless-i> ligatures for Turkish/Azeri.
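
The language-dependent choice described above could be sketched as
follows. This is only an illustration: the glyph names, the
select_glyphs function and the language set are hypothetical, and a
real renderer would drive this from font tables and markup rather
than from a hard-coded rule.

```python
DOTTED_I = "i"
DOTLESS_I = "\u0131"  # U+0131 LATIN SMALL LETTER DOTLESS I

# Assumption: languages in which the dot on i is distinctive.
DOT_PRESERVING_LANGS = {"tr", "az"}


def select_glyphs(second, lang, ligate=True):
    """Return hypothetical glyph names for f followed by `second`."""
    if second not in (DOTTED_I, DOTLESS_I):
        raise ValueError("expected a dotted or dotless i")
    i_glyph = "i" if second == DOTTED_I else "dotlessi"
    if not ligate:
        # Ligation blocked by markup or by a format control character.
        return ["f", i_glyph]
    primary = lang.split("-")[0].lower()
    if primary in DOT_PRESERVING_LANGS:
        # Turkish/Azeri: ligatures that keep the two letters distinct.
        return ["f_i.dotted" if second == DOTTED_I else "f_dotlessi"]
    if second == DOTTED_I:
        # Other languages: the usual fi ligature, dot merged into the f.
        return ["fi"]
    # Assumption: no f + dotless i ligature is used outside Turkish/Azeri.
    return ["f", "dotlessi"]


print(select_glyphs("i", "en"), select_glyphs("i", "tr-TR"))
```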
Markup is necessary to select the appropriate glyph, or this
can be selected by using the "Grapheme Disjoiner" (ZWNJ)
or the "Grapheme Joiner" (ZWJ) in addition to the use of
an <i> or <dotless-i> codepoint possibly followed by the
<i-above> diacritic. All this enrichment of text is assumed
to be under the control of the markup added to the original
text, which does not need to specify whether ligatures should
or should not be used.
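
As an illustration of the format-control approach, here is a minimal
sketch that inserts ZWNJ (U+200C) between f and a following dotted or
dotless i. The function name block_fi_ligature is hypothetical, and
the sketch assumes the text has already been identified as Turkish or
Azeri by markup or other means.

```python
import re

ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER

# Match an f that is immediately followed by dotted i or dotless i (U+0131).
_F_BEFORE_I = re.compile("f(?=[i\u0131])")


def block_fi_ligature(text):
    """Insert ZWNJ after each f that directly precedes i or dotless i."""
    return _F_BEFORE_I.sub("f" + ZWNJ, text)


sample = block_fi_ligature("fil")
print([hex(ord(c)) for c in sample])
```

Applying the function twice leaves the text unchanged, because after
the first pass the f is followed by ZWNJ rather than by i.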
This archive was generated by hypermail 2.1.5 : Thu Jul 10 2003 - 16:14:12 EDT