RE: dotless j

From: Michael Everson (everson@indigo.ie)
Date: Mon Jul 05 1999 - 07:34:07 EDT


At 13:16 -0700 1999-07-04, Paul Dempsey (Exchange) wrote:
>My, my. The assumptions of 1-1 character-glyph mapping run deep! :-)
>
>Nobody has shown evidence of the independent, isolated character dotless j.
>If there is evidence of the character, then there is justification for
>encoding.

That's not logical. We have LATIN SMALL LETTER I which is defined as a
character that can stand alone or can take diacritics. We have LATIN SMALL
LETTER DOTLESS I which is defined as a character that can stand alone.

We could just as easily say that LATIN SMALL LETTER J is defined as a
character that can stand alone, and that LATIN SMALL LETTER DOTLESS J is
the character to which U+0135 and U+01F0 and Pullum & Ladusaw's HOOKTOP J
map. Possibly Ken would say that such a decompositional change (for the
first two) would be too expensive.
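
For reference, the existing canonical decompositions of the two encoded
characters can be inspected with Python's standard unicodedata module. This
is only a sketch; the helper name canonical_decomposition is made up for
illustration. Both characters currently decompose to U+006A plus a combining
mark, which is the mapping under discussion.

```python
import unicodedata

def canonical_decomposition(ch):
    """Return the canonical decomposition of ch as a list of code points."""
    fields = unicodedata.decomposition(ch).split()
    # Compatibility mappings carry a tag such as "<super>"; skip those.
    if not fields or fields[0].startswith("<"):
        return []
    return [int(f, 16) for f in fields]

for ch in ("\u0135", "\u01F0"):
    parts = canonical_decomposition(ch)
    print(f"U+{ord(ch):04X} ->", " ".join(f"U+{p:04X}" for p in parts))
# prints:
# U+0135 -> U+006A U+0302
# U+01F0 -> U+006A U+030C
```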

We would also then have a proper mapping for IPA fonts. In SIL's IPA fonts
for the Mac, for instance, U+006A is found at x6A, U+02B2 is found at x4A,
and DOTLESS J is found at xBE. I am sure that a review of other

>When a rendering engine sees j followed by a combining mark, it either
>finds a precomposed glyph for that combination, or it takes the j-base glyph
>(aka dotless j) and adds the combining mark.

Neither of which, of course, is present in your font.....
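
The two-step lookup described in the quoted text can be sketched roughly as
follows. Everything here is an assumption made for illustration: the font is
modelled as a plain set of glyph names, "dotlessi" and "dotlessj" are
hypothetical names for unencoded base glyphs, and select_glyphs is not any
real rendering API. The sketch returns None when the font has neither a
precomposed glyph nor the pieces to compose one.

```python
import unicodedata

# Hypothetical glyph names for the undotted shapes of i and j.
DOTLESS_BASE = {"i": "dotlessi", "j": "dotlessj"}

def select_glyphs(base, mark, font_glyphs):
    """Choose glyphs for base + combining mark from a set of glyph names."""
    # Step 1: prefer a precomposed glyph if the font has one.
    composed = unicodedata.normalize("NFC", base + mark)
    if len(composed) == 1 and composed in font_glyphs:
        return [composed]
    # Step 2: compose. Marks placed above (combining class 230) replace
    # the dot, so the undotted base glyph is used for i and j.
    base_glyph = base
    if unicodedata.combining(mark) == 230:
        base_glyph = DOTLESS_BASE.get(base, base)
    if base_glyph not in font_glyphs or mark not in font_glyphs:
        return None  # the font has neither option
    return [base_glyph, mark]

font = {"j", "\u0135", "dotlessj", "\u030C"}
print(select_glyphs("j", "\u0302", font))   # precomposed glyph found
print(select_glyphs("j", "\u030C", font))   # composed from dotless base
print(select_glyphs("j", "\u030C", {"j"}))  # nothing usable in the font
```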

>A font could even have no glyph
>at all for dotted-j, and always create it by composition (same for i). There
>is no necessity for the rendering process to remove parts of a glyph. There
>is no necessity for Unicode to provide codepoints for glyph fragments that
>otherwise have no independent existence.

I fail to see the advantage to mapping U+0135 and U+01F0 and HOOKTOP J to
U+006A when mapping them to DOTLESS J would be more practical, and would
map more neatly to existing (legacy) IPA fonts. These phonetic characters
have their own semantics; the semantics of the underlying "J-ness" is
secondary.

>A similar process must be applied to render other scripts correctly: a given
>combining mark changes the shape of the base character (adding or removing
>parts, changing its height, or whatever). J is not unique in this regard,
>and deserves no more special treatment than characters in any other script.

No. DOTLESS J is a marginal, but important part of the Latin writing system
as used in phonetics and transliterations. So it is "left out". The kinds
of glyph transformations required for Nagari or Arabic are canonical,
well-known, and universally implemented.

I note that in L2/99-159, a request from STIX Project of the STIPUB
Consortium (a consortium of scientific societies and scientific/technical
publishers), the character SMALL LETTER J, NO DOT is requested at position
6x11.

--
Michael Everson * Everson Gunn Teoranta * http://www.indigo.ie/egt
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Guthán: +353 1 478 2597 ** Facsa: +353 1 478 2597 (by arrangement)
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire


