Re: dotless j

From: Timothy Partridge (timpart@perdix.demon.co.uk)
Date: Tue Jul 06 1999 - 15:06:51 EDT


John Cowan recently said:

> Timothy Partridge wrote:
>
> > In England in the 11th and 12th centuries i was written without a dot. It
> > was common to write the i at the end of a word in a long form which looks
> > like a dotless j. (But it was still an i and did not represent a different
> > sound.)
>
> Pretty good evidence that all these are presentation forms, not
> distinct characters, I'd say. Dotless i is a *character* only
> in Turkish; dotless j (or long i) is not a character anywhere.

I omitted to say that my reference also says that long i was *sometimes* used
at the start of a word. I would argue that making it a presentation form is
problematic for the following reasons:

People transcribing historic documents onto a computer wish to record what the
scribe actually wrote. (Scanned images are fine for some purposes, but
linguistic analysis needs a text file.) Writers have their own quirks and a
fixed presentation form rule like "make a final dotless i a long one" will
impose a 'standard' form where one did not exist in the original document.
The convention of using dotless forms in numbers suggests that the writers
drew a distinction between the forms to aid readers. Dotless long i may not
be a character now in England but I would say that it used to be, and
Unicode should support it as a historic character. (Marion Gunn may also
have views on its being a current character.)

Adding new presentation rules to Unicode for an existing character like
dotless i will irritate the existing users, in this case the Turks, who
would no doubt be delighted to see their dotless i's mutating. The
alternative, a second dotless i character which does change form, seems
worse than a dotless long i. Do you know why the long s is a
separate character in Unicode? Is it in some older standard (as opposed to
integral sign which has similar roots)?
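
For anyone who wants to check how the characters named above are recorded in
the character database, here is a minimal sketch using Python's standard
unicodedata module. The CHARS table and the describe() helper are names made
up for this illustration; only the unicodedata calls are real. It shows what
the database lists, not why the long s was encoded separately.

```python
import unicodedata

# Characters mentioned in this thread (labels are informal, chosen for this sketch).
CHARS = {
    "dotless i": "\u0131",
    "long s": "\u017f",
    "ij ligature": "\u0133",
    "integral sign": "\u222b",
}


def describe(ch):
    """Return (code point, character name, decomposition mapping) for one character.

    The decomposition is an empty string when the database lists none.
    """
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decomposition(ch))


for label, ch in CHARS.items():
    print(label, describe(ch))
```

In the database the long s carries a compatibility decomposition to an
ordinary s, while dotless i and the integral sign have no decomposition at
all, so as far as the data goes they are treated as characters in their own
right.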

> > If you are feeling really keen you can add a compatibility decomposition for
> > U+0079 of <ligature> dotless i, dotless long i and a comment not equal
> > U+0133 LATIN SMALL LIGATURE IJ. Can anyone think of an English word with the
> > sequence ij in the same syllable? I think they have (almost?) all become y's
> > especially the ii sound at the ends of words.
>
> That sounds like Dutch, not English. Do you have evidence that any
> real English word was ever written with either ii or ij, excluding
> the Roman numerals?

I was being a bit tongue in cheek. I can't think of an English word, but
in the period in question Latin (and French) was in common use, and Latin
words in -ius become -ii in the plural. If you have a copy of Cappelli's
"Dizionario di Abbreviature latine ed italiane" there is an example on page 2
of alii where there are two i's in superscript over an a and the second i is
long (it looks rather like a superscript y but without a curl in its tail).

> > If this character does go into the standard I think that dotless long i is a
> > better name than dotless j, since it wasn't used in the same way as a modern
> > j.
>
> Historically, "j" *is* nothing but a long "i", iust as "w" is historically
> nothing but "uu" as uuell. The *glyphs* "j" and "w" were around long
> before they became separate *characters*.

I agree that j was originally a long i, and my naming suggestion was
intended to prevent misuse. When does a glyph become a character? I would
say if the users are making a distinction in meaning when they use it.
(Which is why I felt the numerical use was significant. Otherwise my immediate
reaction would be "use a suitable font".)

   Tim

-- 

Tim Partridge.

Any opinions expressed are mine only and not those of my employer


