Re: orthographic characters for glottal stop

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Sep 07 1999 - 13:11:39 EDT


Peter continued:

> PC> What about for those cases where the glottal is written using
> >> the same shape as (European) digit 7? [...] I haven't found
> >> anything in the standard that fits (and I think it's an option
> >> to say, "change all your literature to use a true glottal stop
> >> glyph").
>
> JC>I suppose you mean "I don't think it's an option"?
>
> Yes, of course.

I think you should consider this further. First of all, if you are moving
to Unicode for these, any existing data is going to have to be
converted. Second, the use of 7 was an ugly hack in the first place,
and it isn't doing anyone any favors to perpetuate it or the other
digit hacks for letters from typewriter days. An orthography that
mixes the digits with the letters is just going to continue to run
into algorithmic problems as it moves into computerized form.
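
As a rough illustration of the kind of algorithmic problem meant here,
the following Python sketch compares how a digit hack and a real glottal
stop letter are classified. The word "ta7a" is made up for the example,
and the helper names is_letter and words are hypothetical; the letter-run
regular expression stands in for whatever naive tokenizer, sorter, or
spell checker might be applied to the text.

```python
import re
import unicodedata

GLOTTAL_STOP = "\u0294"  # LATIN LETTER GLOTTAL STOP

def is_letter(ch):
    """True if the character's Unicode general category is a letter (L*)."""
    return unicodedata.category(ch).startswith("L")

def words(text):
    """Split text into runs of letters, as a naive tokenizer might."""
    return re.findall(r"[^\W\d_]+", text)

# "ta7a" is a made-up word, used only for illustration.
hacked = "ta7a"
proper = "ta" + GLOTTAL_STOP + "a"

print(is_letter("7"), is_letter("?"), is_letter(GLOTTAL_STOP))
print(words(hacked))   # the digit splits the word in two
print(words(proper))   # the real letter keeps the word whole
```

Anything that relies on character properties (word breaking, searching,
sorting, identifier rules) will treat the "7" as a number and not as part
of the word.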

This doesn't mean going back and burning all the material that was
printed with the "7", but introduction of a proper glottal stop shouldn't
be that difficult, if the benefits are explained. (It can even be
made to look "sevenish" in your fonts, if users insist.)
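
Since existing data has to be converted anyway, the conversion step can
pick up the digit hack at the same time. The sketch below is only a
heuristic, under the assumption that a "7" touching a letter on either
side is a glottal stop and that a free-standing "7" is a number; real
data (for instance "7th" in embedded English text) would need a word
list or manual review. The function name convert_sevens and the sample
words are hypothetical.

```python
import re

GLOTTAL_STOP = "\u0294"  # LATIN LETTER GLOTTAL STOP

# Assumption: a 7 directly preceded or followed by a letter is a
# glottal stop; a 7 standing alone or among digits is a real number.
_SEVEN_AS_LETTER = re.compile(r"(?<=[^\W\d_])7|7(?=[^\W\d_])")

def convert_sevens(text):
    """Replace letter-adjacent 7s with U+0294, leaving other 7s alone."""
    return _SEVEN_AS_LETTER.sub(GLOTTAL_STOP, text)

# Made-up sample: two words using the hack, plus a genuine numeral.
print(convert_sevens("ta7a 7 7ata"))
```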

>
> PC>> Do we need to add a character LETTER GLOTTAL 7 with
> PC>> category Lo and bidi property L?
>
> JC>I think that this 7 is just a glyphic variant (caused by restricted
> >fonts) of LATIN LETTER GLOTTAL STOP, just as 3 is a variant of
> >LATIN LETTER YOGH and 8 is a variant of LATIN LETTER OU (in Unicode
> >3.0). A suitable typewriter font might render GLOTTAL STOP with the
> >same glyph as 7, but internally it should use the correct character.
>
> That seems reasonably obvious, and I don't know why I didn't
> think of that. However, there's still a question here: if the 7
> used for glottal stop is just a glyph variant of U+0294, then
> why wouldn't we also consider the same to be true in the case
> of the right single curly quotation mark glyph used for glottal
> stop? I.e. why is it that, on the one hand, we encode the
> orthographic character that looks like ' and that typically
> represents the phoneme glottal stop as a separately encoded
> character with its own code point, U+02BC; but, on the other
> hand, we treat the orthographic character that looks like 7 and
> which typically represents the same phoneme as a glyph variant
> of the IPA symbol for the phone glottal stop? There seems to be
> some inconsistency here.

The reason for having both U+0294 and U+02BC is that there are orthographies
where either can be the correct form for the glottal stop. U+02BC is
also intended to represent a spacing "glottalization" mark.
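
For comparison, here is a small Python sketch that looks up the name and
general category of the two letters alongside the quotation mark and
ASCII apostrophe they resemble. The helper name describe is hypothetical,
and the values come from whatever version of the Unicode Character
Database ships with the Python in use.

```python
import unicodedata

def describe(ch):
    """Return (general category, character name) for one character."""
    return unicodedata.category(ch), unicodedata.name(ch)

# U+0294 and U+02BC are letters; U+2019 and U+0027 are punctuation.
for ch in ("\u0294", "\u02BC", "\u2019", "'"):
    category, name = describe(ch)
    print(f"U+{ord(ch):04X}  {category}  {name}")
```

U+02BC and U+2019 can share a glyph, but only the first carries letter
properties.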

The difference with the "7" is that it was always known to be just
a workaround for keyboards that had no glottal stop. The same problem
arises for orthographies that used "?" for a glottal stop. Neither should
be perpetuated into the future, because of the problems such overloading
will cause.

--Ken

>
> Peter
>
>
>


