Re: phonetic superscripts, etc. (was Re: Superscript asteris

From: Peter_Constable@sil.org
Date: Sat Jul 03 1999 - 13:03:22 EDT


>Well, I was thinking of a different context, that of including IPA within a
>document along with other writing systems. There is nothing preventing you from
>defining your own character usages, including treating a single Unicode
>character in different formats as different characters within your domain. We
>mathematicians do it all the time. One fairly common example is using plain and
>bold letters for different but related objects such as vectors and tensors.
>Again I say, (IME ~= file) ^ (file ~= output), and not only that, but character
>encoding is not character semantics.

I've been preaching (IM ~= file) ^ (file ~= output) for some time, so I'm all
with you there. If all that's involved here is entry and display, then it's not
hard for me to build a font with as many presentation forms as I need, PUA
allocations as needed, and an appropriate IM. I can even apply multiple fonts or
other formatting if needed. The point is, that's not all that is needed. In the
following model of text processing:

      -------       -----------
     | INPUT |     | RENDERING |
      -------       -----------
            \       /
            ----------
           | ENCODING |
            ----------
            /       \
 ------------       ----------
| CONVERSION |     | ANALYSIS |
 ------------       ----------

you've mentioned the top half. The part that concerns me most is analysis.

>Anyway, one of the proposed advantages of an XML scheme is that it can be made
>as general as the subject matter allows and requires. You could set it up to
>handle all of the variations of superscripts, small caps, and much more, once.
>Then you could create hundreds, nay, thousands and myriads of new combinations
>without further ado, and without having to come back to the Unicode and ISO
>committees each time for the protracted process of registration.

That's true. However, it's not just the analysis I want to do on the text that
concerns me. It's the hundreds of other linguists I work with who aren't as
computer savvy. They want to be able to do all kinds of things with their IPA
text in a way in which semantics, not entry or appearance, is the whole point.
They're used to tools that work on character strings. Teaching them to parse XML
is forcing them to bend to fit limited technology rather than developing
technology to meet their needs.
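
To make the contrast concrete, here is a minimal sketch in Python. It assumes
aspiration is written either with the encoded character U+02B0 (MODIFIER LETTER
SMALL H) or with a hypothetical <sup> element inside a hypothetical <w> element;
the sample word and both function names are invented for illustration and do
not come from any particular tool.

```python
import re
import xml.etree.ElementTree as ET

# Aspirated "p" written with an encoded character: p + U+02B0.
plain = "p\u02b0at pat"

# The same text with aspiration expressed through hypothetical markup.
marked_up = "<w>p<sup>h</sup>at pat</w>"


def count_aspirated_plain(text):
    """Count aspirated stops with an ordinary character-string search."""
    return len(re.findall("[ptk]\u02b0", text))


def count_aspirated_markup(xml_text):
    """Count aspirated stops when aspiration lives in element structure."""
    root = ET.fromstring(xml_text)
    count = 0
    # Text that immediately precedes the next child element.
    preceding = root.text or ""
    for child in root:
        if child.tag == "sup" and child.text == "h" and preceding[-1:] in ("p", "t", "k"):
            count += 1
        preceding = child.tail or ""
    return count


print(count_aspirated_plain(plain))
print(count_aspirated_markup(marked_up))
```

Both functions answer the same question, but the first is the kind of search
any string-based tool already supports, while the second requires parsing the
markup and walking the element tree before the question can even be asked.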

For at least the same reasons that there is interest in extending Unicode to
meet the needs of mathematicians, linguists would benefit from extending Unicode
to meet the needs of phonetic/phonemic transcription. After all, transcribed
language is a form of writing, a form of text, and the whole point of Unicode is
to provide a single standard for encoding of text.

Peter


