PCL (was RE: Use of interum PUA encodings for 85 letters)

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Oct 15 2007 - 20:46:23 CDT

    Philippe responded:

    > > I suspect what the OP was talking about is this:
    > >
    > > http://pcl-institute.org/01_Mission_Statement.htm
    > >
    > > You can see a sample in the gif at the bottom of the page.
    >
    > Or even better in the next pages (follow the link at the bottom of the
    > page).

    Or just start from http://pcl-institute.org, but the other pages
    don't show actual samples of PCL or "Cyber Chinese".
     
    > Apparently, this was supposed to become a better alternative to the existing
    > Pinyin standard in China (that needs digits to represent some tones, not
    > enough to disambiguate some paratones, i.e. near homotones), and does not
    > correctly delimit polysyllabic words (but this is also true for the modern
    > Han script, notably with the simplified orthography needing less ideographs
    > reused for their phonetic value in Mandarin), or even to Bopomofo (which is
    > even less precise than Pinyin).

    *groans*

    This paragraph contains so much misinformation about Chinese that I
    hardly know where to start.

    Pinyin does *not* require digits to represent some tones. jiànméi is
    written exactly that way in Pinyin dictionaries. The fact that people
    write it alternatively as jian4mei2 in email is an artifact of
    the fact that the tone marks for 1st tone (macrons) and 3rd tone (haceks)
    aren't available in Latin-1 or Windows 1252, so they often don't work
    all that well in email. But of course if you are using UTF-8, they are
    no problem at all.
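
    To make the relationship between the two notations concrete, here is
    a minimal Python sketch that turns tone-number Pinyin such as
    jian4mei2 into tone-marked Pinyin using Unicode combining marks. The
    function names (mark_syllable, numbered_to_marked) are made up for
    this illustration, and the placement rule is the usual simplified one
    (mark "a" or "e" if present, the "o" of "ou", otherwise the last
    vowel); it assumes one digit after each syllable and is not a full
    Pinyin implementation.

```python
import re
import unicodedata

# Combining marks for Mandarin tones 1-4; any other digit (e.g. 5 or 0,
# often used for the neutral tone) leaves the syllable unmarked.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030C", "4": "\u0300"}
VOWELS = "aeiou\u00fc"  # includes u-umlaut


def mark_syllable(syllable: str, tone: str) -> str:
    """Attach the tone mark for `tone` to one Pinyin syllable (hypothetical helper)."""
    mark = TONE_MARKS.get(tone)
    if mark is None:
        return syllable
    lower = syllable.lower()
    if "a" in lower:
        pos = lower.index("a")
    elif "e" in lower:
        pos = lower.index("e")
    elif "ou" in lower:
        pos = lower.index("o")
    else:
        positions = [i for i, ch in enumerate(lower) if ch in VOWELS]
        if not positions:
            return syllable  # no vowel found; leave the text alone
        pos = positions[-1]
    marked = syllable[: pos + 1] + mark + syllable[pos + 1:]
    # NFC composes base letter + combining mark into one code point where possible.
    return unicodedata.normalize("NFC", marked)


def numbered_to_marked(text: str) -> str:
    """Convert every letters+digit run in `text` to tone-marked Pinyin."""
    return re.sub(
        r"([A-Za-z\u00fc\u00dc]+)([0-5])",
        lambda m: mark_syllable(m.group(1), m.group(2)),
        text,
    )


if __name__ == "__main__":
    print(numbered_to_marked("jian4mei2"))
```

    The output is plain UTF-8 text, which is why the digits are only
    needed where the transport or font cannot carry the marked vowels.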

    Pinyin makes all the required tonal distinctions for Mandarin
    just fine -- including the unmarked, reduced tone.

    "Paratone" doesn't mean what Philippe has said it means (namely
    "near homotones"). To see what it *actually* means, you can
    do a Google book search on Robert Trask's A Dictionary of
    Phonetics and Phonology. It is irrelevant to any discussion of PCL.

    Pinyin does correctly delimit polysyllabic words.

    And Bopomofo is not less precise than Pinyin for the representation
    of Mandarin.

    > Anyway, this alphabet with tones could still be used as a convenient way to
    > easily enter Chinese text in an input method (the syllabic clusters would
    > then be resolved into normal Han ideographs using a dictionary). It seems
    > that it just uses existing basic radicals, for the 25 lead consonants, and
    > the precomposed 15 vowels and 4 tones (60 letters) : as they are ordered
    > logically, the first keystroke would enter the consonant letter, the second
    > one would enter the vowel+tone letter, which would be better (more precise)
    > and simpler to learn than Pinyin methods (number of distinct homotone
    > groups: 1500, not counting the many ideographs in the same homotone
    > group that remain today because of their semantic difference)...

    Here Philippe has missed the main innovation of PCL as an orthography
    for Chinese -- its reuse of letters as "icons" to distinguish the
    Han ideographs (which PCL calls "ideograms") associated with any
    particular spelled-out Mandarin syllable. A more linguistically
    correct way to say this is that PCL tacks on an extra letter
    as a morpheme determinant (associated with a particular Han character),
    but economizes by leaving the morpheme value unmarked for the
    most common character associated with each syllable (based on
    a lexical frequency analysis).

    As an input method, however, I don't see any great advantage over
    Pinyin. PCL's supposed advantage, in writing each morpheme phonetically,
    but then adding an "icon" (a morpheme determinant) to associate each
    syllable correctly to its character (from among the often large
    list of homophonic Han characters), is also its Achilles heel.
    It means you have to learn all the characters anyway, and how they
    are associated with the syllables, in order to spell out the words
    correctly. This is a larger learning task, and it does not take
    advantage of the relative *non*homophony of Chinese polysyllabic
    words to enable accelerated lookup for purely phonetic entry.

    Effectively, PCL is attempting -- with a de novo script whose shapes
    appear to be based in part on letter forms and in part on Chinese
    radical forms -- the equivalent of a spelling reform for English
    along the lines of:

    tu [--> to]
    tu1 [--> two]
    tu2 [--> too]

    rayt [--> right]
    rayt1 [--> write]
    rayt2 [--> rite]

    where you write with a phonemic orthography, and then the
    digits (in this made-up example) are added to the less common morphemes
    to distinguish them. The bracketed mappings are the equivalent
    of the mapping of the syllable + morpheme determinant in PCL
    to a Han character.
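
    The made-up English example can be sketched in a few lines of Python,
    purely as an illustration of the "phonemic spelling + morpheme
    determinant" idea. The table holds only the six mappings listed
    above; the names LEXICON, split_spelling, resolve and homophones are
    hypothetical, and nothing here reflects how PCL itself is specified.

```python
import re

# (phonemic spelling, determinant) -> conventional spelling.
# An empty determinant marks the most common morpheme for that spelling.
LEXICON = {
    ("tu", ""): "to",
    ("tu", "1"): "two",
    ("tu", "2"): "too",
    ("rayt", ""): "right",
    ("rayt", "1"): "write",
    ("rayt", "2"): "rite",
}


def split_spelling(spelling: str) -> tuple[str, str]:
    """Split e.g. 'rayt1' into the phonemic base 'rayt' and determinant '1'."""
    m = re.fullmatch(r"([a-z]+)(\d*)", spelling)
    if m is None:
        raise ValueError(f"not a base+determinant spelling: {spelling!r}")
    return m.group(1), m.group(2)


def resolve(spelling: str) -> str:
    """Map a reformed spelling back to the word it stands for."""
    return LEXICON[split_spelling(spelling)]


def homophones(base: str) -> list[str]:
    """All words sharing one phonemic base, unmarked (most common) first."""
    return [word for (b, _d), word in sorted(LEXICON.items()) if b == base]


if __name__ == "__main__":
    print(resolve("rayt1"))
    print(homophones("tu"))
```

    Note that a writer has to know the whole table to choose the right
    determinant, which is the learning burden described earlier.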

    From what I can tell about the system based on the description and
    the (one) example, PCL looks like a system that would be relatively
    easy for a literate native speaker of Chinese to pick up as a
    second orthography, because it assumes the associability of the
    spellings back to the characters, and because it picks letters
    for their iconicity for Chinese reader/writers. But as a system
    for teaching Chinese as a second language to non-natives, it
    pretty much looks like a non-starter to me.

    > The PCL group proposed in those papers to submit it for standardisation by
    > ISO (proposing also a 7-bit encoding for it, named "CSCII"; already too late
    > in 2006 for such standardization, except by de facto encodings) after
    > gaining some support in China or Taiwan; did this occur?

    No.

    > Is the PCL group
    > still actively working on this Pinzi/Pinci (Spelling Chinese) script?

    Apparently, or this inquiry about use of PUA encodings for the
    system wouldn't have come up in the first place.

    > But is it needed to encode it separately?

    As a con-script being proposed for a teaching system for
    (Mandarin) Chinese, I don't see any need to standardize
    it currently. It would actually have to come into significant
    use first. With the proponents not even publishing the
    relevant details about it on the site advocating its
    use, it seems a long way from being standardizable.

    --Ken


