Re: Pau Cin Hau scripts proposal : confusive N3865 and better older N3781

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Jul 20 2010 - 22:31:13 CDT

Next message: Nishan Naseer: "Re: Indian Rupee Sign (U+20B9) proposal - copyright/licencing issue"

Previous message: Kenneth Whistler: "Re: Pau Cin Hau scripts proposal : confusive N3865 and better older N3781"

> Message of 21/07/10 04:11
> From: "Kenneth Whistler" <kenw@sybase.com>
> To: verdy_p@wanadoo.fr
> Cc: unicode@unicode.org, kenw@sybase.com
> Subject: Re: Pau Cin Hau scripts proposal : confusive N3865 and better older N3781
>
>
> Philippe Verdy said:
>
> > A side note about this preliminary proposal for allocating blocks in
> > the SMP for the two Pau Cin Hau scripts (including one for the large
> > "logographic" script, with 1050 signs):
> >
> > http://std.dkuug.dk/JTC1/SC2/WG2/docs/n3865.pdf
> >
> > (authored by Anshuman Pandey, in MIT)
> >
> > If the non-logographic Pau Cin Hau script (currently counting 57 signs
> > in this preliminary report that does not give its sources and does not
> > give examples)
>
> Those are given in the earlier N3781, which itself is cited in
> this short document. N3865 also indicates that "A formal proposal
> for the Pau Cin Hau Syllabary will be submitted shortly." That is
> the notice that a revision of N3781 will be forthcoming, with
> more details. N3781 was clearly labelled "Preliminary", and the
> author has worked extensively on it since February.
>
> N3865 *only* specifies the sizes of the anticipated repertoires
> to encode, as guidance to the Roadmap Committee for UTC and WG2.

That's exactly what I understood, but this was already anticipated in
the earlier document, which gave the same estimate for the
repertoires of the two scripts. This small PDF was probably
composed too quickly. That does not diminish the merit of the work
already performed in N3781 and N3784 (for the tone/length marks used in
the small script, and possibly in the large script as well).

Nor does it offer clear guidance for the large script. Is it really
logographic? I have serious doubts, even if it may look logographic in
its presentation, only because of its relatively small size. In fact it
may already contain what was later formalized by the smaller script
derived from it, which systematized the notation of syllables with
distinctive letters built from some common traits.

The prior document, however, contained an interesting remark: the
smaller alphabetic script is still in modern use, and it could fit in
the BMP. N3865 still does not answer that question: BMP or SMP for
the smaller script (which easily fits in two columns)?

But the BMP Roadmap is now almost fully allocated, and the religious
community using the small script is very small (and living in a
country where communication is not easy). Given that the ISO 10646
"implementation levels" are about to be abandoned (because everyone
now implements only "level 3", which requires full support of the
supplementary planes), it may not be so important to fit the small
alphabetic script in the BMP (even if it's clear that the large script
will go to the SMP).
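
To make the practical side of the BMP/SMP choice concrete, here is a minimal Python sketch. The helper names `plane` and `utf16_code_units` are hypothetical, and the two sample code points are arbitrary existing ones, not proposed allocations for either script. It only shows that a BMP code point takes one UTF-16 code unit while an SMP code point takes a surrogate pair.

```python
def plane(cp: int) -> int:
    """Return the Unicode plane number of a code point (0 = BMP, 1 = SMP)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    return cp >> 16


def utf16_code_units(cp: int) -> int:
    """Return how many 16-bit code units UTF-16 needs for this code point."""
    # Each UTF-16 code unit is 2 bytes; big-endian avoids a byte-order mark.
    return len(chr(cp).encode("utf-16-be")) // 2


# Arbitrary samples: U+0041 lies in the BMP, U+10000 is the first SMP code point.
for cp in (0x0041, 0x10000):
    print(f"U+{cp:04X}: plane {plane(cp)}, {utf16_code_units(cp)} UTF-16 code unit(s)")
```

Software that already handles surrogate pairs treats both cases uniformly, which is why the placement matters less once supplementary-plane support is universal.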

But I still think that the wording in N3865 creates confusion about
the nature of the two scripts it describes, which should be treated
separately and do not need to be encoded at the same time. Further
research is needed on how the large script really works, because I
have not found any example of it so far.

Some interesting reading, showing actual examples not found in N3781
for the small alphabetic script (and romanizations of composed
syllables):

http://www.scribd.com/doc/3852585/Pau-Cin-Hau-Lai

The small script is probably one of the most regular and systematic
ever found. This means that its implementation should be very easy.

And sorry about my wording when I said that "long s" is "deprecated":
it is of course not deprecated within Unicode (it is used in many
documents that need to show the distinction or to reproduce medieval
texts), but it really is deprecated in the orthographies of modern
languages, which no longer attach any importance to the distinction
between "small s" and "long s". The Latin script as it is used now
(like the Greek script) does not clearly differentiate all coda
consonants from initial consonants, a distinction that many languages
need; the distinction was already confusing and extremely irregular,
even within the work of a single author or typist, when these forms
were frequently used in medieval texts.


This archive was generated by hypermail 2.1.5 : Tue Jul 20 2010 - 22:34:13 CDT