From: Eric Muller (firstname.lastname@example.org)
Date: Sat May 17 2008 - 18:25:33 CDT
Jonathan Pool wrote:
> I was hoping to use documents at the Unicode Web
> site, including translations of “What is Unicode” and of UDHR, as guides for
> some languages, but many of the documents seem to contain U+0027 APOSTROPHE
> where my reading of the standard says other characters are preferred. I’m
> curious about the reason.
One of the goals of the UDHR in Unicode project is indeed to show (via
the translations themselves) and to document (via the notes) what could
be called "best practices for Unicode use".
Most of the texts have been "rescued" from the UN site, and the work so
far has mostly been to put them in a uniform format, and to clean up the
most obvious encoding problems; also to locate and incorporate existing
translations. That can be done without too much knowledge of the
languages. For the next step, which is to clean up the character usage,
fix the typos and complete the partial translations, we really need help
from readers of the languages. We have received significant help for a
few languages, but none for most of them.
In the particular case of U+0027, and for the UDHR, most of the uses
should probably be either U+02BC ʼ MODIFIER LETTER APOSTROPHE or U+2019 ’
RIGHT SINGLE QUOTATION MARK. Lorna Priest kindly sent me a list of
languages which are known to use an apostrophe to write a glottal stop,
but I have not yet had time to fold that into the texts.
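For anyone who wants to help with this cleanup, here is a rough sketch in Python of how the U+0027 uses in a text could be located and then replaced. It is only an illustration, not the tooling used for the project: the function names, the `is_letter` flag and the sample words are all made up for this example. A blanket replacement is also cruder than what a real text needs (U+0027 used as an opening quotation mark, for instance, would need different handling), so each hit should still be reviewed by a reader of the language.

```python
import re

APOSTROPHE = "\u0027"                   # ' APOSTROPHE
MODIFIER_LETTER_APOSTROPHE = "\u02BC"   # used when the mark is a letter (e.g. glottal stop)
RIGHT_SINGLE_QUOTATION_MARK = "\u2019"  # used when the mark is punctuation (e.g. elision)


def find_apostrophes(text, context=10):
    """Return (offset, surrounding text) for every U+0027 in text, for manual review."""
    return [
        (m.start(), text[max(0, m.start() - context): m.end() + context])
        for m in re.finditer(APOSTROPHE, text)
    ]


def replace_apostrophes(text, is_letter):
    """Replace every U+0027 in text.

    is_letter is a hypothetical per-language flag: True if the language
    writes a letter (such as a glottal stop) with the apostrophe, in which
    case U+02BC is used; otherwise U+2019 is used.
    """
    replacement = MODIFIER_LETTER_APOSTROPHE if is_letter else RIGHT_SINGLE_QUOTATION_MARK
    return text.replace(APOSTROPHE, replacement)
```

Calling `find_apostrophes` first gives a list of places to check by eye; `replace_apostrophes` would then be applied only to texts where a single choice is known to be right throughout.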
Help is always welcome; the preferred form is a replacement XML text
(from which the other representations are generated), together with
enough information to justify the changes (I still hope that we will be
able to feed back the texts to the UN, and that will certainly require
some level of traceability). Of course, additional translations are also
welcome.