Re: Searching data: map countries to scripts

From: Philippe Verdy <>
Date: Mon, 20 Aug 2012 23:31:56 +0200

2012/8/20 Asmus Freytag <>:
> On 8/20/2012 12:04 AM, Manuel Strehl wrote:
>> Thanks for the answer.
>> It's clear to me, that I could map "Hana" and "Kata" to "US" just for
>> the sake of having a Japanese Minority in the states.

Actually the Japanese language is normally written in the "Jpan"
script, which is in fact a family of three scripts ("Hira", "Kana"
and "Hani") that are used simultaneously but encoded separately in
Unicode, with just a few exceptions for characters used indistinctly
in the two kana scripts, notably combining characters and repeat

But some users of Japanese (or early learners of Japanese) cannot read
kanji, so there are Japanese texts restricted to the kana
syllabaries, and there's the additional script code "Hrkt" (for "Hira"
and "Kana", i.e. Hiragana and Katakana). Those texts will remain
readable without much loss of meaning if kanji are converted to

Japanese is almost never written in "Hira" alone or "Kana" alone, as
it would lose many grammatical or semantic distinctions (exactly as
if we were using only uppercase or only lowercase letters in the Latin
script). And these two scripts do not contain exactly the same number
of "pseudo-equivalent" letters, so there will be some additional
losses when approximating one script by the other.
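
To make the relation between the script codes concrete, here is a
minimal Python sketch that picks the narrowest code ("Hira", "Kana",
"Hrkt" or "Jpan") covering a piece of Japanese text. The helper names
(char_script, script_code) are hypothetical, and guessing the script
from the character name is only an approximation of the Unicode Script
property: it ignores halfwidth katakana, for instance, and it treats
characters named KATAKANA-HIRAGANA (such as U+30FC) as shared.

```python
import unicodedata

# ISO 15924 composite codes and the component scripts they cover.
COMPOSITES = {
    "Hrkt": frozenset({"Hira", "Kana"}),
    "Jpan": frozenset({"Hani", "Hira", "Kana"}),
}


def char_script(ch):
    """Rough script guess from the character name (an approximation)."""
    name = unicodedata.name(ch, "")
    if name.startswith("KATAKANA-HIRAGANA"):
        return None  # shared between the two kana scripts, e.g. U+30FC
    if name.startswith("HIRAGANA"):
        return "Hira"
    if name.startswith("KATAKANA"):
        return "Kana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "Hani"
    return None  # anything else is ignored by this sketch


def script_code(text):
    """Return the narrowest code covering the Japanese scripts in text."""
    used = frozenset(s for s in map(char_script, text) if s)
    if len(used) == 1:
        return next(iter(used))
    # Try the smaller composite first, so kana-only text yields "Hrkt".
    for code, parts in sorted(COMPOSITES.items(), key=lambda kv: len(kv[1])):
        if used and used <= parts:
            return code
    return None  # no Japanese characters recognised
```

With this, script_code("ひらがなとカタカナ") gives "Hrkt", while any
text that also contains kanji falls back to "Jpan".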

The same may happen for the Georgian script(s), for which there's a
need to distinguish three scripts (in Unicode these Georgian
scripts are now partly separated into two sets; this was not the case in
the first editions, which considered one of them to be "uppercase" and
the other "lowercase"). Modern Georgian is written using only the two
unified scripts (but most of the time, only one of them is used), while
the other script is still usable for traditional Georgian texts using
the "Geok" script code.
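
For the Georgian case, a similar sketch can tell "Geor" from "Geok" by
code point. It assumes the usual reading of ISO 15924, where "Geor"
covers Mkhedruli and "Geok" (Khutsuri) covers Asomtavruli and Nuskhuri.
The ranges below are only the core letter ranges, so a few later
additions are left out, and the function names are hypothetical.

```python
# Core letter ranges: (first, last, style, ISO 15924 code).
# Approximate: a few letters added later fall outside these ranges.
GEORGIAN_RANGES = [
    (0x10A0, 0x10C5, "Asomtavruli", "Geok"),
    (0x10D0, 0x10FA, "Mkhedruli", "Geor"),
    (0x2D00, 0x2D25, "Nuskhuri", "Geok"),
]


def georgian_styles(text):
    """Return the set of Georgian styles whose letters occur in text."""
    found = set()
    for ch in text:
        cp = ord(ch)
        for first, last, style, _code in GEORGIAN_RANGES:
            if first <= cp <= last:
                found.add(style)
    return found


def georgian_script_code(text):
    """Return "Geor" or "Geok", or None if there is no single answer."""
    styles = georgian_styles(text)
    codes = {code for _f, _l, style, code in GEORGIAN_RANGES if style in styles}
    if len(codes) == 1:
        return codes.pop()
    return None  # no Georgian letters, or a mix of both codes
```

Text mixing Mkhedruli with one of the older styles returns None here;
whether such text should be tagged "Geor" or "Geok" is a policy choice
this sketch does not try to make.
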
Received on Mon Aug 20 2012 - 16:34:22 CDT
