RE: Encoding of personal names in official databases

From: Jonathan Rosenne (rosenne@qsm.co.il)
Date: Tue Mar 30 1999 - 08:04:32 EST


Don't forget that the names must not only be writable, they must also be
readable by the officials and others who use this data, and printable on the
equipment they have. So I suggest you restrict yourself to the Latin script
as used in Europe.

Jony

> -----Original Message-----
> From: Trond Trosterud [mailto:Trond.Trosterud@hum.uit.no]
> Sent: Tuesday, March 30, 1999 1:16 PM
> To: Unicode List
> Subject: Encoding of personal names in official databases
>
>
> Within the next month, I am going to write a memo to the Norwegian dept. of
> justice to comment upon the planned revision of the Norwegian laws for
> personal names. The goal of the revision is to allow other naming practices
> than the Norwegian one, due to a culturally more heterogeneous population.
>
> My input will deal with the encoding of the names.
>
> Today, the official Norwegian population registry is coded in ASCII,
> enriched with the Norwegian letters ÆØÅæøå in the ASCII positions [\]{|}
> (I guess the same solution is in use in Denmark, Sweden and Finland as
> well, but with ÄÖ for ÆØ).
>
> My suggestion will be that they abandon their 7-bit systems and move to...
>
> and here I need your advice.
>
> In Norway, Sámi citizens use Sámi names, but the diacritics (ACUTE ACCENT,
> CARON, HOOK, STROKE) are just stripped off in the registry. We have large
> numbers of Finns and Swedes; their ä and ö are replaced with æ and ø.
> Immigrants from other countries bring their letters (and alphabets) with
> them. A natural answer to this is of course: Use the UCS. But the databases
> are huge: every single citizen is included.
>
> Does anyone on this list have experience with similar cases? What is being
> done around the world? Do other countries use 7-bit solutions as well? Are
> there plans to migrate to 8 bits? 16 bits?
>
> Since we need both the Sámi names and the names of new immigrants, 8 bits
> really are not enough. If we then use some UCS format, which one shall we
> use (16-bit, UTF-8, ...) in order to save space and have databases with
> fast retrieval?
>
> Greetings,
>
> -------------------------------------------------------------------
> Trond Trosterud t +47 7764 4763
> Lingvistisk institutt, Det humanistiske fakultet h +47 7767 3639
> N-9037 Universitetet i Tromsø, Noreg f +47 7764 4239
> Trond.Trosterud@hum.uit.no http://www2.isl.uit.no/trond/index.html
> Test string-please ignore:ᄘ--⡥-™--
> -------------------------------------------------------------------
>
>
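
To make the space question above a little more concrete, here is a minimal
Python sketch. It assumes the registry uses the Norwegian 7-bit national
variant described in the quoted message, with Æ Ø Å æ ø å stored in the ASCII
positions [ \ ] { | }. The function names and the sample names are
hypothetical, chosen only for illustration. It maps a legacy field to Unicode
and then compares the byte length of a name in UTF-8 and in 16-bit form.

```python
# Assumption: the six ASCII positions [ \ ] { | } hold the Norwegian letters
# Æ Ø Å æ ø å, as described in the quoted message.
LEGACY_TO_UNICODE = str.maketrans("[\\]{|}", "ÆØÅæøå")


def decode_legacy(record: bytes) -> str:
    """Decode a 7-bit registry field and map the six reused positions."""
    return record.decode("ascii").translate(LEGACY_TO_UNICODE)


def encoded_sizes(name: str) -> dict:
    """Return the byte length of a name in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(name.encode("utf-8")),
        "utf-16": len(name.encode("utf-16-le")),
    }


if __name__ == "__main__":
    # Hypothetical sample names, not taken from any real registry.
    samples = [decode_legacy(b"Bj|rn L|v}s"), "Čáhppe", "Мария"]
    for name in samples:
        print(name, encoded_sizes(name))
```

For Latin-script names with only occasional diacritics, UTF-8 tends to be the
smaller of the two; for Cyrillic names the two come out about equal, and for
scripts whose characters need three bytes in UTF-8 the 16-bit form is smaller.
Which format gives faster retrieval depends on the database and is not
something this sketch can show.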


