RE: The mother of all collation schemes

From: Michael Kaplan (Trigeminal Inc.) (v-michka@microsoft.com)
Date: Thu Jun 15 2000 - 16:30:31 EDT


Well, along with posts made to the list earlier, there is the problem of
languages that may have native speakers who are unhappy with your collation
scheme. Period. In my experience the fastest way to piss off a user is to
refuse them the right to see things sorted as they would prefer.

But the general rule is that many of the preferred collating orders of
various languages BLATANTLY contradict each other and thus cannot be
reconciled.
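
To make the contradiction concrete, here is a minimal Python sketch. It assumes two simplified, hand-written orderings rather than any real library's collation tables: a German dictionary-style order that files "ä" with "a" and "ö" with "o", and a Swedish order that places "å", "ä", "ö" after "z". The names german_key and swedish_key are made up for this illustration.

```python
# Simplified German dictionary-style folding (assumption: only these letters matter).
GERMAN_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})

# Simplified Swedish alphabet: å, ä, ö are separate letters after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def german_key(word):
    """Sort key that treats umlauted vowels like their base letters."""
    return word.lower().translate(GERMAN_FOLD)


def swedish_key(word):
    """Sort key that ranks each letter by its position in the Swedish alphabet.

    Raises ValueError for characters outside SWEDISH_ALPHABET (sketch only).
    """
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]


words = ["zebra", "äpple", "orm", "öra"]
german_order = sorted(words, key=german_key)
swedish_order = sorted(words, key=swedish_key)

print(german_order)   # ['äpple', 'öra', 'orm', 'zebra']
print(swedish_order)  # ['orm', 'zebra', 'äpple', 'öra']
```

The same four words come out in two different orders, and no single fixed order can satisfy both sets of readers at once.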

I guess a lot of it depends on where the list is going to end up.

Michael

> ----------
> From: rampshot@usa.net[SMTP:rampshot@usa.net]
> Sent: Thursday, June 15, 2000 1:11 PM
> To: Unicode List
> Subject: The mother of all collation schemes
>
> I am trying to think of a collation scheme for the purpose of ordering a set
> of CDs. Let's say you have CD titles you want to order. They are in different
> languages, with a few accented letters, and even some non-Roman letters.
>
> 1) Romanise all non-Roman names. For Japanese, I'd use "fu" and "chi" and
> "shi" and "tsu" and DEFINITELY indicate long vowels (so Tokyo would come out
> as "Toukyou").
> 2) My alphabetical order (digits are treated as letters):
> [sp] [other punc.] 0 1 2 3 4 5 6 7 8 9 A B C D E F G H J K
> L M N O P Q R S T U V W X Y (why couldn't I find this in uppercase?) Z
> The reason digits are treated as letters is so "97" will come before "98".
> I'm not sure how to treat names like "Ranma 1/2". Any ideas? Also, this system
> is very sensitive to things such as misspelling "DJ" as "D.J."
> Does anyone have any ideas for ordering punctuation?
> Of course, if it was just anime CDs the order would be 0 1 2 3 4 5 6 7 8 9 a i
> u e o ka ki ku ke ko sa shi su, etc.
>

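The ordering proposed in the quoted message can be sketched roughly as follows. This is only an illustration under several assumptions that the message does not settle: titles are already romanised, accents are simply stripped, case is ignored, the letters are plain A to Z, and "other punctuation" is ordered by code point (the message leaves that question open). The names strip_accents, char_rank and title_key are hypothetical.

```python
import unicodedata

DIGITS = "0123456789"
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # assumption: plain A-Z


def strip_accents(text):
    """Drop combining marks after canonical decomposition (assumed policy)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def char_rank(ch):
    """Rank one character: space, then other characters, then digits, then letters."""
    if ch == " ":
        return (0, 0)
    if ch in DIGITS:
        return (2, DIGITS.index(ch))
    if ch in LETTERS:
        return (3, LETTERS.index(ch))
    # Everything else counts as "other punctuation"; code-point order
    # among these characters is an assumption, not part of the proposal.
    return (1, ord(ch))


def title_key(title):
    """Character-by-character sort key for a romanised CD title."""
    folded = strip_accents(title).upper()
    return [char_rank(ch) for ch in folded]


titles = ["Ranma 1/2", "98 Degrees", "D.J. Mix", "DJ Mix", "97 Hits", "Étoile"]
ordered = sorted(titles, key=title_key)
print(ordered)
# ['97 Hits', '98 Degrees', 'D.J. Mix', 'DJ Mix', 'Étoile', 'Ranma 1/2']
```

Because digits are compared one character at a time, "97" sorts before "98" as intended, but "100" would also sort before "97". The sketch also shows the sensitivity mentioned in the message: "D.J. Mix" and "DJ Mix" get different keys, since the period is ranked rather than ignored.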

