Once again, if collation info is what you want, see
Beyond that, it is unclear what you are looking for, really. But if you
actually read and try to understand that document, I am fairly certain
that one of two things will happen:
1) You will find the answer to your question, or
2) You will be able to frame the question more clearly
I am betting on #1, actually, as the most likely outcome. :-)
Trigeminal Software, Inc.
----- Original Message -----
To: "Unicode List" <firstname.lastname@example.org>
Sent: Thursday, August 10, 2000 12:56 PM
Subject: Mixing alphabets (was: sorting my CD collection)
> You have a good point: .... does nu-alpha-tau-alpha-sigma-alpha
> spell "Natasa" or "Natasha"? The Greek letters given
> are obviously an attempt to write "Natasha" in Greek,
> but they romanize to "Natasa".
> And a, b, c, d, e, f, g, h, ... HATES a, i, u, e,
> o, ka, ki, ku, ...
> Maybe I should just capitalize everything (except
> Georgian? ... not that I have any Georgian CDs, or
> am likely to... I bet few things would be rarer than,
> say, a Georgian female rap CD in the US!!) and from
> there, just sort by codepoint number... no good,
> "Á" would come after "Z"...
> Would somebody PLEASE tell me, IN THE DEFAULT UNICODE
> COLLATION ALGORITHM, WHAT COMES AFTER WHAT?! I could
> use a list of Unicode characters in proper collation
> order, with "ties" labeled!!
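
To make the code point problem concrete, here is a minimal Python sketch.
The sample titles and the helper crude_key are made up for illustration,
and stripping accents is only a rough stand-in: it is NOT the Unicode
Collation Algorithm, just a way to see why raw code point order fails.

```python
import unicodedata

# Made-up sample titles, for illustration only.
titles = ["Zebra", "Ámbar", "apple", "Banana"]

# Uppercase, then compare by code point: U+00C1 (Á) is greater than
# U+005A (Z), so the accented title lands after "Zebra".
by_codepoint = sorted(titles, key=str.upper)

def crude_key(s):
    """Hypothetical helper: uppercase base letters with accents stripped,
    plus the original string as a tie-breaker. Not the UCA."""
    decomposed = unicodedata.normalize("NFD", s.upper())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base, s)

by_crude_key = sorted(titles, key=crude_key)

print(by_codepoint)
print(by_crude_key)
```

The tie-breaker only keeps the order deterministic when two titles share
the same base letters; a real collation would use proper secondary and
tertiary weights instead.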
> Robert Lozyniak
> Accusplit pedometer manufacturers can go suck eggs
> My page: http://walk.to/11
> email@example.com - email
> (917) 421-3909 x1133 - voicemail/fax
> ---- Antoine Leca <Antoine.Leca@renault.fr> wrote:
> > Robert Lozyniak wrote:
> > >
> > > How do you sort text with some in Roman and some
> > > in non-Roman alphabets?
> > I never sort texts, only lists of items (words,
> > names, titles, whatever).
> > Depending on the ratios, I see two main solutions:
> > - if Latin is the most common, _and_ only other Greek-derived
> > scripts are used, _and_ the intended audience is proficient
> > enough, I may intersperse the non-Roman letters as if all the
> > Greek-derived alphabets shared a common order (so Greek alpha
> > sorts just after Latin a, Cyrillic ve after Cyrillic be, which
> > follows Greek beta, which follows Latin b, Greek xi after the
> > o's and before the p's, etc.)
> > - in other cases, I sort the scripts separately.
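
The second option (each script sorted separately) could be sketched in
Python roughly as follows. Everything here is an assumption made for
illustration: the script order, the helper names, the sample items, and
above all the trick of guessing the script from the first word of the
Unicode character name, which is a heuristic and not a real script
property lookup. It also assumes no item is an empty string.

```python
import unicodedata

# Assumed script order, for illustration; digits go first.
SCRIPT_ORDER = ["DIGIT", "LATIN", "GREEK", "CYRILLIC"]

def script_of(ch):
    """Heuristic: take the first word of the Unicode character name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def separate_scripts_key(item):
    """Sort key: script rank of the first character, then the folded text."""
    script = script_of(item[0])
    if script in SCRIPT_ORDER:
        rank = SCRIPT_ORDER.index(script)
    else:
        rank = len(SCRIPT_ORDER)  # unknown scripts go last
    return (rank, item.casefold())

# Made-up sample items in Cyrillic, Greek, and Latin, plus one with digits.
items = ["наташа", "νατασα", "Natasha", "99 Luftballons", "Abba"]
sorted_items = sorted(items, key=separate_scripts_key)

print(sorted_items)
```

Within each script this still falls back to code point order, so it only
addresses the grouping, not the ordering inside a group.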
> > > Currently, I'm just romanizing
> > > everything but I don't know if that is that good.
> > Hmmm. I wouldn't do that. It would take me much too long to
> > find something that begins with beta in the V section, or
> > something that begins with mu+pi in the B section...
> > For Cyrillic, I expect U+0427 to romanize as tcha, and U+0429
> > as chtcha, and I am not sure you will (or vice-versa).
> > Things are different if you actually transliterate,
> > i.e. if the items are presented in Latin script.
> > > It is probably bad to kanize digits, because they would
> > > sort 1, 9, 5, and so on, or some other mixed-up order.
> > It is always a problem to sort the digits, anyway. Since there
> > are usually only a few of them, I believe the best place is at
> > the very front, so the search does not take too long. But if
> > there are more than a handful, that is nearly always a headache.
> > Antoine