Re: Sorting tags

From: Kaihsu Tai (kaihsu@ugcs.caltech.edu)
Date: Fri Jun 20 1997 - 18:11:08 EDT

Next message: Markus G. Kuhn: "Re: Sorting tags"
Previous message: Michael Everson: "Re: CJK tags - Fish or cut bait"
Maybe in reply to: Markus G. Kuhn: "Re: Sorting tags"
Next in thread: Markus G. Kuhn: "Re: Sorting tags"
Reply: Markus G. Kuhn: "Re: Sorting tags"

> One thing that I think would be a nice addition to the work on sorting
> Unicode is to specify somewhere in the Unicode code space a number of
> sorting control characters, that direct some preprocessing on strings
> before they actually enter the sorting algorithm.

Argh, is this supposed to be in a "character set encoding"?

> Simple example, in a list of names, the most significant letter is the
> first letter of the surname and not always the first letter. It is in
> many libraries common practice for the librarian who processes a newly
> purchased book to mark with a pencil the first letter of the surname
> such that this book is later always sorted in the same way.
>
> Let (1) be this marker, then we could store a list of names in a database
> like
>
> John (1)Smith
> Mr. Joseph E. (1)Miller-Rubin, M.D.
> etc.

As Alain said once, different parts of names should be stored in different
_fields_, instead of putting these burdens onto the character set
encoding. They should be stored as something like

{
surname=Smith
givenname=John
}

{
surname=Miller-Rubin
middleinitial=E.
givenname=Joseph
academicdegree=M.D.
}

and the database program can then sort, display and manipulate the data
appropriately as the user requests.
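
To make the point concrete, here is a rough Python sketch of sorting on the surname field. The Name class and sort_key function are made up for illustration, and casefold() is only a stand-in for a real locale-aware collation, which a database would supply itself.

```python
from dataclasses import dataclass


@dataclass
class Name:
    # Hypothetical record type; field names follow the example entries above.
    surname: str
    givenname: str
    middleinitial: str = ""
    academicdegree: str = ""


def sort_key(name: Name) -> tuple[str, str]:
    # Surname is the most significant key, then the given name.
    # casefold() is a simplification, not a proper collation algorithm.
    return (name.surname.casefold(), name.givenname.casefold())


names = [
    Name(surname="Smith", givenname="John"),
    Name(surname="Miller-Rubin", givenname="Joseph",
         middleinitial="E.", academicdegree="M.D."),
]

ordered = sorted(names, key=sort_key)
for n in ordered:
    print(f"{n.surname}, {n.givenname}")
```

No marker character is needed in the stored text: the sort order comes from which field is used as the key.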

> - suppression of substrings for sorting
> - replacing substrings for sorting (like in "Markus <G.|Guenther> Kuhn")

This should be done by the database. An entry like

{
givenname=Markus
middlename=Guenther
surname=Kuhn
}

can then be displayed on request as "Markus G Kuhn" (en-GB),
"Markus G. Kuhn" (en-US), "MGK", "Kuhn, M.G.", "Kuhn, Markus Guenther",
"Markus Guenther Kuhn", "Markus Kuhn", "Kuhn, M.", etc.
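
A rough Python sketch of deriving a few of these display forms from the one stored entry follows. The render function and the style names other than en-GB and en-US are invented here for illustration; a real system would take its formatting rules from locale data rather than hard-coding them.

```python
def render(record: dict[str, str], style: str) -> str:
    # Build one display form from the stored fields.
    # The style names are hypothetical labels, not a standard.
    g = record["givenname"]
    m = record["middlename"]
    s = record["surname"]
    styles = {
        "en-GB": f"{g} {m[0]} {s}",        # Markus G Kuhn
        "en-US": f"{g} {m[0]}. {s}",       # Markus G. Kuhn
        "initials": g[0] + m[0] + s[0],    # MGK
        "index": f"{s}, {g[0]}.{m[0]}.",   # Kuhn, M.G.
        "index-full": f"{s}, {g} {m}",     # Kuhn, Markus Guenther
        "full": f"{g} {m} {s}",            # Markus Guenther Kuhn
        "short": f"{g} {s}",               # Markus Kuhn
    }
    return styles[style]


entry = {"givenname": "Markus", "middlename": "Guenther", "surname": "Kuhn"}

for style in ("en-GB", "en-US", "initials", "index"):
    print(render(entry, style))
```

The stored data never changes; only the presentation rule applied to it does.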

-- 
Khaisu Te (Kaihsu Tai)
http://www.ugcs.caltech.edu/~kaihsu/

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:35 EDT