Re: Unihan.txt and other possible representations of the data

From: Ernest Cline (ernestcline@mindspring.com)
Date: Wed Apr 21 2004 - 19:52:41 EDT

Next message: Theo Veenker: "Re: Suggestion: use of symbolic links in the FTP site"

Previous message: jcowan@reutershealth.com: "Re: Your product"
Maybe in reply to: Gary P. Grosso: "Unihan.txt and other possible representations of the data"
Next in thread: Markus Scherer: "Re: Unihan.txt and other possible representations of the data"
Reply: Markus Scherer: "Re: Unihan.txt and other possible representations of the data"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

> [Original Message]
> From: Tom Emerson <tree@basistech.com>
> To: Gary P. Grosso <ggrosso@arbortext.com>
> Cc: <unicode@unicode.org>
> Date: 4/21/2004 12:58:38 PM
> Subject: Re: Unihan.txt and other possible representations of the data
>
> Gary P. Grosso writes:
> > There may be value in an HTML representation, utilizing links
> > and multiple files. What would the logical division(s) be?
> > Or has this already been done?
>
> I'm working on a proposal for generating different representations of
> Unihan, and this includes logical divisions. I'll post a draft when I
> have something ready.

The obvious division is to put the dictionary stuff in one document
(or group of documents) and to put the encoding equivalencies in
another document, and the numeric information in a third.

However, if backward compatibility could be sacrificed, there would
be an easy way to shave 2 MB off the size of Unihan.txt: get rid of
the initial "U+". It may be only 10%, but it's an irritating 10% because
it's totally worthless. Removing it wouldn't do much to shrink
Unihan.zip, though: because the prefix is so redundant, any good
compression scheme already takes advantage of it.
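
A rough way to check both halves of that claim is sketched below. It
assumes the tab-separated layout of Unihan.txt (code point, field tag,
value, with "#" comment lines). The sample lines, the field name
kSampleField, and the helper names are invented for illustration and
are not real Unihan data; zlib stands in for whatever compressor
produces Unihan.zip.

```python
import zlib


def strip_prefix(text: str) -> str:
    """Drop a leading "U+" from each data line; other lines pass through."""
    out = []
    for line in text.splitlines(keepends=True):
        out.append(line[2:] if line.startswith("U+") else line)
    return "".join(out)


def make_sample(n: int = 2000) -> str:
    """Build hypothetical data in the assumed layout (not real Unihan content)."""
    lines = ["# sample header comment\n"]
    for i in range(n):
        code_point = 0x3400 + i
        lines.append(f"U+{code_point:04X}\tkSampleField\tvalue{i % 97}\n")
    return "".join(lines)


def sizes(text: str) -> tuple[int, int]:
    """Return (raw byte count, zlib-compressed byte count) for the text."""
    data = text.encode("utf-8")
    return len(data), len(zlib.compress(data, 9))


if __name__ == "__main__":
    sample = make_sample()
    raw_before, zip_before = sizes(sample)
    raw_after, zip_after = sizes(strip_prefix(sample))
    print("raw bytes saved:       ", raw_before - raw_after)
    print("compressed bytes saved:", zip_before - zip_after)
```

The raw saving is exactly two bytes per data line. The compressed
saving depends on the compressor and the data, so the script prints it
rather than assuming a figure; to test the real file, read Unihan.txt
in place of make_sample().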

This archive was generated by hypermail 2.1.5 : Wed Apr 21 2004 - 20:41:23 EDT