Re: utf-8 and databases

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Mon Jul 08 2002 - 02:58:07 EDT


At 02:11 PM 7/7/02 +0700, Paul Hastings wrote:
>is there a standard test that can determine whether a given
>database can handle utf-8 (ie as "native" utf-8 not converting
>to ucs-2 or whatever)?

Why is that of any interest?

The primary concern is whether a database can represent the entire
repertoire of Unicode. Just create a string that contains the largest
character, U+10FFFD, convert it to whatever encoding form the APIs
require, and see whether you get it back unmolested.
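
A minimal sketch of that round trip, in Python, with the standard sqlite3
module standing in for whatever database is actually under test (the table
and the probe string are placeholders of mine, purely for illustration):

    import sqlite3

    # Probe string: ASCII plus a 2-byte, a 3-byte and the largest character
    # U+10FFFD (a 4-byte sequence in UTF-8).
    probe = "abc \u00e9 \u4e2d \U0010FFFD"

    conn = sqlite3.connect(":memory:")   # stand-in for the database under test
    conn.execute("CREATE TABLE t (s TEXT)")
    conn.execute("INSERT INTO t (s) VALUES (?)", (probe,))
    (result,) = conn.execute("SELECT s FROM t").fetchone()

    # The string must come back unmolested: no replacement characters,
    # no truncation, no re-encoding artifacts.
    assert result == probe, "database mangled the probe string"
    print("repertoire round trip OK")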

A more sophisticated test would take a longer string and attempt to sniff
out incorrect truncation of characters.
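
One way to phrase that check, again only a sketch (the helper, the
byte-boundary reasoning and the sample text are assumptions of mine): push
a long run of multibyte characters through the same insert/select path as
above and verify that whatever comes back is, at worst, a prefix cut on a
character boundary.

    def truncation_ok(original: str, returned: str) -> bool:
        """True if `returned` is the full original or a prefix of it cut
        on a character boundary; False if a character was split or the
        data was otherwise corrupted."""
        if returned == original:
            return True
        if not original.startswith(returned):
            return False                   # corrupted, not merely truncated
        try:
            returned.encode("utf-8")       # a byte-level cut that surfaced as
        except UnicodeEncodeError:         # a lone surrogate will fail here
            return False
        return True

    # Example input: a long run of 3-byte and 4-byte characters to feed
    # through the insert/select path from the previous sketch.
    long_text = "\u4e2d\U00010000" * 4096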

A secondary concern is performance. If the choice of encoding form is a
poor match for the actual data encountered, and if entering and retrieving
the data requires too many transcoding steps, it's conceivable that this
could be detected in the overall performance of the database.

However, there's no reason to assume that a theoretical match in encoding
efficiency translates automatically into a more efficient database
implementation.
Therefore, regular benchmarking tools should be fine for determining
database performance, as long as the test data is representative of the
installation.
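
For what it's worth, a crude sketch of what "benchmarking with
representative data" can look like (the sample rows, table and timing
method are all placeholders; a real test would use the site's actual
workload and tooling):

    import sqlite3
    import time

    # Representative-ish sample data: mostly ASCII with some multibyte text.
    rows = [("mostly ASCII with a little \u00e9, \u4e2d and \U0010FFFD",)] * 10000

    conn = sqlite3.connect(":memory:")   # stand-in for the database under test
    conn.execute("CREATE TABLE t (s TEXT)")

    start = time.perf_counter()
    conn.executemany("INSERT INTO t (s) VALUES (?)", rows)
    fetched = conn.execute("SELECT s FROM t").fetchall()
    print(f"{len(fetched)} rows inserted and read back "
          f"in {time.perf_counter() - start:.3f}s")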

A./


