Re: Encodings for SQL Databases

From: addison@inter-locale.com
Date: Mon Aug 07 2000 - 17:59:16 EDT

Next message: Michael (michka) Kaplan: "Re: Encodings for SQL Databases"
Previous message: Michael (michka) Kaplan: "Re: Encodings for SQL Databases"
Maybe in reply to: Peck, Jon: "Encodings for SQL Databases"
Next in thread: Michael (michka) Kaplan: "Re: Encodings for SQL Databases"
Reply: Michael (michka) Kaplan: "Re: Encodings for SQL Databases"

> >
> > b) Twelve characters, at least six of which would have unknown sort
> > characteristics (since the first two bytes of a surrogate would not have a
> > defined sort order, and the second two bytes might randomly coincide
> > with an existing BMP value when treated as a separate Unicode code point).
> >
Actually, surrogates always come in pairs: one high surrogate (U+D800 to
U+DBFF) followed by one low surrogate (U+DC00 to U+DFFF). Both ranges are
reserved for this purpose, so the second value can never, ever, coincide
with a valid BMP character (in the same way that the bytes of a UTF-8
multibyte sequence never collide with valid ASCII values).
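
To make the pairing concrete, here is a minimal Python sketch of how a
supplementary code point splits into a surrogate pair. The helper name
to_surrogate_pair and the sample code point U+10400 are my own choices for
illustration; nothing here is specific to any database.

```python
# Illustrative sketch only: to_surrogate_pair is a hypothetical helper,
# and U+10400 is an arbitrary supplementary code point chosen as a sample.

HIGH_FIRST, HIGH_LAST = 0xD800, 0xDBFF  # high (lead) surrogate range
LOW_FIRST, LOW_LAST = 0xDC00, 0xDFFF    # low (trail) surrogate range


def to_surrogate_pair(cp):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000               # 20-bit value
    high = HIGH_FIRST + (offset >> 10)  # top 10 bits
    low = LOW_FIRST + (offset & 0x3FF)  # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x10400)

# Both halves land in the reserved ranges, so neither can be mistaken
# for an ordinary BMP character.
in_reserved_ranges = (HIGH_FIRST <= high <= HIGH_LAST
                      and LOW_FIRST <= low <= LOW_LAST)

# The UTF-8 analogy: every byte of a multibyte sequence is >= 0x80,
# so none of them collides with an ASCII value.
utf8_bytes = "\U00010400".encode("utf-8")
no_ascii_collision = all(b >= 0x80 for b in utf8_bytes)
```

Because the low surrogate always falls in U+DC00 to U+DFFF, a process that
looks at it in isolation sees a reserved value, not some other character.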

So (b) should read:

Twelve characters, all of which have unknown sort characteristics and each
of which is treated as a separate Unicode code point.

This is, I believe, what SQL Server 7.0 actually does: it is
surrogate-unaware.
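
As a rough illustration of what "surrogate-unaware" means for counting,
the Python sketch below assumes the string in question holds six
supplementary characters (the thread implies this, but I am assuming it
here) and uses U+10400 as a stand-in. It only shows how UTF-16 code units
add up; it does not reproduce any SQL Server behavior.

```python
import struct

# Assumption for illustration: six supplementary characters,
# with U+10400 standing in for whatever the real data would be.
text = "\U00010400" * 6

# Encode as big-endian UTF-16 and view the result as 16-bit code units,
# which is what a surrogate-unaware process effectively works with.
raw = text.encode("utf-16-be")
code_units = struct.unpack(">%dH" % (len(raw) // 2), raw)

aware_count = len(text)          # counts code points
unaware_count = len(code_units)  # counts 16-bit units

# Every unit is a surrogate, so none has a meaningful sort weight
# when treated as a character in its own right.
all_surrogates = all(0xD800 <= u <= 0xDFFF for u in code_units)
```

Under those assumptions the surrogate-aware count is six and the
surrogate-unaware count is twelve, every one of them a lone surrogate
value.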

Thanks,

Addison


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:06 EDT