Unicode Conformance #2

From: David Craig (doc@ElSegundoCA.NCR.COM)
Date: Mon Oct 24 1994 - 17:54:36 EDT


This is our second posting concerning Unicode Conformance:

In our environment we have a Unicode processing engine which is a relational
DBMS. It is primarily concerned with the correct comparison and collation of
text elements, but provides no glyph processing capability. Unicode is used
as a canonical representation in the backing store and also as an external
character set to the DBMS. Other external encodings (e.g. Japanese EUC, SJIS,
IBM Host DBCS) are also supported and canonicalized to Unicode.

In the typical Unicode conformance paradigm, our DBMS receives Unicode from
a client, processes the data, and returns Unicode text to the same client.
The conformance requirements are well defined in this case.

Now, if one of the clients is not Unicode based, issues arise relating
to conformance. For example:

1) A Unicode client stores JIS X 0212 ideographs in the DBMS. A SJIS client
   attempts to retrieve them (SJIS does not contain JIS X 0212) and is returned
   a replacement character. A subsequent round-trip conversion (SJIS client to
   DBMS to Unicode client) then returns a replacement character (U+FFFD) to
   the Unicode client.

2) An IBM Host DBCS application attempts to store a character with no
   corresponding Unicode mapping.

What action would a Unicode-conformant processing engine take in the above
scenarios? Should the translations be rejected? Should replacement-character
translation occur?

+-------------+------------------------------------+-------------------------+
| AT&T | David O. Craig | Phone: (310) 524-7769 |
| Global | Internationalization Group | Fax: (310) 524-5517 |
| Information | Teradata Decision Enabling Systems | Office: 17-144 |
| Solutions | 100 N. Sepulveda Blvd. | doc@elsegundoca.ncr.com |
| | El Segundo, Ca. 90245 | |
+-------------+------------------------------------+-------------------------+


