Handling obsolete codes

From: RICK KUNST (rkunst@mercury.acpub.duke.edu)
Date: Sun Apr 21 1996 - 17:07:27 EDT


What is the proper way for an application to handle obsolete codes
(say, Unicode 1.0 codes) when it opens a document produced by another
application?

We recently had a user of our UniEdit editor report problems
displaying Greek breathing marks in a text that had been created
with another application that supports Unicode. When I examined
the text, I discovered that it contained the invalid codes U+0371,
U+0372 and so on, which were unified in Unicode 1.1 with U+0313,
U+0314 etc. After we ran the document through a Unicode
1.0-to-Unicode 1.1 conversion filter, they were rendered properly.
But I am uncertain which is best in general: (1) to retain the
original codes, as we did, and display them with an unknown character
symbol (in our case a bullet); (2) to test automatically for such
invalid codes, display a warning (which we do not do), and convert
them to valid codes if possible upon opening the document; or (3) to
retain the original codes, but render them to the extent possible as
if they were valid codes.
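
To make the three alternatives concrete, here is a minimal sketch in
Python. It is an illustration under stated assumptions, not a
description of how UniEdit or any other application works. The table
lists only the two pairs named above, and pairing them in the order
given (U+0371 with U+0313, U+0372 with U+0314) is an assumption that
should be checked against the standard. The names OBSOLETE_TO_CURRENT,
find_obsolete, convert and display_form are hypothetical.

```python
# Hypothetical table: obsolete Unicode 1.0 code point -> Unicode 1.1 code point.
# Only the two pairs named in the post are listed; the pairing is an assumption.
OBSOLETE_TO_CURRENT = {
    0x0371: 0x0313,
    0x0372: 0x0314,
}

PLACEHOLDER = "\u2022"  # bullet used as the unknown-character symbol


def find_obsolete(text, table=OBSOLETE_TO_CURRENT):
    """Return (index, code point) for every obsolete code found in text."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if ord(ch) in table]


def convert(text, table=OBSOLETE_TO_CURRENT):
    """Alternative (2): return a copy of the text rewritten with current codes."""
    return text.translate(table)


def display_form(text, policy, table=OBSOLETE_TO_CURRENT):
    """Return the string handed to the renderer; the stored text is not changed.

    policy "placeholder" is alternative (1); "as_valid" is alternative (3).
    """
    if policy == "placeholder":
        return "".join(PLACEHOLDER if ord(ch) in table else ch for ch in text)
    if policy == "as_valid":
        return text.translate(table)
    raise ValueError("unknown policy: " + policy)
```

The practical difference between (2) and (3) in this sketch is only
where the mapping is applied: convert() changes what is saved back to
the file, while display_form() changes only what is drawn on screen.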

This question becomes more important for us because at the moment we
have opted for a different solution for the transition to Unicode 2.0
for Korean Hangul. We use alternative (3) above for rendering Korean
Hangul regardless of whether it is encoded at the Unicode 1.1 or the
Unicode 2.0 code points. That is, at the moment we display Unicode 1.1
Hangul characters and Unicode 2.0 Hangul characters identically, and
the user has no way of knowing which he is seeing (unless he uses
Edit, Find). But for active typing in Korean we still use only
Unicode 1.1 throughout our applications--otherwise it would be even
more confusing.

At some point we will switch over to actively typing Hangul using
Unicode 2.0 codepoints, and at that point there is much to be said
for handling Unicode 1.1-coded Hangul with alternative (2),
attempting to "upgrade" the user's files semi-automatically.
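
A semi-automatic upgrade of this kind might be sketched as follows in
Python. This is an assumption about one possible shape, not a settled
design: the function name upgrade_semi_automatically, the table
argument and the confirm callback are all hypothetical, and the actual
Unicode 1.1-to-2.0 Hangul mapping is not reproduced here and would have
to be supplied by the caller.

```python
def upgrade_semi_automatically(text, table, confirm):
    """Convert obsolete codes only if the user agrees.

    table   -- dict mapping obsolete code point -> current code point
               (supplied by the caller; no real mapping is assumed here)
    confirm -- callable that receives {code point: occurrence count}
               and returns True to go ahead with the conversion
    Returns (text, converted), where converted is True if text was changed.
    """
    # Count each obsolete code point so the user can see what would change.
    counts = {}
    for ch in text:
        cp = ord(ch)
        if cp in table:
            counts[cp] = counts.get(cp, 0) + 1

    if not counts:
        return text, False          # nothing obsolete, nothing to ask
    if confirm(counts):
        return text.translate(table), True
    return text, False              # user declined; original codes retained
```

In an editor, confirm would typically put up a warning dialog showing
the counts; if the user declines, the document keeps its original codes
and can still be displayed under alternative (1) or (3).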

I welcome advice from others regarding this question.

Rick Kunst
 
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Humanities Computing Facility Tel. (919) 660-3194
015 Language Center - Box 90269 Fax: (919) 660-3191
Duke University E-mail rkunst@acpub.duke.edu
Durham, NC 27708 USA http://www.lang.duke.edu
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/


