Re: Handling obsolete codes

From: Mark Davis (mark_davis@taligent.com)
Date: Mon Apr 22 1996 - 14:04:48 EDT


Subject: RE>Handling obsolete codes    Time: 10:02    Date: 04/22/96

My recommendation would be to convert over as soon as possible. We will be
trying to leave those code-points empty for as long as possible, but the
sooner you start, the fewer problems you will have later. That is:

* convert the deprecated codes (1.0, 1.1) to 2.0 wherever possible,
  e.g. in memory on opening a document, on disk when saving versions, when
  setting database fields, etc.
* convert generation (e.g. keyboards) to use the new codes.
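
A minimal sketch of the first recommendation, in Python, assuming a lookup
table from deprecated code points to their replacements. The table name, the
function names, and the two sample entries are hypothetical: the entries simply
pair the codes in the order they are mentioned in the quoted message below and
have not been checked against the standard's own conversion tables.

```python
# Hypothetical mapping: deprecated code point -> current code point.
# The two entries are placeholders taken from the quoted message,
# not a verified conversion table.
DEPRECATED_TO_CURRENT = {
    0x0371: 0x0313,
    0x0372: 0x0314,
}


def upgrade(text, table=DEPRECATED_TO_CURRENT):
    """Return text with every deprecated code point replaced."""
    # str.translate accepts a dict keyed by code point.
    return text.translate(table)


def load_document(raw):
    # Convert in memory as soon as the document is opened.
    return upgrade(raw)


def save_document(text):
    # Convert again on the way out, so no deprecated code reaches disk.
    return upgrade(text)
```

Running the same function at both the load and save boundaries means text that
enters by another route (paste, import) is still converted before it is stored.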

This is a good topic for the Unicode Technical Committee, and I will
bring it up at the next meeting.

Mark

--------------------------------------
Date: 04/21/96 17:13
To: Mark Davis
From: RICK KUNST
What is the proper way for an application to handle obsolete codes
(say, Unicode 1.0 codes) when it opens a document produced by another
application?

We recently had a user of our UniEdit editor note problems she was
having in displaying Greek breathing marks in a text which had been
created using another application which supports Unicode. When I
examined the text, I discovered that it contained the invalid codes
U+0371, U+0372 and so on, which were combined in Unicode 1.1 with
U+0313, U+0314 etc. After we ran the document through a Unicode
1.0-to-Unicode 1.1 conversion filter, they were rendered properly.
But I am uncertain whether in general it is best (1) to retain the
original codes, as we did, and display them with an unknown-character
symbol (in our case a bullet); (2) to test automatically for such
invalid codes, display a warning (which we do not do), and convert
them to valid codes if possible upon opening the document; or (3) to
retain the original codes, but render them to the extent possible as
if they were valid codes.
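
The three alternatives differ only in what is stored and what is shown. A
rough Python sketch of that distinction follows; the `Policy` names, the
`open_document` function, and the two-entry `OBSOLETE` table are illustrative
assumptions, not UniEdit's actual behavior or a verified mapping.

```python
from enum import Enum


class Policy(Enum):
    KEEP_AND_MARK = 1      # alternative (1): keep codes, show a bullet
    WARN_AND_CONVERT = 2   # alternative (2): warn, then convert on open
    KEEP_AND_RENDER = 3    # alternative (3): keep codes, render as valid


# Placeholder table (obsolete -> valid); entries are assumptions.
OBSOLETE = {0x0371: 0x0313, 0x0372: 0x0314}
UNKNOWN_SYMBOL = "\u2022"  # bullet


def open_document(text, policy):
    """Return (stored_text, display_text, warnings) for a policy."""
    if policy is Policy.KEEP_AND_MARK:
        display = "".join(
            UNKNOWN_SYMBOL if ord(ch) in OBSOLETE else ch for ch in text
        )
        return text, display, []

    converted = text.translate(OBSOLETE)
    if policy is Policy.WARN_AND_CONVERT:
        warnings = [
            f"obsolete code U+{ord(ch):04X} at offset {i}"
            for i, ch in enumerate(text)
            if ord(ch) in OBSOLETE
        ]
        return converted, converted, warnings

    # KEEP_AND_RENDER: storage is untouched, only the display is mapped.
    return text, converted, []
```

Only alternative (2) changes the stored text; (1) and (3) leave the file as it
was and differ in how much they reveal to the user.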

This question becomes more important for us because at the moment we
have opted for a different solution for the transition to Unicode 2.0
for Korean Hangul. We use alternative (3) above for rendering Korean
Hangul regardless of whether they are in the Unicode 1.1 or Unicode
2.0 codepoints. That is, at the moment we display identically both
Unicode 1.1 Hangul characters and Unicode 2.0 Hangul characters and
the user has no way of knowing which he is seeing (unless he uses
Edit, Find). But for active typing in Korean we still only use
Unicode 1.1 throughout our applications--otherwise it would be even
more confusing.
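
Since both encodings display identically, one way to let a user see which is
present is to classify characters by code point range. The Python sketch below
assumes the Unicode 1.1 Hangul area spans U+3400 through U+4DFF and the Unicode
2.0 Hangul syllables span U+AC00 through U+D7A3; the function names are
hypothetical, and later versions of the standard assign other characters to
the vacated range, so the check is only meaningful for text of this era.

```python
# Assumed ranges; range() excludes its upper bound.
HANGUL_1_1 = range(0x3400, 0x4E00)  # Unicode 1.1 Hangul area
HANGUL_2_0 = range(0xAC00, 0xD7A4)  # Unicode 2.0 Hangul syllables


def hangul_version(ch):
    """Return "1.1", "2.0", or None for a single character."""
    cp = ord(ch)
    if cp in HANGUL_1_1:
        return "1.1"
    if cp in HANGUL_2_0:
        return "2.0"
    return None


def count_by_version(text):
    """Count how many characters fall in each Hangul area."""
    counts = {"1.1": 0, "2.0": 0}
    for ch in text:
        version = hangul_version(ch)
        if version is not None:
            counts[version] += 1
    return counts
```

A nonzero "1.1" count on opening a file could drive a status-line indicator or
the warning described in alternative (2).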

At some point we will switch over to actively typing Hangul using
Unicode 2.0 codepoints, and at that point there is much to be said
for handling Unicode 1.1-coded Hangul with alternative (2),
attempting to "upgrade" the user's files semi-automatically.

I welcome advice from others regarding this question.

Rick Kunst
 
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Humanities Computing Facility Tel. (919) 660-3194
015 Language Center - Box 90269 Fax: (919) 660-3191
Duke University E-mail rkunst@acpub.duke.edu
Durham, NC 27708 USA http://www.lang.duke.edu
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/



