Re: Double Byte enabled

From: Glen Perkins (Glen.Perkins@NativeGuide.com)
Date: Fri Apr 07 2000 - 05:06:23 EDT


It's both good and bad that people have the idea that "Unicode enabled" means "fully internationalized". Hopefully, the market can retain at least a bit of that misconception a little while longer. ;-)
 
Evidence of your point, Addison, is what happened with Java 1.0. Everyone who heard that Java was "Unicode based" seemed to assume that they could automatically display Japanese, for example. Boy, were they disappointed. As a marketer, you ought to think twice before you put "Unicode enabled" on an otherwise internationally challenged product, or tech support will hate you.

Just one small point of dissent: "...reasons to multibyte enable an application rather than Unicode enabling it..." It still seems to me that Unicode's various encodings -- those that cover the entire UCS -- are all "multibyte". I'm just not crazy about the terms "ANSI" or "multibyte" used to mean "not one of the Unicode encodings". Instead, I often use the term "legacy encoding" when I think I can get away with it. ;-)
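
To illustrate the point that the Unicode encodings are themselves "multibyte", here is a minimal Python sketch (not part of the original message) that counts the bytes used for a single character in three Unicode encoding forms. The helper name `byte_lengths` and the sample characters are assumptions chosen only for illustration.

```python
# Sketch: how many bytes one character takes in each Unicode encoding form.
# byte_lengths is a hypothetical helper, not something from the thread.

def byte_lengths(ch):
    """Return the encoded length in bytes of a single character."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # -be avoids counting a BOM
        "utf-32": len(ch.encode("utf-32-be")),
    }

# ASCII letter, accented Latin letter, CJK ideograph, supplementary-plane letter
for ch in ["A", "\u00e9", "\u65e5", "\U00010400"]:
    print(f"U+{ord(ch):04X}", byte_lengths(ch))
```

No character fits in one byte in every form, and UTF-8 and UTF-16 are both variable-width, which is why "multibyte" is a poor label for "not Unicode".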

__Glen__

----- Original Message -----
  From: Addison Phillips [GSC]
  To: Unicode List
  Sent: Thursday, April 06, 2000 2:38 PM
  Subject: Re: Double Byte enabled

  Well.... maybe...

  Two things bug me about this:

  1. Many developers unfamiliar with international issues conflate
  "double-byte enabling" and/or Unicode with internationalization. If you are
  talking strictly about the character handling aspects of I18N, which are not
  entirely trivial, then, yes, it makes sense to talk about Unicode enabling
  and/or character set processing. But in my role as a humble I18N consultant,
  this is usually the first misconception that I have to break: not only does
  character != byte, but also Unicode != I18N (!!). There are many issues, such
  as collation, date/time, number formatting, message formatting,
  localization, etc., which Unicode does little or nothing to solve directly.

  You'd be better off advising that any code "be fully internationalized" and
  going on to define that a little, rather than merely "multibyte enabled" or
  "Unicode enabled". Part of the definition of "fully internationalized", of
  course, will be "Unicode enabling"...
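
The two inequalities in point 1 can be made concrete with a small Python sketch (an editorial illustration, not code from the thread). The helper names `sizes` and `code_point_sorted` are hypothetical, and the sample strings are assumptions. The first helper shows that character count and byte count differ; the second shows that sorting by Unicode code point is not the same thing as language-appropriate collation.

```python
# Sketch: character != byte, and Unicode != I18N.
# sizes and code_point_sorted are hypothetical helpers for illustration.

def sizes(text):
    """Compare the code point count with byte counts in two encodings."""
    return {
        "code points": len(text),
        "utf-8 bytes": len(text.encode("utf-8")),
        "shift_jis bytes": len(text.encode("shift_jis")),
    }

def code_point_sorted(words):
    """Sort by raw code point value, which is all plain Unicode gives you."""
    return sorted(words)

print(sizes("\u65e5\u672c\u8a9e"))  # the three-character word for "Japanese"

# A German reader would expect the word starting with A-umlaut to sort near
# "apple", but its code point (U+00C4) is higher than that of "z".
print(code_point_sorted(["zebra", "\u00c4pfel", "apple"]))
```

Getting the second list into an order a user would accept needs locale-aware collation rules, which sit on top of the character encoding rather than inside it.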

  2. Also, while Unicode enabling is a Very Good Thing, it isn't always the
  right solution. Unicode support solves a large number of international
  problems. But the right solution for a specific project may need to take
  into account other variables that point to multibyte enablement. I hate to
  speak in hypotheticals, especially because I'm the last person on Earth to
  want another non-Unicode-enabled application floating around out there, but
  there *are* valid development reasons to multibyte enable an application
  rather than Unicode enabling it. These reasons are almost always business
  related (time and money) rather than being programming or architectural
  issues (with unlimited time and money, Unicode is the right decision for a
  client/server system).

  One area where this is partially the case is the delivery of HTML.
  Limitations and implementations of Unicode in older browsers make it a
  less-ideal choice for delivery of Web pages in certain locales today. So a
  "Unicode enabled" web server still needs to be able to recognize and convert
  character sets appropriately, at least for a little while longer... and that
  may involve multibyte enabled code.
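
As a rough sketch of what "recognize and convert character sets" might look like for a server that holds its text as Unicode internally, the Python below (an editorial illustration, not from the thread) encodes an HTML fragment into the first charset it recognizes from a preference list. The function name `encode_html`, its parameters, and the fallback to UTF-8 are all assumptions. Characters the target charset cannot represent are emitted as numeric character references via Python's `xmlcharrefreplace` error handler.

```python
import codecs

# Sketch: deliver Unicode text in a legacy charset chosen per client.
# encode_html and its parameters are hypothetical names for illustration.

def encode_html(text, preferred_charsets, default="utf-8"):
    """Encode text in the first known charset; return (body, content_type)."""
    charset = default
    for name in preferred_charsets:
        try:
            codecs.lookup(name)  # raises LookupError for unknown charsets
        except LookupError:
            continue
        charset = name
        break
    # Unencodable characters become &#NNNN; references instead of errors.
    body = text.encode(charset, errors="xmlcharrefreplace")
    return body, f"text/html; charset={charset}"

body, content_type = encode_html("<p>\u65e5\u672c\u8a9e</p>", ["shift_jis"])
print(content_type, len(body))
```

A real server would also have to parse the client's request headers and handle byte-oriented input coming back from forms, which is where the multibyte-aware code in the paragraph above comes in.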

  So, while I completely agree with what everyone else on this thread has
  written, I think it's important to caveat it a little.

  thanks,

  Addison

  Addison P. Phillips
  Senior Globalization Consultant
  Global Sight Corporation
  Accelerating Global e-Business(TM)
  mailto:addison@globalsight.com
  ================================
  (+1) 408.350.3600 - Telephone
  http://www.globalsight.com
  ================================
  Going global with your web site? Global Sight provides Web-based software
  solutions that simplify the process, cut costs, and save time.

  ----- Original Message -----
  From: Suzanne Topping <stopping@rochester.rr.com>
  To: Unicode List <unicode@unicode.org>
  Sent: Thursday, April 06, 2000 10:27 AM
  Subject: Re: Double Byte enabled

> Thanks to Ken, Murray, John, Andrea, and all others for your excellent
> comments and summaries of this issue.
>
> It helped gel the situation for me, and of course I concluded that "Unicode
> enabled" is the best short description to use.
>
> ----- Original Message -----
> From: Kenneth Whistler <kenw@sybase.com>
> To: <stopping@rochester.rr.com>
> Cc: <kenw@sybase.com>; <unicode@unicode.org>
> Sent: Wednesday, April 05, 2000 9:39 PM
> Subject: Re: Double Byte enabled
>
>
> > Suzanne,
> >
> > > "Unicode enabled" is probably the clearest term, but I would appreciate
> > > comments on historic use of "multibyte enabled".
> >
> > "multibyte enabled" was simply the extension of the term "doublebyte enabled"
> > for Chinese character sets, which overspilled the bounds of a two-byte
> > encoding. EUC-CNS has 1-, 2-, 3-, and 4-byte forms for characters.
> > (See Ken Lunde's great book for details on all of this.)
>
>


