Two things bug me about this:
1. Many developers unfamiliar with international issues conflate
"double-byte enabling" and/or Unicode with internationalization. If you are
talking strictly about the character handling aspects of I18N, which are not
entirely trivial, then, yes, it makes sense to talk about Unicode enabling
and/or character set processing. But in my role as a humble I18N consultant,
this is usually the first misconception that I have to break: not only does
character != byte, but also Unicode != I18N (!!). There are many issues such
as collation, date/time, number formatting, message formatting,
localization, etc. which Unicode does little or nothing to solve directly.
You'd be better off advising that any code "be fully internationalized" and
then defining that a little, rather than merely "multibyte enabled" or
"Unicode enabled". Part of the definition of "fully internationalized", of
course, will be "Unicode enabling"...
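To make the two inequalities in point 1 concrete, here is a minimal Python sketch. It is an illustration only: the sample strings and variable names are assumptions chosen for the example, not anything from the thread. The first half shows that the same three characters occupy a different number of bytes in each encoding; the second half shows that having Unicode strings does not, by itself, give locale-appropriate collation.

```python
# character != byte: one string, three encodings, three byte counts.
text = "\u65e5\u672c\u8a9e"  # "Japanese language", three characters

char_count = len(text)  # counts code points, not bytes
byte_counts = {
    name: len(text.encode(name))
    for name in ("utf-8", "shift_jis", "utf-16-le")
}

# Unicode != I18N: a plain sort compares code points, so the word
# starting with U+00C4 (A with diaeresis) lands after "zebra".
# A German reader would expect it next to "apple"; getting that order
# needs locale-aware collation, which Unicode storage alone does not supply.
words = ["zebra", "\u00c4pfel", "apple"]
by_code_point = sorted(words)

print(char_count, byte_counts)
print(by_code_point)
```

The same gap exists for dates, numbers, and message formatting: they need locale data and formatting logic on top of the character encoding.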
2. Also, while Unicode enabling is a Very Good Thing, it isn't always the
right solution. Unicode support solves a large number of international
problems. But the right solution for a specific project may need to take
into account other variables that point to multibyte enablement. I hate to
speak in hypotheticals, especially because I'm the last person on Earth to
want another non-Unicode-enabled application floating around out there, but
there *are* valid development reasons to multibyte enable an application
rather than Unicode enabling it. These reasons are almost always business
related (time and money) rather than being programming or architectural
issues (with unlimited time and money, Unicode is the right decision for a
One area where this is partially the case is the delivery of HTML.
Limitations and implementations of Unicode in older browsers make it a
less-ideal choice for delivery of Web pages in certain locales today. So a
"Unicode enabled" web server still needs to be able to recognize and convert
character sets appropriately, at least for a little while longer... and that
may involve multibyte enabled code.
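As a rough sketch of what "recognize and convert character sets" could look like, the Python below keeps text as Unicode internally and transcodes only at the edge. The function names `decode_request` and `encode_response`, and the choice of UTF-8 as the fallback, are assumptions made for this illustration; how the client's charset is actually detected is left out.

```python
def decode_request(raw, charset, fallback="utf-8"):
    """Turn incoming bytes into a Unicode string (hypothetical helper)."""
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset label or malformed bytes: decode with the
        # fallback and substitute U+FFFD for anything undecodable.
        return raw.decode(fallback, errors="replace")


def encode_response(text, charset, fallback="utf-8"):
    """Encode a Unicode string for the client (hypothetical helper).

    Returns (body_bytes, charset_used) so the caller can label the
    response with the charset that was really applied.
    """
    try:
        return text.encode(charset), charset
    except (LookupError, UnicodeEncodeError):
        # The legacy charset is unknown or cannot represent the text.
        return text.encode(fallback), fallback


page = "\u65e5\u672c\u8a9e"  # Unicode inside the application
body, used = encode_response(page, "shift_jis")
round_trip = decode_request(body, used)
latin_body, latin_used = encode_response(page, "iso-8859-1")
```

Returning the charset actually used matters because a legacy charset may be unable to represent the text, in which case the response must be labeled with the fallback instead.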
So, while I completely agree with what everyone else on this thread has
written, I think it's important to caveat it a little.
Addison P. Phillips
Senior Globalization Consultant
Global Sight Corporation
Accelerating Global e-Business(TM)
(+1) 408.350.3600 - Telephone
----- Original Message -----
From: Suzanne Topping <email@example.com>
To: Unicode List <firstname.lastname@example.org>
Sent: Thursday, April 06, 2000 10:27 AM
Subject: Re: Double Byte enabled
> Thanks to Ken, Murray, John, Andrea, and all others for your excellent
> comments and summaries of this issue.
> It helped gel the situation for me, and of course concluded that "Unicode
> enabled" is the best short description to use.
> ----- Original Message -----
> From: Kenneth Whistler <email@example.com>
> To: <firstname.lastname@example.org>
> Cc: <email@example.com>; <firstname.lastname@example.org>
> Sent: Wednesday, April 05, 2000 9:39 PM
> Subject: Re: Double Byte enabled
> > Suzanne,
> > > "Unicode enabled" is probably the clearest term, but I would
> > > comments on historic use of "multibyte enabled".
> > "multibyte enabled" was simply the extension of the term "doublebyte
> > for Chinese character sets, which overspilled the bounds of a two-byte
> > encoding. EUC-CNS has 1-, 2-, 3-, and 4-byte forms for characters.
> > (See Ken Lunde's great book for details on all of this.)