RE: Unicode market acceptance

From: Edward Cherlin (edward.cherlin.sy.67@aya.yale.edu)
Date: Mon Mar 19 2001 - 05:05:36 EST


At 11:50 AM -0800 3/13/01, Hart, Edwin F. wrote:
>I believe that you are asking
>
>(1) when will most of the products be enabled for Unicode (I assume a fairly
>high implementation level, but not necessarily every script)?
>(2) when will most of the data that people use be encoded in Unicode?

Yes, two entirely different questions. Question 2 is easy: Not any
time soon, unless you reckon all ASCII data as UTF-8. However, if you
ask, when will multiscript data be mostly in Unicode, I think the
answer is, about another decade. Or if you ask, when will *new*
multiscript data be created in Unicode, that will be rather sooner.
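
The point about reckoning ASCII data as UTF-8 can be checked directly: any byte string made up only of 7-bit ASCII values is already valid UTF-8 and decodes to the same text, while legacy code-page data generally is not. A minimal sketch in Python follows; the helper names `is_ascii` and `is_valid_utf8` and the sample strings are illustrative assumptions, not anything from the original discussion.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(b < 0x80 for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample data for illustration only.
ascii_text = b"Plain ASCII text"
cp1252_text = "caf\u00e9".encode("cp1252")  # one byte, 0xE9, for the accented e
utf8_text = "caf\u00e9".encode("utf-8")     # two bytes, 0xC3 0xA9, for the same letter

for label, data in [("ascii", ascii_text),
                    ("cp1252", cp1252_text),
                    ("utf-8", utf8_text)]:
    print(label, is_ascii(data), is_valid_utf8(data))
```

Under this test, pure ASCII files count as Unicode data with no conversion at all, whereas files in a Windows code page with accented letters would still need transcoding.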

If that sounds too long, remember that the question about getting
past ASCII was raised by Xerox and IBM in 1981 for the U.S. market,
and again by the Macintosh in 1984, and that most Usenet postings and
HTML source files from the U.S. are still in ASCII, and most of the
rest in Windows code pages.

>In 1993, I speculated that initial products would emerge in the 1995-1996
>time frame and would be refined to handle more scripts and more Unicode
>features and have a high implementation level by 2000.
>
>Clearly, I was wrong. : )

I made similar predictions in a published study in 1994. I think we
were not too dreadfully far off. A few products using Unicode
internally and capable of reading and writing files in one or more
Unicode formats were emerging in 1995. Windows 2000 and Office 2000
support 11 script systems,

Latin
Greek
Cyrillic
Hebrew
Arabic
Thai
Chinese
Korean
Japanese
Devanagari
Tamil

fairly reasonably at the Unicode 2.0 level.

(Aside: I am particularly pleased that Access handles CJK data
decently (not perfectly, of course), and I have plans to try it in a
few other scripts. I have imported CJKXREF and much of Unihan into
Access, with added fields for characters, and for Cangjie data.
Pronunciations written in native scripts (Zhuyin, Hangeul, Kana) will
be next, when I get time to write some conversion routines.
Performance in Access is far better for sorts and lookups than in Word
tables (shudder) or Excel, and of course Access has queries besides.)

If a high level of Unicode support means rendering for all two dozen
scripts in Unicode 2.0, then obviously we haven't arrived. I believe
that the main remaining technical obstacle for Windows is good
support for complex shaping in several more Indic scripts, Tibetan,
and Lao.

Mac OS X supports even more scripts than Windows 2000, as I
understand it, although I haven't seen any specifics about commercial
application support. Presumably we can get details after the release
next week.

Decent Unicode 3.1 support for considerably more scripts should be
available in 2002. The ones with shaping issues are Mongolian,
Syriac, Myanmar, Sinhala, Khmer, and to a lesser extent Ethiopic.

>Ed Hart
>
>Edwin F. Hart
>edwin.hart@jhuapl.edu
>The Johns Hopkins University Applied Physics Laboratory
>11100 Johns Hopkins Road
>Laurel, MD 20723-6099
>USA
>+1-443-778-6926 (Baltimore area)
>+1-240-228-6926 (Washington, DC area)
>+1-443-778-1093 (fax)
>+1-240-228-1093 (fax)
>
>-----Original Message-----
>From: Suzanne M. Topping [mailto:stopping@bizwonk.com]
>Sent: Tuesday, March 13, 2001 13:36
>To: Unicode List
>Subject: RE: Unicode market acceptance
>
>
>
>
>> -----Original Message-----
>> From: Tex Texin [mailto:texin@progress.com]
>
>> We have estimates for (human) language usages on the web, it's too
>> bad there isn't an estimate for when Unicode will dominate.
>
>You would think that you could project out some rough timeline for when
>Unicode crosses over to be the standard mechanism, meaning that other
>methods of character support will become a pain in the tush to
>implement. Since it is the default for all the new and evolving
>technologies (XML, SOAP, UDDI etc.) and since virtually all platforms
>are moving toward using these technologies, there will have to be a
>rollover point where it takes more work to deal with other character
>sets. The .NET framework should be in place in 2002, and the other
>platforms are rolling right along with their own shifts. So what might
>be a reasonable timeline? When will virtually all users around the world
>use Unicode-enabled browsers? By 2003? And when will the new frameworks
>and platforms grow to widespread use?
>
>Anyone want to throw a dart?
>

-- 

Edward Cherlin
Generalist
"A knot!" exclaimed Alice. "Oh, do let me help to undo it."
Alice in Wonderland


