Re: Chinese Windows and unicode

From: Peter_Constable@sil.org
Date: Wed Jun 12 2002 - 10:06:06 EDT


On 06/11/2002 05:41:30 PM Lars Marius Garshol wrote:

>>I was simply responding to Michael's blanket statement that
>>Unicode on Win9x/Me is basically not an option, which is not
>>strictly correct, [...]
>
>It isn't, but then he wasn't really saying that. He just said that the
>OS didn't support it, which is true.

Well, that's another blanket statement that overgeneralises. You need to
define what you mean by "support". It would be equally valid for someone
else to say that the OS *does* support it using some other definition. The
reality is that Win9x/Me provides limited support for Unicode that makes it
possible to do some things and not other things.

>You can of course do whatever you
>want in user space, which is basically your point.

That is not at all my point!

Let me clarify exactly what I mean:

The following Unicode-capable (BMP characters only) APIs are available on
Win9x/Me:

API                     Source      Win95   Win98

TextOutW                Win32 API   Yes     Yes
ExtTextOutW             Win32 API   Yes     Yes
GetCharWidthW           Win32 API   Yes     Yes
GetTextExtentPointW     Win32 API   Yes     Yes
GetTextExtentPoint32W   Win32 API   Yes     Yes
MessageBoxW             Win32 API   Yes     Yes
MessageBoxExW           Win32 API   Yes     Yes
lstrcpyW                Win32 API   No      Yes
lstrcatW                Win32 API   No      Yes
lstrlenW                Win32 API   No      Yes
wcs* functions          CRTL        Yes     Yes
Script* functions       Uniscribe   Yes#    Yes

# except Far East versions

(Source: Figure 1 from F. Avery Bishop, _Design a Single Unicode App that
Runs on Both Windows 98 and Windows 2000_, available at
http://www.microsoft.com/msj/0499/multilangUnicode/multilangUnicodetop.htm)

Also, the WM_UNICHAR message that was added after WinMe shipped can be used
(between consenting apps and input methods) without requiring any change to
the OS.
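
To make the "consenting apps" part concrete, here is a rough sketch of the
handshake as I understand it, modelled in Python rather than written as real
Win32 code. The two constants are the values from winuser.h; the function
on_message and the list it fills are hypothetical stand-ins for a window
procedure and its text buffer.

```python
WM_UNICHAR = 0x0109      # message identifier, as defined in winuser.h
UNICODE_NOCHAR = 0xFFFF  # probe value meaning "do you handle WM_UNICHAR?"


def on_message(msg, wparam, received):
    """Model of the WM_UNICHAR branch of a window procedure.

    Hypothetical helper for illustration only. `received` is a list that
    collects the characters delivered to the app.
    """
    if msg != WM_UNICHAR:
        # Not handled here; a real app would pass this to DefWindowProc.
        return None
    if wparam == UNICODE_NOCHAR:
        # The sender is asking whether this window understands WM_UNICHAR.
        return True
    # Otherwise wparam carries one character as a UTF-32 code point.
    received.append(chr(wparam))
    return False


text = []
supported = on_message(WM_UNICHAR, UNICODE_NOCHAR, text)  # probe first
if supported:
    on_message(WM_UNICHAR, 0x0531, text)  # ARMENIAN CAPITAL LETTER AYB
print(supported, [f"U+{ord(c):04X}" for c in text])
```

An input method that gets no positive answer to the probe would have to fall
back to ordinary ANSI character messages.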

These things mean that Win9x/Me is capable of at least the following:

- displaying any BMP codepoint (when using TextOutW or ExtTextOutW)

- complex-script rendering for certain scripts (exact support depends on
the localised version of Windows used and on whether the Uniscribe
processor has been installed; it is not installed by Win95 or Win98, and
it is not available at all on Far East versions)

- allowing an app to convert keyboard input into Unicode, provided all of
the characters from a given input method fit within a single codepage (the
app receives an ANSI character and has to convert)

- allowing an app to accept *any* Unicode character from a third-party
input method (using WM_UNICHAR)
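
The codepage constraint on keyboard input can be illustrated with a small
sketch. It is written in Python using its built-in Windows codepage codecs;
on Windows itself the corresponding conversion would normally go through
MultiByteToWideChar. The byte value 0xE1 and the layout-to-codepage mapping
are assumptions chosen for the example, not something taken from a real
input method.

```python
# One keystroke as delivered in an ANSI character message.
# 0xE1 is an arbitrary example value chosen for this sketch.
ansi_byte = bytes([0xE1])

# Hypothetical mapping from keyboard layout to its Windows codepage.
layout_codepage = {
    "Western": "cp1252",
    "Russian": "cp1251",
    "Greek": "cp1253",
}


def ansi_to_unicode(data, codepage):
    """Decode ANSI input using the codepage of the active input method."""
    return data.decode(codepage)


for layout, cp in layout_codepage.items():
    ch = ansi_to_unicode(ansi_byte, cp)
    print(f"{layout:8} {cp}: U+{ord(ch):04X}")
```

The same byte yields a different Unicode character under each codepage, so
the app must know which codepage the input method uses, and an input method
whose characters span several codepages cannot be handled this way.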

Of course, there are *lots* of things involving Unicode that cannot be done
on Win9x/Me, and getting apps to work with Unicode inevitably requires at
least some extra work, possibly a lot of extra work (but, as mentioned
earlier, using MSLU can provide a lot of benefits). Also, it must be noted that the
shell, file system and console in Win9x/Me do not provide any support for
Unicode. So, for instance, you can't create files with Armenian-script
names on Win9x/Me, and you really don't want to transfer files with lovely
Devanagari-script names created on Win2K to a Win9x machine over a LAN.
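
A quick way to see why such filenames are a problem, assuming that a name
must fit in a single Windows codepage: the sketch below (Python, using its
built-in codecs) asks which codepages can represent a given name. The list of
codepages and the sample names are my own choices for illustration, and the
"?" substitution shown at the end is Python's behaviour, not a claim about
what Win9x does with such a file.

```python
# Windows codepages to try: Thai, the Far East DBCS pages, and cp1250-1258.
CODE_PAGES = ["cp874", "cp932", "cp936", "cp949", "cp950"] + [
    f"cp{n}" for n in range(1250, 1259)
]


def code_pages_for(name):
    """Return the codepages from CODE_PAGES that can encode every character."""
    usable = []
    for cp in CODE_PAGES:
        try:
            name.encode(cp)
        except UnicodeEncodeError:
            continue
        usable.append(cp)
    return usable


# Sample filenames (illustrative): Armenian, Devanagari and Greek words.
armenian = "\u0540\u0561\u0575\u0565\u0580\u0565\u0576.txt"
devanagari = "\u0928\u092e\u0938\u094d\u0924\u0947.txt"
greek = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac.txt"

for label, name in [("Armenian", armenian), ("Devanagari", devanagari),
                    ("Greek", greek)]:
    print(label, code_pages_for(name))

# Forcing the Devanagari name into cp1252 loses every letter.
print(devanagari.encode("cp1252", errors="replace"))
```

Neither the Armenian nor the Devanagari name fits any of the listed
codepages, whereas the Greek name fits cp1253.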

The summary is that, if you want to work with Unicode data to any great
extent, you are much better off going with Win2K/XP, and if you want
Unicode filenames, you must switch to WinNT/2K/XP. But, if you want to work
with Unicode data on Win9x/Me and don't mind the constraints, it is
possible within certain limits. As Tex pointed out, you need to specify the
need and compare that with the actual capabilities and limits of Win9x/Me.

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>


