RE: UTF8 vs. Unicode (UTF16) in code

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Mon Mar 12 2001 - 04:50:10 EST


Thomas Chan wrote:
> Look at DUTR #27[1] (2001.2.23), section 10.1, [...]

Thanks for all those pointers and explanations!

> How about the case of a retailer who needs to deal with parts for
> elevators and needs U+282E2, lip 'elevator'? Or neckties, requiring
> U+27639, taai 'tie'.

I am not seeking excuses to not implement UTF-16 -- rather examples of
characters that *do* justify it.

And all your examples are perfectly valid: it would be crazy to tell users:
"Sorry: because of software limitations, you cannot order ties or
elevators".
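
As a rough illustration of what handling these two characters means in code,
here is a minimal Python sketch of the standard UTF-16 surrogate-pair
arithmetic. The function name to_surrogates is hypothetical, chosen only for
this example; it is not taken from any library or from this thread.

```python
def to_surrogates(cp):
    """Split a supplementary-plane code point into a UTF-16 surrogate pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = cp - 0x10000              # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits -> low (trail) surrogate
    return high, low


# The two Plane 2 characters quoted above.
for cp in (0x282E2, 0x27639):
    high, low = to_surrogates(cp)
    print("U+%05X -> %04X %04X" % (cp, high, low))
```

With this arithmetic, U+282E2 becomes D860 DEE2 and U+27639 becomes D85D DE39.
Code that assumes one 16-bit unit per character would see each of them as two
separate, meaningless units.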

<OT>
Out of curiosity, are these loanwords from English? Or is it just a
coincidence that they sound like "lift" and "tie"?
</OT>

> print someone's name correctly!) If someone or some place's
> name happens to require a character from Plane 2, what're
> you going to do?

This is another valid reason. And John Jenkins explained in another mail
that Japanese also has proper names in Plane 2.

> I was wondering earlier what kind of Cantonese messages would
> appear on a receipt or GUI. There is the issue that people
> who can read and write Cantonese are also diglossic in the
> mainstream standard written Chinese (based on Mandarin),
> which is understood by all schooled Chinese.

OK, maybe it was a poor example.

But it could happen. Consider the example of Spain: languages that were
considered "vernaculars" under the past fascist regime, just a few years
ago, are now official languages used in all public and private
communication.

However, I guess that Cantonese speakers might use dialectal terms (like
"lip" and "taai" above) even when writing in literary Mandarin. And
certainly they would not Mandarinize proper names.

> Pentagrams? I haven't seen those... where are they?

Hmmm... This is possibly an Italian word badly Anglicized. I just meant
"musical notation".

_ Marco


