From: Mark Davis (firstname.lastname@example.org)
Date: Sat Feb 17 2007 - 16:55:28 CST
At least for English speakers, I've found a strong anecdotal correlation
between those who say UCS or ISO 10646 and those who say "octet" instead of
"byte".
*73,600,000* for *byte*
*7,650,000* for *octet*
As with your case, the problem is separating out the non-computer usage.
On 2/17/07, Asmus Freytag <email@example.com> wrote:
> On 2/17/2007 9:58 AM, Don Osborn wrote:
> > Does anyone currently use the term "Universal Character Set" (UCS) to
> > refer to Unicode/ISO-10646? I guess it is technically correct, but I
> > rarely see it. It seems that folks generally use "Unicode" as the
> > catch-all term, or maybe I'm missing a wider use of UCS?
> I believe your observation about "Unicode" being the common label is to
> the point. A bit of research is illuminating and might explain some of
> the reasons why the term has caught on.
> There are about 33 million pages indexed on Google that can be retrieved
> by a search for "Unicode" and about 111,000 by a search for "Universal
> character set". If you subtract all pages that mention 10646 or Unicode
> or UCS that number drops to 1/10th for the latter. If you similarly
> subtract the other terms from the search for Unicode, there's hardly a
> reduction in number.
> What that means is that "universal character set" is probably most often
> used as a descriptor, as in "Unicode is a universal character set", and
> not as a label. The common label is clearly "Unicode". That's not
> surprising, because Unicode as a label has the advantage of being
> shorter and clearly referring to a specific character set.
> In the case of UCS as a label, you run into the problem that the letters
> UCS are not unique. Google will pull up the Union of Concerned
> Scientists, UCS Inc., University College School and a number of others
> on the first screen (and also helpfully suggest that you really meant
> USC). Trading non-distinctiveness for brevity is apparently not a clear
> win - and the use of UCS (in all meanings) is barely 1/6th of the one
> for Unicode. If you search for UCS together with 10646 or Unicode to
> sift out when UCS might have been used in the context of character sets,
> you find only about 800K links, which only emphasizes the issue with the
> multiple meanings of UCS.
> 10646 by itself gives about 4.5 million hits, of which fully 1/3 don't
> mention ISO, but are in reference to part numbers or are otherwise false
> positives--based on that you can conclude that 10646 is used as a
> designator of the character set about 1/10th as often as Unicode.
> There are instances where referring to Unicode is the only correct
> choice. For example, when referring to Unicode Normalization Forms,
> Unicode Bidi Algorithm, Unicode Line Breaking, and the myriad other
> specifications that have been developed or are being developed around
> the character set and collection of character properties by the Unicode