From: Mark Davis ☕ (mark@macchiato.com)
Date: Wed Nov 10 2010 - 15:28:28 CST
Mark
*— Il meglio è l’inimico del bene —*
On Wed, Nov 10, 2010 at 12:38, Asmus Freytag <asmusf@ix.netcom.com> wrote:
> If you want to get that point across to a general audience, you could use a
> more colloquial term, albeit one that itself derives from mathematics.
>
> Text that can be completely expressed in ASCII fits into something
> (ASCII) that works as a "lowest common denominator" of a large number of
> character sets.
>
> You could call it "lowest common denominator" text.
>
> Since ASCII is the only set that exhibits such a lowest common denominator
> relationship with enough other sets to make it interesting, and since that
> relation is so well known, it's usually enough to just refer to it by name
> (ASCII) without needing a general term - except perhaps for general
> audiences that aren't very familiar with it.
>
That is actually not the case. There are superset relations among some of
the CJK character sets, and also -- practically speaking -- between some of
the Windows and ISO-8859 sets. I say practically speaking because in general
environments the C1 controls are effectively unused, so where a non-ISO-8859
set is the same except for 80..9F, you can treat it pragmatically as a
superset.
The 'almost' supersets, where only a few characters differ, are also tricky.
Those definitely cause problems because the difference in the data is almost
undetectable.
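
To make the 80..9F point concrete, here is a minimal Python sketch (not part
of the original exchange). It assumes Python's built-in codecs are an
acceptable stand-in for the character sets under discussion, and the helper
name differing_bytes is hypothetical. It lists the byte values that two
single-byte encodings decode differently, treating a byte that one codec
leaves undefined as a difference.

```python
def differing_bytes(enc_a, enc_b):
    """Return the byte values (0..255) that enc_a and enc_b decode differently.

    A byte that one codec rejects and the other accepts counts as a difference.
    """
    def decode_one(value, encoding):
        try:
            return bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            return None  # byte is undefined in this codec

    return [
        value
        for value in range(256)
        if decode_one(value, enc_a) != decode_one(value, enc_b)
    ]


if __name__ == "__main__":
    # Windows-1252 versus ISO 8859-1: differences confined to the C1 range.
    print([hex(b) for b in differing_bytes("latin-1", "cp1252")])

    # ISO 8859-15 versus ISO 8859-1: an 'almost' superset case.
    print([hex(b) for b in differing_bytes("latin-1", "iso-8859-15")])

    # ASCII-only text has the same bytes in all three encodings.
    text = "lowest common denominator"
    print(text.encode("ascii") == text.encode("latin-1") == text.encode("utf-8"))
```

With these codecs, the first comparison should report only bytes in 80..9F,
while the second reports a handful of bytes scattered through A0..BF, which
is the kind of difference that is easy to miss in real data.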
>
> In these kinds of discussions I find it invariably useful to mention that
> the copyright sign is not part of ASCII. (I suspect that it's the most
> common character that makes a text lose its "lowest common denominator"
> status).
>
> A./
>
> On 11/10/2010 11:41 AM, Jim Monty wrote:
>
>> Here's a peculiar question.
>>
>> Is there a standard term to describe text that is in some subset CCS of
>> another CCS but, strictly speaking, is only really in the subset CCS
>> because it doesn't have any characters in it other than those represented
>> in the smaller CCS?
>>
>> (The fact that I struggled to phrase this question in a way that made my
>> meaning clear -- and failed -- is precisely my dilemma.)
>>
>> Text that has in it only characters that are in the ASCII character
>> encoding is also in the ISO 8859-1 character encoding and the UTF-8
>> character encoding form of the Unicode coded character set, right? I often
>> need to talk and write about text that has such multiple personalities,
>> but I invariably struggle to make my point clearly and succinctly. I wind
>> up describing the notion of it in awkwardly verbose detail.
>>
>> So I'm left wondering if the character encoding cognoscenti have a special
>> utilitarian word for this, maybe one borrowed from mathematics (set
>> theory).
>>
>> Jim Monty
This archive was generated by hypermail 2.1.5 : Wed Nov 10 2010 - 15:31:26 CST