Re: Is there a term for strictly-just-this-encoding-and-not-really-that-encoding?

From: Kenneth Whistler
Date: Wed Nov 10 2010 - 16:53:32 CST

    Mark Davis wrote:

    > What are also tricky are the 'almost' supersets, where there are only a few
    > different characters. Those definitely cause problems because the difference
    > in data is almost undetectable.

    Mark is referring to cases such as ISO 8859-1 and ISO 8859-15.

    Those two share all the same encoded characters except at eight
    code points: 0xA4, 0xA6, 0xA8, 0xB4, 0xB8, and 0xBC..0xBE.

    So neither of the repertoires is a proper subset of the other,
    but the two coded character sets share the vast majority
    of their characters, including almost all of the common ones.
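
    The overlap described above can be checked mechanically. What follows
    is only a minimal sketch, assuming Python 3 and its bundled
    "iso8859-1" and "iso8859-15" codecs; the helper names differing_bytes
    and repertoire are made up for this illustration and do not come from
    any standard or from the original discussion.

```python
def differing_bytes(enc_a="iso8859-1", enc_b="iso8859-15"):
    """Return the byte values that the two single-byte codecs decode differently."""
    return [b for b in range(256)
            if bytes([b]).decode(enc_a) != bytes([b]).decode(enc_b)]


def repertoire(enc):
    """Return the set of characters reachable from any single byte in `enc`."""
    return {bytes([b]).decode(enc) for b in range(256)}


diff = differing_bytes()
print([f"0x{b:02X}" for b in diff])

r1 = repertoire("iso8859-1")
r15 = repertoire("iso8859-15")

# Each repertoire contains characters the other lacks,
# so neither is a subset of the other.
print(sorted(r1 - r15))
print(sorted(r15 - r1))
print(r1 <= r15, r15 <= r1)
```

    Text that never uses one of the bytes in diff decodes identically
    under both codecs, which is why the difference is so hard to spot
    in real data.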


