I can think of a few websites that mix legacy-encoded content within a UTF-8
document, often as a matter of practicality.
Alternatively, some mix Unicode and pseudo-Unicode in the same document.
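
A rough sketch in Python of how a reader might cope with such a page. It
assumes that each line is in a single encoding and that the legacy encoding
is Windows-1252; the function name decode_mixed and the choice of cp1252
are mine, for illustration only, not something taken from a real site.

```python
def decode_mixed(raw: bytes, legacy: str = "cp1252") -> str:
    """Decode line by line: try UTF-8 first, fall back to a legacy encoding."""
    out = []
    for line in raw.split(b"\n"):
        try:
            out.append(line.decode("utf-8"))
        except UnicodeDecodeError:
            # Not valid UTF-8, so guess that the line is legacy-encoded
            # (an assumption; real pages may need smarter detection).
            out.append(line.decode(legacy, errors="replace"))
    return "\n".join(out)


# First line is UTF-8, second line is Windows-1252.
page = "naïve".encode("utf-8") + b"\n" + "café".encode("cp1252")
print(decode_mixed(page))
```

The fallback is only a heuristic: some legacy byte sequences happen to be
valid UTF-8 and would be decoded the wrong way without any error.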
Andrew
On 30/08/2013 11:14 PM, "Ilya Zakharevich" <nospam-abuse_at_ilyaz.org> wrote:
> On Wed, Aug 28, 2013 at 07:07:23PM +0000, Costello, Roger L. wrote:
>
> > For example, can some text be encoded as UTF-8 while other text is
> > encoded as UTF-16 - within the same document?
>
> I think it is a very interesting question.  A Perl program is
> (obviously) a text document.  On the other hand, in two minutes I
> could deduce a few ways to mix many different encodings into the same
> document.  My current record is 5 different encodings; some of them
> are arbitrary, some of them should satisfy certain compatibility
> requirements (something like
>  =cut CR
> and
>  =pod CR
> being encoded the same in two encodings).  And, on top of this, is yet
> another way to mix encodings arbitrarily.
>
> The tricks are threefold:
>
>     ◌ First, a Perl program is actually a mixture of 3 different
>       documents: the program stream, the data-for-the-program stream,
>       and the documentation stream.  There are certain rules for
>       interleaving them (except for DATA which should be at the end!),
>       and there are documented ways to specify the encodings of the
>       streams.
>
>     ◌ Second, the string and regular-expression literals are
>       “interpreted” by the lexer: there is a way for the program to
>       specify a way to “massage” the literals before they are handed
>       to the interpreter.  This gives yet other ways to have strings
>       and/or regular expressions in a different encoding.  (Note
>       that this may lead to “doubly encoded” phenomena if the
>       “ambient” encoding is not “raw”.)
>
>     ◌ Third, there is a way to switch the encoding of a Perl program
>       on the fly (at the end-of-line of current encoding).
>
> To be honest, I should have better tested all this before
> posting — but I did not.  On the practical side, how is this useful?
> Having different encodings for DATA and the program, and/or for
> documentation and the program, may be quite widely used.  The other
> hacks may have been used at least in the (enormous!) Perl test suite.
>
> Ilya
>
>
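
The compatibility requirement mentioned in the quoted message (a marker
such as "=cut" being encoded the same in two encodings) can be illustrated
with a small Python sketch. This is not how Perl itself processes its
streams; the helper split_and_decode, the sample text, and the use of a
newline as the line terminator are assumptions made for the example.

```python
MARKER = "=cut\n"

# ASCII-compatible encodings give the marker identical bytes ...
assert MARKER.encode("utf-8") == MARKER.encode("latin-1") == MARKER.encode("koi8-r")
# ... while UTF-16 does not, so it could not share this marker.
assert MARKER.encode("utf-16-le") != MARKER.encode("utf-8")


def split_and_decode(raw, encodings):
    """Split on the shared marker bytes, then decode each part with its own encoding."""
    parts = raw.split(MARKER.encode("ascii"))
    return [part.decode(enc) for part, enc in zip(parts, encodings)]


# One byte stream: a KOI8-R section, the marker, then a UTF-8 section.
doc = "привет\n".encode("koi8-r") + MARKER.encode("ascii") + "grüße\n".encode("utf-8")
print(split_and_decode(doc, ["koi8-r", "utf-8"]))
```

The split is safe here because both encodings represent non-ASCII
characters with bytes of 0x80 and above, so the marker bytes cannot appear
by accident inside encoded text.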
Received on Fri Aug 30 2013 - 17:44:32 CDT