Re: Code charts and code points

From: Asmus Freytag <asmusf_at_ix.netcom.com>
Date: Fri, 24 Oct 2014 09:28:59 -0700

On 10/24/2014 9:01 AM, Michel Suignard wrote:
> I know for a fact (because I did it and just verified) that the font used for those codes uses the real UCS code. The conversion happens in the PDF embedding magic. I could look into it, but I have no easy way to debug the Adobe Distiller path here. Apparently, when you get off the beaten path for new characters, the preservation of code points in copy-and-paste operations is not bulletproof.

And this is presumably true in general, and the code substitutions would
then be "random", meaning that they do not establish an alternate
encoding for exchange purposes. That is different from releasing
ASCII-hacked or PUA fonts directly, because they do establish alternate
encodings and documents in them can be exchanged if viewed with the same
fonts.
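
A minimal sketch of how one might check pasted chart text for such
substitutions, assuming the text was copied from the Ornamental Dingbats
chart (block range U+1F650..U+1F67F). The helper names `describe` and
`outside_range` and the sample string are hypothetical illustrations, not
part of any existing tool:

```python
import unicodedata

def describe(text):
    """Return (code point label, character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

def outside_range(text, first, last):
    """Return the non-space characters whose code points fall outside first..last."""
    return [ch for ch in text if not ch.isspace() and not first <= ord(ch) <= last]

# Hypothetical paste: one character kept its code point, two came through
# as ASCII-range code points instead.
pasted = "\U0001F67A/\\"

# Ornamental Dingbats block: U+1F650..U+1F67F
strays = outside_range(pasted, 0x1F650, 0x1F67F)
for label, name in describe(strays):
    print(label, name)
```

Anything reported here left the PDF with a code point other than the one
shown in the chart, so it should not be trusted as test data.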

A./
>
> Michel
>
> -----Original Message-----
> From: Unicode [mailto:unicode-bounces_at_unicode.org] On Behalf Of Jukka K. Korpela
> Sent: Friday, October 24, 2014 4:51 AM
> To: unicode_at_unicode.org
> Subject: Re: Code charts and code points
>
> 2014-10-24 11:17, "Martin J. Dürst" wrote:
>
>> The code charts are published as PDFs. In general, text in PDFs can be
>> copypasted elsewhere. Is there something in place that makes sure that
>> "wrong" Unicode encodings for glyphs published in code charts don't
>> leak elsewhere?
> It seems that there isn’t. Whether this is serious is a different issue.
>
> I tested with the arbitrarily chosen Ornamental Dingbats block, with the chart http://www.unicode.org/charts/PDF/Unicode-7.0/U70-1F650.pdf
> Opening it in Adobe Reader XI on Win 7, I was able to select the characters with the mouse and copy and paste them to a text editor, BabelPad. It shows most of them as just boxes, identified with the correct Unicode numbers; this is the expected behavior when the editor has no suitable font at its disposal. But instead of U+1F67C VERY HEAVY SOLIDUS and U+1F67D VERY HEAVY REVERSE SOLIDUS, it shows “/” and “\”, identified as U+002F SOLIDUS and U+005C REVERSE SOLIDUS.
>
> So apparently the font designer had placed the glyphs as assigned to SOLIDUS and REVERSE SOLIDUS, which is understandable. But this means that when the characters in the code charts are copied and pasted, or otherwise accessed at the character level, they are wrong characters.
>
> I think it is imaginable that someone wants to copy a block of characters from the code charts, as a handy way of getting them for inspection, e.g. for testing how some particular software renders them using some particular font(s). I would expect some confusion then if some of the characters you got were the wrong ones (wrong code points).
>
> Yucca
>
>
>
> _______________________________________________
> Unicode mailing list
> Unicode_at_unicode.org
> http://unicode.org/mailman/listinfo/unicode
>

Received on Fri Oct 24 2014 - 11:29:55 CDT
