From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Nov 21 2004 - 16:23:46 CST
From: "Doug Ewell" <dewell@adelphia.net>
> Cryptically naming these two CSS classes ".he" and ".heb", which
> provides no indication of which is the Unicode encoding and which is the
> Latin-1 hack, merely makes a bad suggestion worse.
It was not cryptic: "he" was meant for Hebrew (generic, properly
Unicode-encoded, suitable for any modern Hebrew), and "heb" for Biblical
Hebrew, where a legacy encoding may still be needed in the absence of workable
Unicode support for now. That won't be the same language, however, so a
change of encoding may be justified. I was not advocating mixing
encodings within the same text for the same language...
But I was nearly sure that technical jargon in Hebrew would probably not
need Biblical Hebrew, except for illustration purposes within small delimited
block quotes or spans, where there will be simultaneous changes of:
- language level
- needed character set, some characters not being encodable with Unicode
- encoding (from Unicode to the Latin-1 override hack)
- font, a specific one being needed to render the legacy encoding.
In that case, it is acceptable to have the general text in modern Hebrew
properly coded with Unicode, even if the small illustrative quotes remain
entirely in a non-standard mapping and won't appear correctly without the
necessary font.
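
To make the trade-off concrete, here is a minimal Python sketch of what the
Latin-1 override hack does to the stored text. The mapping table is invented
for illustration only (it is not the mapping of any real legacy Hebrew font),
and render() is a hypothetical stand-in for what a browser does with or
without the special font installed.

```python
# Hypothetical legacy font: it draws Hebrew glyphs at Latin-1 code points.
# These three pairs are invented for illustration, not a real font mapping.
LEGACY_FONT_GLYPHS = {
    "a": "\u05D0",  # the font draws ALEF where Latin-1 has "a"
    "b": "\u05D1",  # BET
    "g": "\u05D2",  # GIMEL
}


def render(stored_text, font_installed):
    """Return what the reader sees for text stored with the override hack."""
    if not font_installed:
        # Without the font, the stored Latin-1 characters show through as-is.
        return stored_text
    # With the font, each stored code point is drawn as its Hebrew glyph.
    return "".join(LEGACY_FONT_GLYPHS.get(ch, ch) for ch in stored_text)


stored = "abg"                 # what the document really contains
print(render(stored, True))    # Hebrew glyphs, but only visually
print(render(stored, False))   # plain Latin letters
```

The stored characters never change; only the glyphs do. That is why such
quotes should stay confined to small, clearly delimited spans.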
Note that PDF files DO mix encodings within the embedded fonts that PDF
writers dynamically create for only the necessary glyphs. These encodings
are specific to the document, for each embedded font... This is why PDF
files can encode text that still doesn't have Unicode character mappings. You
can see this when you attempt to copy/paste text fragments from sections of
PDF files that use embedded fonts: the pasted text will not reproduce the same
characters as what you can see in the PDF reader. Copy/pasting works, however,
for PDF files using external fonts with standard mappings.
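
The copy/paste effect can be simulated with a short Python sketch. It is a
simplified model, not the real PDF font machinery: all function names are
hypothetical, codes are simply assigned in order of first use, and the
"copying" side is assumed to have no reverse mapping back to Unicode, so it
falls back to reading the codes as Latin-1.

```python
def build_subset_encoding(text):
    """Assign document-specific codes 1, 2, 3... to glyphs in order of first use."""
    code_for_glyph = {}
    for ch in text:
        if ch not in code_for_glyph:
            code_for_glyph[ch] = len(code_for_glyph) + 1
    return code_for_glyph


def encode(text, code_for_glyph):
    """Store the text as the private codes of the embedded subset font."""
    return bytes(code_for_glyph[ch] for ch in text)


def display(codes, code_for_glyph):
    """What the reader draws: each code selects a glyph in the embedded font."""
    glyph_for_code = {code: glyph for glyph, code in code_for_glyph.items()}
    return "".join(glyph_for_code[code] for code in codes)


def naive_copy(codes):
    """What gets pasted when the codes are read as Latin-1 (assumed fallback)."""
    return codes.decode("latin-1")


text = "\u05E9\u05DC\u05D5\u05DD"      # four Hebrew letters
encoding = build_subset_encoding(text)
codes = encode(text, encoding)

print(display(codes, encoding) == text)  # the page looks right
print(naive_copy(codes) == text)         # the pasted text does not match
```

The displayed glyphs are correct because the font and the codes travel
together inside the document, while the pasted characters are whatever those
private codes happen to mean in a standard encoding.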