Re: Encoding italic

From: Mark E. Shoulson via Unicode <unicode_at_unicode.org>
Date: Wed, 23 Jan 2019 21:32:55 -0500

There is something deliciously simple, elegant... and kinda...
rebellious? about doing this.  And it wouldn't even be in the purview of
Unicode.  "Yep, my HTML-renderer treats characters E0020..E007F just
exactly the same as 0020..007F, 'cept that it won't render 'em."  And you
can send HTML text that looks for all the world like plain text to any
normal Unicode-conformant viewer.  Now, the security issues of being
able to write "invisible" JavaScript, or rather, Yet Another way you
need to look at and reveal possible code, are a headache for someone
else.  Viewed like this, you might do better taking this suggestion to
W3C and having them amend the HTML/XML specs so that E0020..E007F are
non-rendering synonyms for 0020..007F.  It wouldn't be a Unicode thing
anymore, just changing the definition of HTML.  (I'm not saying it would
be a GOOD idea, mind you.)
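
To make the mapping concrete, here is a rough Python sketch of the
"non-rendering synonym" idea, using Andrew's <i> and </i> example quoted
below.  The function names (to_tags, reveal_tags, strip_tags) are made up
for illustration and are not part of any spec or library.  The sketch
assumes only printable ASCII (0020..007E) gets mirrored, since E007F is
CANCEL TAG rather than a counterpart of DEL.

```python
TAG_OFFSET = 0xE0000  # tag characters mirror ASCII at this fixed offset


def to_tags(markup: str) -> str:
    """Map printable ASCII (U+0020..U+007E) onto tag characters."""
    out = []
    for ch in markup:
        cp = ord(ch)
        if not 0x20 <= cp <= 0x7E:
            raise ValueError(f"no tag counterpart assumed for U+{cp:04X}")
        out.append(chr(cp + TAG_OFFSET))
    return "".join(out)


def reveal_tags(text: str) -> str:
    """Turn tag characters back into visible ASCII, e.g. to audit hidden markup."""
    return "".join(
        chr(ord(ch) - TAG_OFFSET) if 0xE0020 <= ord(ch) <= 0xE007E else ch
        for ch in text
    )


def strip_tags(text: str) -> str:
    """Drop every character in E0020..E007F, as an unaware viewer would show it."""
    return "".join(ch for ch in text if not 0xE0020 <= ord(ch) <= 0xE007F)


# Hypothetical usage: an "italic" word carried inside plain text.
styled = "an " + to_tags("<i>") + "emphatic" + to_tags("</i>") + " word"
print([f"{ord(c):X}" for c in to_tags("<i>")])  # code points of the start tag
print(strip_tags(styled))
print(reveal_tags(styled))
```

Something like reveal_tags is also what you'd want for the "invisible
JavaScript" worry: a way to show what is hiding in the text.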

~mark

On 1/22/19 10:43 PM, James Kass via Unicode wrote:
>
> Nobody has really addressed Andrew West's suggestion about using the
> tag characters.
>
> It seems conformant, unobtrusive, requiring no official sanction, and
> could be supported by third-partiers in the absence of corporate
> interest if deemed desirable.
>
> One argument against it might be:  Whoa, that's just HTML.  Why not
> just use HTML?  SMH
>
> One argument for it might be:  Whoa, that's just HTML!  Most everybody
> already knows about HTML, so a simple subset of HTML would be
> recognizable.
>
> After revisiting the concept, it does seem elegant and workable. It
> would provide support for elements of writing in plain-text for anyone
> desiring it, enabling essential (or frivolous) preservation of
> editorial/authorial intentions in plain-text.
>
> Am I missing something?  (Please be kind if replying.)
>
> On 2019-01-20 10:35 AM, Andrew West wrote:
>
>> A possibility that I don't think has been mentioned so far would be to
>> use the existing tag characters (E0020..E007F). These are no longer
>> deprecated, and as they are used in emoji flag tag sequences, software
>> already needs to support them, and they should just be ignored by
>> software that does not support them. The advantages are that no new
>> characters need to be encoded, and they are flexible so that tag
>> sequences for start/end of italic, bold, fraktur, double-struck,
>> script, sans-serif styles could be defined. For example start and end
>> of italic styling could be defined as the tag sequences <i> and </i>
>> (E003C E0069 E003E and E003C E002F E0069 E003E).
>>
>> Andrew

Received on Wed Jan 23 2019 - 20:33:10 CST