Re: Tag characters and in-line graphics (from Tag characters)

From: David Starner <prosfilaes_at_gmail.com>
Date: Wed, 03 Jun 2015 13:24:04 +0000

Chris wrote:
> There is no way to compare 2 HTML elements and know they are talking
> about the same character

That's because character identity is a hard problem. Is the emoji TIGER the
same as TONY THE TIGER or as TONY THE TIGER GIVING THE VICTORY SIGN?

http://www.engadget.com/2014/04/30/you-may-be-accidentally-sending-friends-a-hairy-heart-emoji/

Note that even in Unicode, the set ẛ ᷥ ſ ṡ s S Ŝ may be considered a single
character or as many as seven different characters, depending on whether
case folding, normalization, and accent dropping are applied.
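
To make the point about those seven characters concrete, here is a rough sketch using only Python's standard unicodedata module. The helper names (drop_accents, key, count_distinct) are made up for illustration, and the order of the folding steps is just one reasonable choice, not a standard algorithm. With these stock steps the combining long s (U+1DE5) never folds into plain "s", so the sketch bottoms out at two classes rather than one; merging it as well would take a custom rule.

```python
import unicodedata

# The seven characters from the message, written as escapes so the
# code points are unambiguous: U+1E9B, U+1DE5, U+017F, U+1E61, s, S, U+015C.
CHARS = ["\u1e9b", "\u1de5", "\u017f", "\u1e61", "s", "S", "\u015c"]

def drop_accents(text):
    """Decompose to NFD and remove characters with a nonzero combining class."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # A lone combining mark would vanish entirely; keep it so it stays comparable.
    return stripped or text

def key(text, *, accents=False, case=False):
    """Return a comparison key under the selected (hypothetical) folding policy."""
    text = unicodedata.normalize("NFC", text)
    if accents:
        text = drop_accents(text)
    if case:
        text = text.casefold()
    return text

def count_distinct(**options):
    """Count how many different keys the seven characters produce."""
    return len({key(c, **options) for c in CHARS})

if __name__ == "__main__":
    print("as encoded:           ", count_distinct())
    print("accents dropped:      ", count_distinct(accents=True))
    print("case folded:          ", count_distinct(case=True))
    print("accents + case folded:", count_distinct(accents=True, case=True))
```

Which of those counts is "right" depends entirely on what the comparison is for, which is the whole difficulty.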

> Similarly, there is no way to search or index html elements. If a HTML
> document contained an image of a particular custom character, there would
> be no way to ask google or whatever to find all the documents with that
> character. Different documents would represent it differently.

You can index links to images. If two documents represent it differently,
then I go back to the above; we can't know that they're the same thing.

On Tue, Jun 2, 2015 at 7:11 PM Chris <idou747_at_gmail.com> wrote:

> You can’t ask the entire computing universe to compress everything all the
> time.

Anytime we care about how much space text takes up, it should be
compressed. It compresses very well. On the other hand, it's rare that
anyone cares anymore; what's a few hundred kilobytes between friends?
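
For anyone who wants to check how well text compresses, here is a minimal sketch using Python's zlib module. The sample string and the function name compressed_sizes are only illustrative; real savings depend on the length and language of the text and on the compressor used, and very short strings compress much less well than whole documents.

```python
import zlib

# Illustrative sample only; any longer run of natural-language text will do.
SAMPLE = (
    "You can index links to images. If two documents represent it "
    "differently, then we cannot know that they are the same thing. "
    "Anytime we care about how much space text takes up, it should be "
    "compressed. It compresses very well. On the other hand, it is rare "
    "that anyone cares anymore; what is a few hundred kilobytes between "
    "friends? "
) * 4

def compressed_sizes(text, level=9):
    """Return (raw_bytes, compressed_bytes) for UTF-8 text under zlib."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, level)
    # Round-trip check: decompression must give back the original bytes.
    assert zlib.decompress(packed) == raw
    return len(raw), len(packed)

if __name__ == "__main__":
    raw_size, packed_size = compressed_sizes(SAMPLE)
    print(f"raw: {raw_size} bytes, compressed: {packed_size} bytes")
```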
Received on Wed Jun 03 2015 - 08:25:12 CDT
