From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Mon Jul 24 2006 - 03:24:42 CDT
On Sat, 22 Jul 2006, Curtis Clark wrote:
> I think it's a good idea.
The idea of creating a character that can be used as a marker for external
links is interesting, but there are many problems with it.
The description of existing practices shows that some web authors use
images of a certain type to mark external links as external. This does not
demonstrate existing usage of a _character_, since authors use rather
varying designs. Picking up just one class of designs seems premature.
Most web pages do not distinguish external links from internal links using
any markers.
The distinction between external and internal links can be important,
though often it's important just to site management, not to users, who
surf around the web and don't pay much attention to the "site" concept.
As some examples in the draft proposal indicate, the distinction is
often made on web pages as a formality, more or less: indicating a link as
external is a disclaimer of a kind, and mostly pointless. Sites that wish
to continue such practices will hardly be interested in using any
"standard character" as a marker, especially since font support for the
character will be virtually nonexistent for a fairly long time. (What
matters is support in fonts that web users have in their computers, and
such things change slowly, even if good fonts were available for free.)
The concept "external" is somewhat vague in this context. Apparently it
means "external to the current site", but what is a "site", really? If we
expect that the new character will become a widespread, or even "standard",
marker, it should perhaps have a more definite meaning.
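
To illustrate how much depends on the definition of "site", here is a
minimal Python sketch with two plausible tests of whether a link is
external. The function names, the example.com URLs, and the naive "last
two host labels" rule are all assumptions made for illustration; none of
them comes from the draft proposal.

```python
from urllib.parse import urljoin, urlsplit


def host_of(base_url, href):
    """Resolve href against base_url and return the host name ('' if none)."""
    return (urlsplit(urljoin(base_url, href)).hostname or "").lower()


def external_by_host(base_url, href):
    """Definition A: a link is external if its host differs from the page's."""
    return host_of(base_url, href) != host_of(base_url, "")


def external_by_domain(base_url, href):
    """Definition B: compare only the last two host labels (a naive rule)."""
    def tail(host):
        return ".".join(host.split(".")[-2:])
    return tail(host_of(base_url, href)) != tail(host_of(base_url, ""))
```

For a page at http://www.example.com/a/page.html, a link to
http://docs.example.com/x is external under definition A but internal
under definition B, so the same marker could mean different things on
different sites.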
The most important question, however, seems to be whether it is even
desirable for the internal vs. external distinction to be made at the
document level, and specifically at the character level within a document.
Using a marker - whether an image or a character - makes the distinction
part of the document's content in a manner that prescribes a particular
visual rendering. This is contrary to the modern structured and
device-independent approach. The distinction, if relevant, is primarily a
metadata issue, and an attribute (in hypertext markup, e.g. HTML or XML)
could be used for the purpose. This would leave it to user agents to
render the distinction in a manner suitable for a particular browsing
situation. If a visual marker is used, it would most appropriately be an
image specific to the browser, i.e. part of the browser's user interface.
Thereby, it would perhaps not be suitable to treat it as a _character_,
even though it may appear in the midst of text. (External links could also
be indicated by the use of colors, for example, or they might look similar
to other links until the cursor is moved over the link.)
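
The attribute-based alternative can be sketched as follows. This is only a
toy text-mode "user agent" in Python, under the assumption that the author
marks external links with rel="external"; that attribute value, the class
name TextRenderer, and the function render are illustrative choices, not
something the proposal or any specification discussed here prescribes. The
point is that the document carries only the metadata, and the rendering
(here a configurable suffix) is chosen by the agent.

```python
from html.parser import HTMLParser


class TextRenderer(HTMLParser):
    """Toy text-mode 'user agent' that decides how to show external links."""

    def __init__(self, external_suffix=" [external]"):
        super().__init__()
        self.external_suffix = external_suffix  # presentation is chosen here
        self.parts = []
        self._pending = []  # one flag per open <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Assumed convention: rel="external" marks an external link.
            rel = dict(attrs).get("rel") or ""
            self._pending.append("external" in rel.lower().split())

    def handle_endtag(self, tag):
        if tag == "a" and self._pending and self._pending.pop():
            self.parts.append(self.external_suffix)

    def handle_data(self, data):
        self.parts.append(data)


def render(html, external_suffix=" [external]"):
    """Return the text of html, with the agent's marker after external links."""
    renderer = TextRenderer(external_suffix)
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.parts)
```

Calling render(doc, "*") instead of the default marks the same links with
an asterisk; the document itself does not change.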
I think it is basically within the scope of the World Wide Web Consortium
to discuss whether a uniform, universal symbol is a desirable way to
indicate a link as external and whether the symbol should be part of a
document or part of a user agent's interface. Only after resolving that
could we adequately discuss whether that symbol should be encoded as a
character.
> One quibble: it is "web page" or "World Wide Web page", not "Internet page".
The difference between external and internal links can be relevant on
intranet pages, too, and in "standalone" documents in HTML, XML, Word,
PDF, and similar formats. By "standalone" I mean that the document is
primarily for offline viewing but might also be used in an environment
where external (web) links would work. In such situations, the indication
of external links can be much more important than in normal web usage.
Basically, the distinction relates to any data format where a link
(hyperlink) concept is meaningful and a link may refer to something
external.
> 1. It will take a while for such a character to find its way into ubiquitous
> fonts, so web developers will need to use the graphic for a while longer. I
> don't see this as an argument against; *without* the character, they will
> have to wait forever.
I'm afraid that if the character were introduced, it would only be used
by a small minority of web authors, among the minority that marks external
links as external in the first place. In effect, it would be yet another
(and rarely used) symbol acting as external link marker, rather than a
"standard" marker.
As an aside, I think the name EXTERNAL LINK would not be quite adequate.
The name would suggest that the character _is_ a link (as it might
actually be, though more often it would be either part of the link text or
adjacent to it). So EXTERNAL LINK MARKER or EXTERNAL LINK INDICATOR would
be more descriptive. Perhaps even HYPERLINK instead of LINK, since the
word "link" as such is rather polysemic.
> 2. A graphic can have alternate text, such as "external link" for users who
> can't view images.
We have the unfortunate situation that in HTML, an image can have
alternate text but there is no corresponding construct for a character.
There is no way of specifying that if a particular character cannot be
displayed (or otherwise rendered) by a user agent, then a particular
replacement string (which would presumably contain "safe" characters only)
be rendered instead. This is one reason why authors so often use images
for symbols that actually exist in Unicode as characters, such as simple
arrows.
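
What such a construct would amount to can be sketched in a few lines of
Python. No mechanism like this exists in HTML; the function
render_with_fallback, the can_display test, and the author-supplied
fallbacks table are all hypothetical and only show the missing behavior.

```python
def render_with_fallback(text, can_display, fallbacks):
    """Replace each character the agent cannot display with a fallback string.

    can_display: function taking one character, True if it can be rendered.
    fallbacks:   author-supplied mapping from character to replacement string.
    """
    out = []
    for ch in text:
        if can_display(ch):
            out.append(ch)
        else:
            # Use the author's replacement if given, otherwise a plain "?".
            out.append(fallbacks.get(ch, "?"))
    return "".join(out)
```

An agent limited to ASCII would then show "Next ->" for the text
"Next \u2192" when the author supplies "->" as the replacement for the
arrow, much as alternate text works for an image.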
> It will take a while for screen readers to be programmed
> to have a pronunciation of the new character (I'm not sure how JAWS, the
> commonest screen reader in the United States, deals with symbol characters).
That's really an understatement. I'm afraid that speech-based user agents
usually deal with a fairly limited character repertoire (such as Basic
Latin, i.e. ASCII, perhaps with Latin-1 Supplement or some other
addition). If they were expected to deal with a considerably wider
repertoire, then the only sensible approach, for most characters, would be
to spell out the name of the character. However, using the Unicode name as
such is not generally a good idea, though perhaps the only possible way at
present. First, we know that some of the names are misleading. Second, the
name should appear in the language of the document, so localized names
would be needed, for each language supported by the program. Third, even
the localized name might be misleading in a particular context, since it
would relate to the character in general, not to its particular usage.
(For newly introduced characters with fairly simple semantics, like the
proposed one, this wouldn't be much of a problem, but I wanted to point
out the general problem.)
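
The "spell out the name" approach, with a localized table consulted first,
might look like the following Python sketch. It uses the standard
unicodedata module for the formal Unicode names; the function spoken_name,
the LOCALIZED_NAMES table, and the Finnish entry in it are assumptions
invented for illustration, not data from any real screen reader.

```python
import unicodedata

# Hypothetical localized names; a real product would need a full table
# for each language it supports.
LOCALIZED_NAMES = {
    "fi": {"\u2192": "nuoli oikealle"},
}


def spoken_name(ch, lang="en"):
    """Return a string a speech-based agent could read aloud for ch."""
    localized = LOCALIZED_NAMES.get(lang, {}).get(ch)
    if localized is not None:
        return localized
    # Fall back to the formal Unicode name, which is always in English.
    name = unicodedata.name(ch, None)
    if name is None:
        # Unnamed code point (e.g. private use): read the code point itself.
        return "U+%04X" % ord(ch)
    return name.lower()
```

Note that the fallback is exactly the weak spot described above: for any
language without a table entry, the listener hears the English Unicode
name, whether or not it suits the document or the context.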
> But again, this would eventually happen, and during the period after the
> availability of fonts, and before updates to screen readers, web developers
> could use the "title" attribute to identify the character.
The "title" attribute is an unreliable method. Although many speech-based
user agents are able to read its value, they are typically configured not
to do that by default. On visual user agents, the "title" attribute does
not affect the normal rendering at all; its value may be displayed as a
"tooltip" on mouseover, so it _may_ solve the puzzle _if_ the user sees a
suitable symbol for a missing glyph or unrecognized character _and_ the
user suspects that moving the pointer over the mystery may reveal
something.
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/