RE: APL Under-bar Characters

From: <alexweiner_at_alexweiner.com>
Date: Mon, 17 Aug 2015 15:32:37 -0700
Hi Doug,

I think I am going to suggest that GNU APL use http://www.unicode.org/Public/UCD/latest/ucd/NamedSequences.txt
as previously suggested. It seems like it may provide a way for GNU APL to support characters with under-bars, and ease all our parsing problems.
-Alex
-------- Original Message --------
Subject: Re: APL Under-bar Characters
From: "Doug Ewell" <doug@ewellic.org>
Date: Mon, August 17, 2015 9:23 am
To: "Unicode Mailing List" <unicode@unicode.org>

<alexweiner at alexweiner dot com> wrote:

> I have heard that the problem was brought to Unicode consortium
> before, and the answer was to just use the underline styling, as it is
> apparently equivalent, but I do not think it is.

Combining character sequences are not "styling." Combining character
sequences are plain text. They are not the same as marking a letter or
word or paragraph in your word processor and clicking a button to make
that text bold or italic or underlined.

In layman's terms, each combining sequence (base character plus any
number of combining characters) should be treated as a unit, regardless
of whether the sequence has been assigned a name. So these sequences are
indeed equivalent to the APL-specific "underlined letter" characters
used in non-Unicode systems.
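
As a rough illustration of "treated as a unit," here is a minimal Python
sketch that groups each base character with the combining marks that follow
it. The function name combining_sequences is made up for this example, and
the check on the canonical combining class is a simplification: it is not
the full grapheme cluster segmentation defined by the Unicode Standard, only
enough to handle marks such as U+0332.

```python
import unicodedata


def combining_sequences(text):
    """Split text into units: one base character plus any combining marks after it.

    Simplified sketch: a character counts as "combining" here when its
    canonical combining class is non-zero, which covers U+0331 and U+0332.
    """
    units = []
    for ch in text:
        if units and unicodedata.combining(ch) != 0:
            # Attach the mark to the unit it modifies.
            units[-1] += ch
        else:
            # A base character (or a stray leading mark) starts a new unit.
            units.append(ch)
    return units


underbarred = "A\u0332B\u0332C\u0332"
print(len(underbarred))                       # number of code points
print(len(combining_sequences(underbarred)))  # number of units
```

A parser that walks over the units instead of the individual code points
would see one item per underlined letter, much as it would with a single
precomposed character.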

> Underline styling usually connects the line from one letter to another
> l̲i̲k̲e̲ ̲t̲h̲i̲s̲.̲ The under-bar characters do not do such connecting,
> and are actually only for capital letters. So it would look more like
> L̲ I̲ K̲ E̲ ̲ T̲ H̲ I̲ S̲ (I added the spaces for dramatic effect).

TUS 7.0, Section 7.9 does say:

> The characters U+0332 COMBINING LOW LINE, U+0333 COMBINING DOUBLE LOW
> LINE, U+0305 COMBINING OVERLINE, and U+033F COMBINING DOUBLE OVERLINE
> are intended to connect on the left and right.

In that case, despite the text in Section 22.7 that Ken quoted, it seems
that U+0331 COMBINING MACRON BELOW might be a better choice for APL
"underlined letters" than U+0332 COMBINING LOW LINE. Compare A̱ḆC̱
with A̲B̲C̲, noting that your font and rendering engine mileage may
vary.
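
To try the comparison locally, the following Python sketch builds both
variants and prints the character name that the unicodedata module reports
for each mark. The helper name underbar is hypothetical, and how the output
looks still depends entirely on the font and rendering engine.

```python
import unicodedata

MACRON_BELOW = "\u0331"  # the non-connecting candidate
LOW_LINE = "\u0332"      # documented as connecting left and right


def underbar(text, mark):
    """Append the given combining mark to every letter in text."""
    return "".join(ch + mark if ch.isalpha() else ch for ch in text)


for mark in (MACRON_BELOW, LOW_LINE):
    # Show the code point, its standard name, and a sample string.
    print(f"U+{ord(mark):04X}", unicodedata.name(mark), underbar("ABC", mark))
```

Either string remains plain text, so the choice between the two marks only
affects rendering, not how the sequences are stored or compared.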

"Voting again" to change one of the basic rules of Unicode, on the basis
that "perhaps feelings about the under-bar characters have changed since
then," is not expected to be an option, as David said.

> Then maybe we could work off that as a pseudo-standard?

Neither named nor unnamed character sequences are a "pseudo-standard."
Both are part of the Unicode Standard.

--
Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸



Received on Mon Aug 17 2015 - 17:34:04 CDT
