Re: Moving The Hebrew Extended Block Into The SMP from Mark Shoulson on 2016-05-10 (Unicode Mail List Archive)

From: Mark Shoulson <mark_at_kli.org>
Date: Tue, 10 May 2016 22:46:04 -0400

On 05/10/2016 09:08 PM, Robert Wheelock wrote:
>
> ·U+30000—U+30014 (21 codepoints): Additional characters for
> typesetting Biblical/Classical Hebrew

Do you have this list available yet? I'm curious about these points,
and others.
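
As an aside on the numbers themselves, the arithmetic for the first quoted range can be sketched as below. This is only a minimal check, assuming the endpoints are exactly as quoted; `range_size` and `plane_of` are names made up for illustration. It also shows that U+30000 falls in plane 3, whereas the SMP is plane 1 (U+10000 to U+1FFFF).

```python
def range_size(first: int, last: int) -> int:
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


def plane_of(cp: int) -> int:
    """Unicode plane number: each plane holds 0x10000 code points."""
    return cp >> 16


# Endpoints copied from the quoted proposal (assumed to be exact).
first, last = 0x30000, 0x30014

print(range_size(first, last))  # 21, matching the quoted count
print(plane_of(first))          # 3
print(plane_of(0x10000))        # 1, the SMP
```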

> ·U+30015—U+3001F (11 codepoints): Palestinian vowel and pronunciation
> points for Hebrew and Galilean Aramaic
> ·U+30020—U+30021 (2 codepoints): Small superscript top-left signs for
> the letter /shin/—superscript śin and superscript shin

I thought SIN was indicated sometimes by a SAMEKH written above the
letter. How would putting a SIN (which is just a SHIN with a dot on the
left instead of the right) on top of the letter be any improvement (or
difference) over just putting the dot on the left of the base letter in
the first place?

> ·U+30022—U+30041 (32 codepoints): Palestinian cantillation signs for
> Hebrew and Galilean Aramaic
> ·U+30042 is reserved
> ·U+30043—U+3005C (26 codepoints): Babylonian vowel and pronunciation
> points for Hebrew
> ·U+3005D—U+3005F are reserved
> ·U+30060—U+30071 (18 codepoints): Babylonian cantillation signs for
> Hebrew
> ·U+30072—U+3007D are reserved
> ·U+3007E—U+3008F (18 codepoints): Samaritan vowel points,
> pronunciation points, and cantillation signs for Hebrew (copies of
> those also being used for Samaritan script in BMP)

OK, here I'm confused. Why do we need copies? Unicode doesn't like to
encode redundant things, and duplicates only make for messes (when do
you use which ZIQAA?). If we have the characters in the BMP, we don't
need them in the SMP.
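
To illustrate that these marks are already reachable in the BMP, here is a rough sketch using Python's standard `unicodedata` module. It scans the Samaritan block (U+0800 to U+083F) for character names containing "ZIQAA". The helper `find_in_block` is a hypothetical name for this example, and the result depends on the Unicode version bundled with the Python build.

```python
import unicodedata

# The Samaritan block occupies U+0800..U+083F, inside the BMP.
SAMARITAN_BLOCK = range(0x0800, 0x0840)


def find_in_block(fragment, block=SAMARITAN_BLOCK):
    """Return (code point, name) pairs whose name contains `fragment`."""
    hits = []
    for cp in block:
        # Unassigned code points have no name; fall back to "".
        name = unicodedata.name(chr(cp), "")
        if fragment in name:
            hits.append((cp, name))
    return hits


ziqaa_hits = find_in_block("ZIQAA")
for cp, name in ziqaa_hits:
    print(f"U+{cp:04X} {name}")
```

A second copy elsewhere would give two code points for the same mark, which is exactly the "which one do I use?" problem.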

> ·U+30090—U+3010F (128 codepoints): Additional characters in Hebrew
> script for other Jewish languages (these are pointed like the
> corresponding Arabic characters in the BMP)

So additional Hebrew "letters" that take Arabic vowel-points? Makes
sense; I saw some of that with Samaritan (particularly with DAMMA). We
should probably just use the Arabic vowel code-points though.

> ·U+30110—U+3012F (32 codepoints): Basic Hebrew superscript characters
> (regular letters+5 final forms+top-left pointed /śin/+top-right
> pointed /shin/+/maqqef/)
> ·U+30130—U+3014F (32 codepoints): Basic Hebrew subscript characters
> (regular letters+5 final forms+top-left pointed /śin/+top-right
> pointed /shin/+/maqqef/)

When you say "superscript" (or "subscript"), do you mean "spacing
character that's written small and raised/lowered"? Or do you mean
"combining character that's written above/below another character"?
(Cf. the difference between U+2071 SUPERSCRIPT LATIN SMALL LETTER I and
U+0365 COMBINING LATIN SMALL LETTER I.) If the former, is there a
reason this has to be done in plain text and can't be handled by
higher-level markup? Probably every major script has been written small
and high in some places, but we don't have superscript versions of every
letter in Unicode.
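
The spacing-versus-combining distinction shows up directly in the character properties. The following is a small sketch using Python's standard `unicodedata` module; `describe` is just an illustrative helper name, and the exact property values come from whatever Unicode version the Python build ships with.

```python
import unicodedata


def describe(ch):
    """Collect the properties that separate spacing from combining characters."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        # 0 means "not a reordering combining mark".
        "combining_class": unicodedata.combining(ch),
        # A "<super>" tag marks a compatibility (presentation) variant.
        "decomposition": unicodedata.decomposition(ch),
    }


superscript_i = describe("\u2071")  # spacing superscript
combining_i = describe("\u0365")    # combining mark drawn above a base

for props in (superscript_i, combining_i):
    print(props)

# Compatibility normalization folds the superscript back to a plain "i",
# which is one sign that it is a styled variant rather than a distinct letter.
folded = unicodedata.normalize("NFKC", "\u2071")
```

The superscript carries a compatibility decomposition and a combining class of zero, while the combining letter is a nonspacing mark that attaches to the preceding base character.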

~mark
Received on Tue May 10 2016 - 21:46:20 CDT
