Re: UTF-Morse

From: Doug Ewell (dewell@adelphia.net)
Date: Fri Nov 22 2002 - 00:54:28 EST


    Yes, it's true. Marco had sent me his UTF-Morse proposal just
    yesterday, along with a suggestion that I put together an implementation
    for April Fool's Day. And darned if I wasn't really going to do it. As
    a JOKE.

    But Marco, you need to check your invented sequences again. The leading
    and trailing Morse code units for the (non-ASCII) multi-Morse characters
    conflict with some of the single-unit characters. For example,
    U+002D -....- looks like a leading unit, and U+0023 .-.-.. looks like a
    trailing unit.
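
The overlap is easy to check mechanically. Below is a minimal sketch in Python, assuming the leading and trailing unit shapes are "-..yyy" and ".-.yyy" as in the scheme table of the quoted proposal. Only a handful of table entries are copied here, and the names `SINGLE_UNIT` and `find_conflicts` are made up for illustration.

```python
import re

# A few six-element codes copied from the table in the quoted proposal
# (not the full table; just enough to show the check).
SINGLE_UNIT = {
    0x0021: "-----.",
    0x0022: ".-..-.",
    0x0023: ".-.-..",
    0x002C: "--..--",
    0x002D: "-....-",
    0x002E: ".-.-.-",
    0x003F: "..--..",
    0x005C: ".-....",
    0x005E: ".-...-",
}

# Shapes of the multi-Morse units: "-..yyy" (leading) and ".-.yyy" (trailing),
# where each y is a dot or a dash.
LEADING = re.compile(r"-\.\.[.-]{3}")
TRAILING = re.compile(r"\.-\.[.-]{3}")


def find_conflicts(table):
    """Map each code point whose code looks like a multi-Morse unit to its kind."""
    conflicts = {}
    for code_point, code in sorted(table.items()):
        if LEADING.fullmatch(code):
            conflicts[code_point] = "leading"
        elif TRAILING.fullmatch(code):
            conflicts[code_point] = "trailing"
    return conflicts


for code_point, kind in find_conflicts(SINGLE_UNIT).items():
    print(f"U+{code_point:04X} {SINGLE_UNIT[code_point]} looks like a {kind} unit")
```

Run over the full table, the same check would flag any other six-element code that starts with "-.." or ".-.".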

    (It's only a JOKE, guys. Take a breath.)

    -Doug Ewell
     Fullerton, California

    ----- Original Message -----
    From: "Marco Cimarosti" <marco.cimarosti@essetre.it>
    To: "'Carl W. Brown'" <cbrown@xnetinc.com>; <unicode@unicode.org>
    Sent: Thursday, November 21, 2002 1:22 am
    Subject: UTF-Morse (was RE: Morse coded Unicode(was: Morse code))

    > Carl W. Brown wrote:
    > > I think that the bigger issue might be how do you extend Morse code to
    > > incorporate the Unicode character set.
    > > [...]
    >
    > Carl, this is unfair!! You spoiled my April 1st joke in mid November!
    >
    > Ciao.
    > Marco :-)
    >
    >
    >
    > ----------------------------------------------------------------------
    > UTF-Morse - "Bringing Unicode into the telegraph age!"
    >
    >
    > 1. Unicode characters U+0020..U+007E are encoded according to the
    > following table:
    >
    > Code: UTF-Morse: Character name:
    > ------ ----------- --------------------------
    > U+0020 / SPACE
    > U+0021 -----. EXCLAMATION MARK [1]
    > U+0022 .-..-. QUOTATION MARK
    > U+0023 .-.-.. NUMBER SIGN [1]
    > U+0024 ..-... DOLLAR SIGN [1]
    > U+0025 ..-..- PERCENT SIGN [1]
    > U+0026 ..-.-. AMPERSAND [1]
    > U+0027 .----. APOSTROPHE
    > U+0028 -.--.- LEFT PARENTHESIS
    > U+0029 -.---. RIGHT PARENTHESIS [1]
    > U+002A -.---- ASTERISK [1]
    > U+002B --.... PLUS SIGN [1]
    > U+002C --..-- COMMA
    > U+002D -....- HYPHEN-MINUS
    > U+002E .-.-.- FULL STOP
    > U+002F -..-. SOLIDUS [1]
    > U+0030 ----- DIGIT ZERO
    > U+0031 .---- DIGIT ONE
    > U+0032 ..--- DIGIT TWO
    > U+0033 ...-- DIGIT THREE
    > U+0034 ....- DIGIT FOUR
    > U+0035 ..... DIGIT FIVE
    > U+0036 -.... DIGIT SIX
    > U+0037 --... DIGIT SEVEN
    > U+0038 ---.. DIGIT EIGHT
    > U+0039 ----. DIGIT NINE
    > U+003A ---... COLON
    > U+003B ---..- SEMICOLON [1]
    > U+003C ---.-. LESS-THAN SIGN [1]
    > U+003D ----.. EQUALS SIGN [1]
    > U+003E ---.-- GREATER-THAN SIGN [1]
    > U+003F ..--.. QUESTION MARK
    > U+0040 -.-.-. COMMERCIAL AT [1]
    > U+0041 ..-- .- LATIN CAPITAL LETTER A [2]
    > U+0042 ..-- -... LATIN CAPITAL LETTER B [2]
    > U+0043 ..-- -.-. LATIN CAPITAL LETTER C [2]
    > U+0044 ..-- -.. LATIN CAPITAL LETTER D [2]
    > U+0045 ..-- . LATIN CAPITAL LETTER E [2]
    > U+0046 ..-- ..-. LATIN CAPITAL LETTER F [2]
    > U+0047 ..-- --. LATIN CAPITAL LETTER G [2]
    > U+0048 ..-- .... LATIN CAPITAL LETTER H [2]
    > U+0049 ..-- .. LATIN CAPITAL LETTER I [2]
    > U+004A ..-- .--- LATIN CAPITAL LETTER J [2]
    > U+004B ..-- -.- LATIN CAPITAL LETTER K [2]
    > U+004C ..-- .-.. LATIN CAPITAL LETTER L [2]
    > U+004D ..-- -- LATIN CAPITAL LETTER M [2]
    > U+004E ..-- -. LATIN CAPITAL LETTER N [2]
    > U+004F ..-- --- LATIN CAPITAL LETTER O [2]
    > U+0050 ..-- .--. LATIN CAPITAL LETTER P [2]
    > U+0051 ..-- --.- LATIN CAPITAL LETTER Q [2]
    > U+0052 ..-- .-. LATIN CAPITAL LETTER R [2]
    > U+0053 ..-- ... LATIN CAPITAL LETTER S [2]
    > U+0054 ..-- - LATIN CAPITAL LETTER T [2]
    > U+0055 ..-- ..- LATIN CAPITAL LETTER U [2]
    > U+0056 ..-- ...- LATIN CAPITAL LETTER V [2]
    > U+0057 ..-- .-- LATIN CAPITAL LETTER W [2]
    > U+0058 ..-- -..- LATIN CAPITAL LETTER X [2]
    > U+0059 ..-- -.-- LATIN CAPITAL LETTER Y [2]
    > U+005A ..-- --.. LATIN CAPITAL LETTER Z [2]
    > U+005B ..---. LEFT SQUARE BRACKET [1]
    > U+005C .-.... REVERSE SOLIDUS [1]
    > U+005D ..---- RIGHT SQUARE BRACKET [1]
    > U+005E .-...- CIRCUMFLEX ACCENT [1]
    > U+005F ------ LOW LINE [1]
    > U+0060 ...--- GRAVE ACCENT [1]
    > U+0061 .- LATIN SMALL LETTER A
    > U+0062 -... LATIN SMALL LETTER B
    > U+0063 -.-. LATIN SMALL LETTER C
    > U+0064 -.. LATIN SMALL LETTER D
    > U+0065 . LATIN SMALL LETTER E
    > U+0066 ..-. LATIN SMALL LETTER F
    > U+0067 --. LATIN SMALL LETTER G
    > U+0068 .... LATIN SMALL LETTER H
    > U+0069 .. LATIN SMALL LETTER I
    > U+006A .--- LATIN SMALL LETTER J
    > U+006B -.- LATIN SMALL LETTER K
    > U+006C .-.. LATIN SMALL LETTER L
    > U+006D -- LATIN SMALL LETTER M
    > U+006E -. LATIN SMALL LETTER N
    > U+006F --- LATIN SMALL LETTER O
    > U+0070 .--. LATIN SMALL LETTER P
    > U+0071 --.- LATIN SMALL LETTER Q
    > U+0072 .-. LATIN SMALL LETTER R
    > U+0073 ... LATIN SMALL LETTER S
    > U+0074 - LATIN SMALL LETTER T
    > U+0075 ..- LATIN SMALL LETTER U
    > U+0076 ...- LATIN SMALL LETTER V
    > U+0077 .-- LATIN SMALL LETTER W
    > U+0078 -..- LATIN SMALL LETTER X
    > U+0079 -.-- LATIN SMALL LETTER Y
    > U+007A --.. LATIN SMALL LETTER Z
    > U+007B --.-.. LEFT CURLY BRACKET [1]
    > U+007C --.--. VERTICAL LINE [1]
    > U+007D --.-.- RIGHT CURLY BRACKET [1]
    > U+007E --.--- TILDE [1]
    >
    >
    > 2. All other Unicode characters are encoded with one of seven
    > multi-Morse schemes:
    >
    > Code range: Scheme
    > ----------------- ------
    > U+0000..U+0007 1
    > U+0008..U+001F 2
    > U+007F..U+01FF 3
    > U+0200..U+0FFF 4
    > U+1000..U+7FFF 5
    > U+8000..U+3FFFF 6
    > U+40000..U+10FFFF 7
    >
    > Each scheme ends with a Morse sequence of the form ".-.yyy", possibly
    > preceded by one or more Morse sequences of the form "-..yyy":
    >
    > Scheme  Bits (x: 0 or 1):      UTF-Morse (y: "." if x is 0, "-" if x is 1):
    > ------  ---------------------  ------------------------------------------------
    > 1       000000000000000000xxx  .-.yyy
    > 2       000000000000000xxxxxx  -..yyy .-.yyy
    > 3       000000000000xxxxxxxxx  -..yyy -..yyy .-.yyy
    > 4       000000000xxxxxxxxxxxx  -..yyy -..yyy -..yyy .-.yyy
    > 5       000000xxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy .-.yyy
    > 6       000xxxxxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy -..yyy .-.yyy
    > 7       xxxxxxxxxxxxxxxxxxxxx  -..yyy -..yyy -..yyy -..yyy -..yyy -..yyy .-.yyy
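
The scheme table can be read as a recipe: pick the scheme from the code range, cut the code point into 3-bit groups, and turn each group into one Morse unit. Below is a rough sketch in Python. It assumes the most significant group is sent first (the proposal only implies this through the left-to-right bit layout) and that units are written separated by single spaces. The names `scheme_for` and `encode_multi` are hypothetical.

```python
# Upper bound of each code range and its scheme number, taken from the
# "Code range / Scheme" table in the proposal.
SCHEME_LIMITS = [
    (0x0007, 1),
    (0x001F, 2),
    (0x01FF, 3),
    (0x0FFF, 4),
    (0x7FFF, 5),
    (0x3FFFF, 6),
    (0x10FFFF, 7),
]


def scheme_for(code_point):
    """Return the scheme number, which is also the number of Morse units."""
    if 0x20 <= code_point <= 0x7E:
        raise ValueError("U+0020..U+007E use the single-unit table instead")
    for limit, scheme in SCHEME_LIMITS:
        if 0 <= code_point <= limit:
            return scheme
    raise ValueError("not a Unicode code point")


def encode_multi(code_point):
    """Encode one code point as multi-Morse units separated by spaces."""
    count = scheme_for(code_point)
    units = []
    for index in range(count):
        # Assumption: most significant 3-bit group first.
        shift = 3 * (count - 1 - index)
        group = (code_point >> shift) & 0b111
        yyy = "".join("-" if group & (1 << bit) else "." for bit in (2, 1, 0))
        # Every unit but the last is a leading unit "-..yyy";
        # the last one is the trailing unit ".-.yyy".
        prefix = ".-." if index == count - 1 else "-.."
        units.append(prefix + yyy)
    return " ".join(units)


print(encode_multi(0x00E9))  # U+00E9 falls in scheme 3, so three units
```

Under these assumptions U+0000 comes out as ".-....", which is the same sequence the table assigns to U+005C, so a decoder could not tell the two apart.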
    >
    >
    > 3. Notes
    >
    > [1]: Some sequences are unique to UTF-Morse, and are unknown in
    > traditional Morse code.
    >
    > [2]: Capital letters use the same code as the corresponding small
    > letter, preceded by the sequence "..--" (which is unique to UTF-Morse).
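
Note [2] amounts to a shift prefix, which takes only a few lines to implement. Below is a small sketch in Python that covers just the letters and the space from the table. Writing the sequences separated by single spaces is an assumption, and `encode_text` is a hypothetical name.

```python
# Small-letter codes a..z, copied in order from the table in the proposal.
CODES = (
    ".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
    "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --.."
).split()
LOWER = dict(zip("abcdefghijklmnopqrstuvwxyz", CODES))

CAPITAL_PREFIX = "..--"  # sent before the small-letter code, per note [2]


def encode_text(text):
    """Encode ASCII letters and spaces; anything else is out of scope here."""
    sequences = []
    for ch in text:
        if ch == " ":
            sequences.append("/")
        elif ch in LOWER:
            sequences.append(LOWER[ch])
        elif "A" <= ch <= "Z":
            # A capital is the prefix followed by the small-letter code.
            sequences.extend([CAPITAL_PREFIX, LOWER[ch.lower()]])
        else:
            raise ValueError(f"not handled in this sketch: {ch!r}")
    return " ".join(sequences)


print(encode_text("Hi"))  # ..-- .... ..
```

A capital therefore costs one extra sequence, while all-lowercase text is transmitted exactly as in traditional Morse.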
    >
    > ----------------------------------------------------------------------
    >


