RE: Latin vowels?

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Mon Sep 09 2002 - 05:43:52 EDT


Mark Davis wrote:
> I need to get a list of Latin characters that are generally considered
> vowels. I partitioned the characters as in the list below, but there
> are lots of oddball ones for which I can only guess (LATIN CAPITAL
> LETTER OU? LATIN LETTER WYNN?...).
>
> http://www.macchiato.com/unicode/latin_vowels.html

Er... Is this something I should feel guilty about?

Anyway, here are a few comments:

1. List "Vowels" - probably not vowels:
        U+00AA # (ª) FEMININE ORDINAL INDICATOR
        U+00BA # (º) MASCULINE ORDINAL INDICATOR
        U+2071 # (ⁱ) SUPERSCRIPT LATIN SMALL LETTER I
        U+212B # (Å) ANGSTROM SIGN
I would classify all of these as symbols or punctuation.
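
One way to spot such characters mechanically is to look at their
decomposition mappings. The following is a minimal sketch, assuming Python's
standard unicodedata module; the helper name is_presentation_variant is
hypothetical and the rule it applies is only a heuristic, not a definition of
"symbol":

```python
import unicodedata

# The four characters from list 1 that look like letters.
CANDIDATES = ["\u00AA", "\u00BA", "\u2071", "\u212B"]

def is_presentation_variant(ch):
    """Hypothetical helper: True if ch decomposes to exactly one other
    character, either through a tagged compatibility mapping such as
    '<super> 0061' or through a canonical singleton such as '00C5'."""
    parts = unicodedata.decomposition(ch).split()
    mapped = [p for p in parts if not p.startswith("<")]
    return len(mapped) == 1

for ch in CANDIDATES:
    # Print the code point, the character name, and the raw mapping.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"{unicodedata.decomposition(ch)!r}")
```

Characters flagged this way are variants of some other character, so a vowel
list could either drop them or classify them through the character they map
to. Which of the two is right depends on the purpose of the list.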

2. List "Vowels" - ambiguous letters that can be consonants:
        U+0049 # (I) LATIN CAPITAL LETTER I
        U+0069 # (i) LATIN SMALL LETTER I
        U+0075 # (u) LATIN SMALL LETTER U
        U+FF29 # (Ｉ) FULLWIDTH LATIN CAPITAL LETTER I
        U+FF49 # (ｉ) FULLWIDTH LATIN SMALL LETTER I
        U+FF55 # (ｕ) FULLWIDTH LATIN SMALL LETTER U
I would treat all these as vowels, although I know of a few rare exceptions:
- In Latin, i and u were also used to represent consonants /j/ and /v/
(originally, /w/). This ambiguity is still partly present in modern
languages, especially for i.
- Notice that capital U is not listed, because it is a new form of V,
invented in the 16th century precisely for the purpose of distinguishing
the vowel and the consonantal sounds.

3. List "Nonvowels" - definitely vowels:
        U+00DD # (Ý) LATIN CAPITAL LETTER Y WITH ACUTE
        U+00FD # (ý) LATIN SMALL LETTER Y WITH ACUTE
        U+00FF # (ÿ) LATIN SMALL LETTER Y WITH DIAERESIS
        U+0131 # (ı) LATIN SMALL LETTER DOTLESS I
        U+0132 # (Ĳ) LATIN CAPITAL LIGATURE IJ
        U+0133 # (ĳ) LATIN SMALL LIGATURE IJ
        U+0176 # (Ŷ) LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
        U+0177 # (ŷ) LATIN SMALL LETTER Y WITH CIRCUMFLEX
        U+0178 # (Ÿ) LATIN CAPITAL LETTER Y WITH DIAERESIS
        U+0233 # (ȳ) LATIN SMALL LETTER Y WITH MACRON
        U+1E38 # (Ḹ) LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
        U+1E39 # (ḹ) LATIN SMALL LETTER L WITH DOT BELOW AND MACRON
        U+1E5C # (Ṝ) LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
        U+1E5D # (ṝ) LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
        U+1E80 # (Ẁ) LATIN CAPITAL LETTER W WITH GRAVE
        U+1E81 # (ẁ) LATIN SMALL LETTER W WITH GRAVE
        U+1E82 # (Ẃ) LATIN CAPITAL LETTER W WITH ACUTE
        U+1E83 # (ẃ) LATIN SMALL LETTER W WITH ACUTE
        U+1E84 # (Ẅ) LATIN CAPITAL LETTER W WITH DIAERESIS
        U+1E85 # (ẅ) LATIN SMALL LETTER W WITH DIAERESIS
        U+1EF2 # (Ỳ) LATIN CAPITAL LETTER Y WITH GRAVE
        U+1EF3 # (ỳ) LATIN SMALL LETTER Y WITH GRAVE
        U+1EF8 # (Ỹ) LATIN CAPITAL LETTER Y WITH TILDE
        U+1EF9 # (ỹ) LATIN SMALL LETTER Y WITH TILDE
I see little or no possibility that these are consonants.
- The macron on L and R with dot below makes clear that they are used as
"sonants", i.e. a kind of vowel.
- W and Y with typical vowel diacritics are almost certainly vowels.
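
The diacritic-based reasoning above can be sketched with canonical
decomposition. This is only an illustration, assuming Python's standard
unicodedata module; the function names are hypothetical, and the test for
"dot below plus macron" simply encodes the rule of thumb stated in the first
bullet:

```python
import unicodedata

DOT_BELOW = "\u0323"  # COMBINING DOT BELOW
MACRON = "\u0304"     # COMBINING MACRON

def split_base_and_marks(ch):
    """Canonically decompose ch (NFD) and return the base letter
    together with the list of combining marks that follow it."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = decomposed[0]
    marks = [c for c in decomposed[1:] if unicodedata.combining(c)]
    return base, marks

def looks_like_long_sonant(ch):
    """Rule of thumb from the text: L or R carrying both a dot below
    and a macron is being used as a (long) syllabic sonant."""
    base, marks = split_base_and_marks(ch)
    return base.lower() in ("l", "r") and DOT_BELOW in marks and MACRON in marks

for ch in ["\u1E38", "\u1E5D", "\u1E37", "\u1E85", "\u0177"]:
    base, marks = split_base_and_marks(ch)
    mark_names = [unicodedata.name(m) for m in marks]
    print(f"U+{ord(ch):04X}: base {base!r}, marks {mark_names}")
```

The same split gives the diacritics on W and Y, so a list of "typical vowel
diacritics" (an assumption that would have to be chosen per language) could
be checked against the marks it returns.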

4. List "Nonvowels" - ambiguous letters that are probably vowels:
        U+0059 # (Y) LATIN CAPITAL LETTER Y
        U+0079 # (y) LATIN SMALL LETTER Y
        U+0174 # (Ŵ) LATIN CAPITAL LETTER W WITH CIRCUMFLEX
        U+0175 # (ŵ) LATIN SMALL LETTER W WITH CIRCUMFLEX
        U+01B2 # (Ʋ) LATIN CAPITAL LETTER V WITH HOOK
        U+01B3 # (Ƴ) LATIN CAPITAL LETTER Y WITH HOOK
        U+01B4 # (ƴ) LATIN SMALL LETTER Y WITH HOOK
        U+0281 # (ʁ) LATIN LETTER SMALL CAPITAL INVERTED R
        U+028B # (ʋ) LATIN SMALL LETTER V WITH HOOK
        U+028D # (ʍ) LATIN SMALL LETTER TURNED W
        U+1E36 # (Ḷ) LATIN CAPITAL LETTER L WITH DOT BELOW
        U+1E37 # (ḷ) LATIN SMALL LETTER L WITH DOT BELOW
        U+1E42 # (Ṃ) LATIN CAPITAL LETTER M WITH DOT BELOW
        U+1E43 # (ṃ) LATIN SMALL LETTER M WITH DOT BELOW
        U+1E46 # (Ṇ) LATIN CAPITAL LETTER N WITH DOT BELOW
        U+1E47 # (ṇ) LATIN SMALL LETTER N WITH DOT BELOW
        U+1E5A # (Ṛ) LATIN CAPITAL LETTER R WITH DOT BELOW
        U+1E5B # (ṛ) LATIN SMALL LETTER R WITH DOT BELOW
        U+1E86 # (Ẇ) LATIN CAPITAL LETTER W WITH DOT ABOVE
        U+1E87 # (ẇ) LATIN SMALL LETTER W WITH DOT ABOVE
        U+1E88 # (Ẉ) LATIN CAPITAL LETTER W WITH DOT BELOW
        U+1E89 # (ẉ) LATIN SMALL LETTER W WITH DOT BELOW
        U+1E8E # (Ẏ) LATIN CAPITAL LETTER Y WITH DOT ABOVE
        U+1E8F # (ẏ) LATIN SMALL LETTER Y WITH DOT ABOVE
        U+1E98 # (ẘ) LATIN SMALL LETTER W WITH RING ABOVE
        U+1E99 # (ẙ) LATIN SMALL LETTER Y WITH RING ABOVE
        U+1E9A # (ẚ) LATIN SMALL LETTER A WITH RIGHT HALF RING
        U+1EF4 # (Ỵ) LATIN CAPITAL LETTER Y WITH DOT BELOW
        U+1EF5 # (ỵ) LATIN SMALL LETTER Y WITH DOT BELOW
        U+1EF6 # (Ỷ) LATIN CAPITAL LETTER Y WITH HOOK ABOVE
        U+1EF7 # (ỷ) LATIN SMALL LETTER Y WITH HOOK ABOVE
        U+FF39 # (Ｙ) FULLWIDTH LATIN CAPITAL LETTER Y
        U+FF59 # (ｙ) FULLWIDTH LATIN SMALL LETTER Y
I would consider all these as vowels, although I know there is plenty of
room for error:
- Y is historically a vowel, and it still is mainly a vowel in all languages
using it (including English and French: "système", "quickly"). In English
and French, however, it can be a consonant (e.g., "yes"). In orthographies
derived from English-based romanizations (e.g., Pinyin), it is always a
consonant.
- L, M, N and R with dot below are normally used to indicate "sonants"; but
they could be something else.
- Most instances of W and Y with diacritics are vowels.

5. List "Nonvowels" - ambiguous cases:
        U+004A # (J) LATIN CAPITAL LETTER J
        U+0056 # (V) LATIN CAPITAL LETTER V
        U+0057 # (W) LATIN CAPITAL LETTER W
        U+006A # (j) LATIN SMALL LETTER J
        U+0076 # (v) LATIN SMALL LETTER V
        U+0077 # (w) LATIN SMALL LETTER W
        U+FB01 # (ﬁ) LATIN SMALL LIGATURE FI
        U+FB03 # (ﬃ) LATIN SMALL LIGATURE FFI
        U+FF2A # (Ｊ) FULLWIDTH LATIN CAPITAL LETTER J
        U+FF36 # (Ｖ) FULLWIDTH LATIN CAPITAL LETTER V
        U+FF37 # (Ｗ) FULLWIDTH LATIN CAPITAL LETTER W
        U+FF4A # (ｊ) FULLWIDTH LATIN SMALL LETTER J
        U+FF56 # (ｖ) FULLWIDTH LATIN SMALL LETTER V
        U+FF57 # (ｗ) FULLWIDTH LATIN SMALL LETTER W
I would treat all these as consonants, although I know of a few exceptions.
Apart from Welsh W, these exceptions are very rare:
- J is normally a consonant, but it originally was a font variant of I. In
ancient texts in many languages, and in some rare Italian proper names, it
can still stand for vowel /i/.
- In Latin, V was originally used for vowel /u/. This usage can still be
found occasionally in Latin quotations, especially from French or German
authors. Notice: the value /u/ is much less likely for lowercase v.
- W is a vowel in Welsh.
- Ligatures <fi> and <ffi> contain consonant(s) and a vowel.
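
The point about ligatures can be illustrated by expanding each character with
compatibility decomposition and classifying the pieces separately. This is a
rough sketch, assuming Python's standard unicodedata module; TOY_VOWELS is a
deliberately naive placeholder set chosen for the example, not a proposed
answer to the question of which letters are vowels:

```python
import unicodedata

# Assumption for illustration only: a toy set of vowel base letters.
TOY_VOWELS = set("aeiou")

def component_classes(ch):
    """Expand ch with NFKD (ligatures and fullwidth forms become plain
    letters), skip combining marks, and label each remaining letter."""
    result = []
    for c in unicodedata.normalize("NFKD", ch):
        if unicodedata.combining(c) or not c.isalpha():
            continue
        kind = "vowel" if c.lower() in TOY_VOWELS else "consonant"
        result.append((c, kind))
    return result

for ch in ["\uFB01", "\uFB03", "\u0133", "\uFF2A"]:
    print(f"U+{ord(ch):04X}: {component_classes(ch)}")
```

A character whose components fall into more than one class, such as the two
ligatures above, does not fit a simple vowel/nonvowel partition and probably
needs a third category or has to be left out.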

_ Marco


