L2/99-349

SC2/WG2 N2075R

Date: 1999-09-09

 

Proposal to add Lithuanian Accented Letters to ISO/IEC 10646-1

 

Table of Contents

1. Official:

1.1. Proposal Request Form

1.2. Graphic Symbols and Names

2. Rationale:

2.1. Lithuanian Letters

      2.1.1. Main alphabet

      2.1.2. Extended alphabet (with accented letters)

2.2. 8-bit single-byte coding (National standard code tables)

2.3. Multiple-Octet coding in ISO/IEC 10646-1 (UCS codes)

2.4. Samples

2.5. References
 
 
 
 
 
 

Lithuanian Standards Board: Kosciuskos 30, LT-2600 Vilnius, Lithuania

Phone: + 370-2-70 93 60,  fax: +370-2-22 62 52
 

Author and contact person: Vladas Tumasonis (Vilnius University and Lithuanian Standards Board)

E-mail: vladas.tumasonis@maf.vu.lt

Phone: +370-2-36 60 35, fax: +370-2-25 15 85
 
 


1. Official:

1.1. Proposal Request Form


ISO/IEC JTC 1/SC 2/WG 2
PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS
FOR ADDITIONS TO THE REPERTOIRE OF ISO/IEC 10646


Please fill Sections A, B and C below. Section D will be filled by SC 2/WG 2.

For instructions and guidance on filling in the form, please see the document "Principles and Procedures for Allocation of New Characters and Scripts" (http://www.dkuug.dk/JTC1/SC2/WG2/prot)

A. Administrative


1. Title: Addition of Lithuanian Accented Letters



2. Requester's name: Lithuanian Standards Board (LST)


3. Requester type (Member body/Liaison/Individual contribution): Correspondent Member


4. Submission date: 1999-08-15


5. Requester's reference (if applicable):


6. This is a complete proposal.


B. Technical - General


1. The proposal is for addition of character(s) to an existing block.
Name of the existing block: LATIN EXTENDED-B


2. Number of characters in proposal: 35


3. Proposed category (see section II, Character Categories): A


4. Proposed Level of Implementation (see clause 15, ISO/IEC 10646-1): 1
Is a rationale provided for the choice?
If Yes, reference:


5. Is a repertoire including character names provided?: Yes

a. If YES, are the names in accordance with the 'character naming guidelines' in Annex K of ISO/IEC 10646-1? Yes
b. Are the character shapes attached in a reviewable form? Yes


6. Who will provide the appropriate computerized font (ordered preference: True Type, PostScript or 96x96 bit-mapped format) for publishing the standard? True Type; Fotonija UAB, Vilnius, Lithuania

If available now, identify source(s) for the font (include address, e-mail, ftp-site, etc.) and indicate the tools used: Mr. Virginijus Dadurkevicius; dadurka@fotonija.com


7. References:
a. Are references (to other character sets, dictionaries, descriptive texts etc.) provided? Yes

b. Are published examples (such as samples from newspapers, magazines, or
other sources) of use of proposed characters attached? Yes


8. Special encoding issues:

Does the proposal address other aspects of character data processing (if applicable) such as input, presentation, sorting, searching, indexing, transliteration etc. (if yes please enclose information): No


C. Technical - Justification



1. Has this proposal for addition of character(s) been submitted before? No

If YES explain


2. Has contact been made to members of the user community (for example: National Body, user groups of the script or characters, other experts, etc.)?

If YES, with whom?
If YES, available relevant documents?


3. Information on the user community for the proposed characters (for example: size,
demographics, information technology use, or publishing use) is included? Yes
Reference:


4. The context of use for the proposed characters (type of use; common or rare) Common
Reference:


5. Are the proposed characters in current use by the user community? Yes
If YES, where? Reference: In Lithuania


6. After giving due considerations to the principles in N 2002 must the proposed
characters be entirely in the BMP? Yes
If YES, is a rationale provided?
If YES, reference:


7. Should the proposed characters be kept together in a contiguous range (rather than
being scattered)? Can be scattered


8. Can any of the proposed characters be considered a presentation form of an existing
character or character sequence? Not existing characters, but they are fully composed forms of glyphs that can be represented as a composite sequence
If YES, is a rationale for its inclusion provided? Yes
If YES, reference: Is enclosed


9. Can any of the proposed character(s) be considered to be similar (in appearance or function) to an existing character? No
If YES, is a rationale for its inclusion provided?
If YES, reference:


10. Does the proposal include use of combining characters and/or use of composite sequences (see clause 4.11 and 4.13 in ISO/IEC 10646-1)? No
If YES, is a rationale for such use provided?
If YES, reference:

Is a list of composite sequences and their corresponding glyph images (graphic symbols) provided? No
If YES, reference:


11. Does the proposal contain characters with any special properties such as control function or similar semantics? No
If YES, describe in detail (include attachment if necessary)


D. SC 2/WG 2 Administrative (To be completed by SC 2/WG 2)


1. Relevant SC 2/WG 2 document numbers:


2. Status (list of meeting number and corresponding action or disposition):


3. Additional contact to user communities, liaison organizations etc:


4. Assigned category and assigned priority/time frame:



1.2. Graphic Symbols and Names

Number  Graphic symbol  Name                                                      Remarks

1                       LATIN CAPITAL LETTER A WITH OGONEK AND ACUTE
2                       LATIN SMALL LETTER A WITH OGONEK AND ACUTE
3                       LATIN CAPITAL LETTER A WITH OGONEK AND TILDE
4                       LATIN SMALL LETTER A WITH OGONEK AND TILDE
5                       LATIN CAPITAL LETTER E WITH OGONEK AND ACUTE
6                       LATIN SMALL LETTER E WITH OGONEK AND ACUTE
7                       LATIN CAPITAL LETTER E WITH OGONEK AND TILDE
8                       LATIN SMALL LETTER E WITH OGONEK AND TILDE
9                       LATIN CAPITAL LETTER E WITH DOT ABOVE AND ACUTE
10                      LATIN SMALL LETTER E WITH DOT ABOVE AND ACUTE
11                      LATIN CAPITAL LETTER E WITH DOT ABOVE AND TILDE
12                      LATIN SMALL LETTER E WITH DOT ABOVE AND TILDE
13                      LATIN SMALL LETTER I WITH DOT ABOVE AND GRAVE             Name ?
14                      LATIN SMALL LETTER I WITH DOT ABOVE AND ACUTE             Name ?
15                      LATIN SMALL LETTER I WITH DOT ABOVE AND TILDE             Name ?
16                      LATIN CAPITAL LETTER I WITH OGONEK AND ACUTE
17                      LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND ACUTE  Name ?
18                      LATIN CAPITAL LETTER I WITH OGONEK AND TILDE
19                      LATIN SMALL LETTER I WITH OGONEK AND DOT ABOVE AND TILDE  Name ?
20                      LATIN CAPITAL LETTER J WITH TILDE
21                      LATIN SMALL LETTER J WITH TILDE
22                      LATIN CAPITAL LETTER L WITH TILDE
23                      LATIN SMALL LETTER L WITH TILDE
24                      LATIN CAPITAL LETTER M WITH TILDE
25                      LATIN SMALL LETTER M WITH TILDE
26                      LATIN CAPITAL LETTER R WITH TILDE
27                      LATIN SMALL LETTER R WITH TILDE
28                      LATIN CAPITAL LETTER U WITH OGONEK AND ACUTE
29                      LATIN SMALL LETTER U WITH OGONEK AND ACUTE
30                      LATIN CAPITAL LETTER U WITH OGONEK AND TILDE
31                      LATIN SMALL LETTER U WITH OGONEK AND TILDE
32                      LATIN CAPITAL LETTER U WITH MACRON AND ACUTE
33                      LATIN SMALL LETTER U WITH MACRON AND ACUTE
34                      LATIN CAPITAL LETTER U WITH MACRON AND TILDE
35                      LATIN SMALL LETTER U WITH MACRON AND TILDE

 


2. Rationale:

2.1. Lithuanian Letters

2.1.1. Main alphabet

By its grammatical structure, Lithuanian is one of the most ancient living Indo-European languages. It is spoken by approximately 5 million people and is taught at many universities around the world for linguistic studies.

The main Lithuanian alphabet consists of the Latin alphabet (excluding Q, q, W, w, X, x) plus 18 letters with diacritics (9 capital and 9 small): Ą Č Ę Ė Į Š Ų Ū Ž, ą č ę ė į š ų ū ž.

These letters are included in 8-bit single-byte coded character sets (ISO/IEC 8859-13, MS CP 1257, IBM CP 775, etc.), so they can be used without difficulty.
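The claim above can be checked with a short sketch. It assumes that Python's standard codecs `iso8859_13`, `cp1257` and `cp775` correspond to the three coded character sets named in the text; the helper names `encodable` and `code_table` are illustrative only and do not come from the proposal.

```python
# The 18 main-alphabet letters with diacritics (9 capital, 9 small).
LT_DIACRITIC_LETTERS = "ĄČĘĖĮŠŲŪŽąčęėįšųūž"

# Python's codec names, assumed to match the coded character sets in the text.
CODECS = ("iso8859_13", "cp1257", "cp775")


def encodable(text, codec):
    """Return True if every character of text has a code in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def code_table(text, codec):
    """Map each character to its single-byte code, formatted as hex."""
    return {ch: f"0x{ch.encode(codec)[0]:02X}" for ch in text}


for codec in CODECS:
    # Report whether all 18 letters are covered, then list the small letters.
    print(codec, encodable(LT_DIACRITIC_LETTERS, codec))
    print(code_table("ąčęėįšųūž", codec))

# An extended-alphabet letter (a with ogonek, then COMBINING ACUTE ACCENT)
# has no code in ISO/IEC 8859-13, which contains no combining characters.
print(encodable("\u0105\u0301", "iso8859_13"))
```

The last line illustrates why the main alphabet poses no difficulty in these 8-bit sets while the accented letters discussed in 2.1.2 need separate treatment.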

2.1.2. Extended alphabet (with accented letters)

Lithuanian has free word stress: the stress may fall on any syllable of a word. It performs at least two functions. Its constitutive function manifests itself in distinguishing a word from a combination of words, cf.:

The second function of word stress is the distinctive function, which distinguishes otherwise identical words by the place where the stress falls, e.g.:

Word stress (accent) is marked with three accent marks (diacritical marks in ISO terms): grave accent, acute accent and tilde. The position of the stress depends on the stress pattern (or accentual paradigm) of the word and its morphological structure (see examples above).

Word stress is expressed by means of accented letters.

There are 68 accented letters in the Lithuanian language:

The accented letters together with main letters comprise the extended alphabet.

The use of accented letters goes back to the first Lithuanian writings. The first Lithuanian books were accented, e.g. "Kathechismas" (1595) and "Postilla catholicka" (1599). In present publishing practice, all dictionaries, special vocabularies and encyclopaedias are accented. Accented letters are used in school textbooks, reference books, linguistic texts, and in the publication of laws.

In the general press (newspapers, fiction, etc.) only the letters of the main Lithuanian alphabet are used. Accented letters are used only in those words where the stress has a distinctive function.


2.2. 8-bit single-byte coding (National standard code tables)

There are three national code tables in Lithuania for encoding the extended alphabet (usually described as "for encoding accented letters"). The basic Lithuanian code table is for the UNIX environment (the second half of this table is shown in Fig. 1). It defines the basic character repertoire, including accented letters. This code table conforms to ISO/IEC 8859-13, i.e. the codes of all Lithuanian main letters are the same in both tables. Commonly used and very important graphic characters are retained. The repertoire of this table is optimal for linguistic text processing.

The code table for Windows contains the basic repertoire and extra phonetic symbols in columns 8 and 9. This code table conforms to ISO/IEC 8859-13.

The code table for DOS contains the basic repertoire and box-drawing symbols and conforms to IBM CP 775 for the Baltic States. The DOS environment is still popular in publishing houses.

Fig. 1. UNIX code table for Lithuanian accented letters (second half)


2.3. Multiple-Octet coding in ISO/IEC 10646-1 (UCS codes)

All letters of the main Lithuanian alphabet have UCS codes (codes in ISO/IEC 10646-1) or Unicode codes. The situation with Lithuanian accented letters is more complicated. As mentioned above, Lithuanian accented letters are Latin script letters with a grave accent, acute accent or tilde. Some Lithuanian accented letters are therefore also ordinary letters in other languages. For example, LATIN LETTER A WITH ACUTE is also used in Irish, Icelandic, Portuguese, Slovak and other languages, and LATIN LETTER N WITH TILDE is also used in Basque, Breton and Spanish. Such letters therefore have separate UCS codes.

Altogether, 33 Lithuanian accented letters have separate UCS codes and 35 accented letters do not.

Unshaded letters have UCS codes; shaded letters do not.
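The distinction between letters with and without a separate code can be illustrated with a minimal sketch. It uses the Unicode data shipped with the Python runtime as a stand-in, which is an assumption: that data reflects a later edition than the 1999 text of ISO/IEC 10646-1, and NFC composition is used here only as a proxy for "has a separate code". The function name `has_single_code` is hypothetical.

```python
import unicodedata

# Combining forms of the three Lithuanian accent marks.
GRAVE, ACUTE, TILDE = "\u0300", "\u0301", "\u0303"


def has_single_code(base, accent):
    """True if base + combining accent composes to one code point under NFC."""
    composed = unicodedata.normalize("NFC", base + accent)
    return len(composed) == 1


# A few letters from the discussion above, as (label, base, accent).
SAMPLES = [
    ("a with acute", "a", ACUTE),
    ("n with tilde", "n", TILDE),
    ("a with ogonek and acute", "\u0105", ACUTE),   # base is U+0105
    ("j with tilde", "j", TILDE),
    ("u with macron and tilde", "\u016B", TILDE),   # base is U+016B
]

for label, base, accent in SAMPLES:
    status = "separate code" if has_single_code(base, accent) else "no separate code"
    print(f"{label}: {status}")
```

In this sketch the first two samples compose to a single code point, while the last three remain sequences of a base letter followed by a combining mark, which matches entries 2, 21 and 35 in the list in 1.2.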

There is a further problem with the small letter "i" (and "i with ogonek"). The Lithuanian letter "i" has a dot above, and all accented forms of "i" should also keep the dot (see samples in 2.4). In ISO/IEC 10646-1 all such forms are dotless. For example, LATIN SMALL LETTER I WITH ACUTE in fact specifies "Latin small letter dotless i with acute". Since the dot above must be retained, these letters should be defined as LATIN SMALL LETTER I WITH DOT ABOVE AND ACUTE (or perhaps LATIN SMALL LETTER DOTLESS I WITH DOT ABOVE AND ACUTE).
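For comparison, the dotted accented "i" can also be written as a composite sequence, as item C.8 of the form acknowledges. The sketch below is not the encoding this proposal requests (which is for fully composed letters); it only shows, under the assumption that COMBINING DOT ABOVE (U+0307) is placed before the accent, what such a sequence would look like. The function names are hypothetical.

```python
import unicodedata

COMBINING_DOT_ABOVE = "\u0307"
ACCENTS = {"grave": "\u0300", "acute": "\u0301", "tilde": "\u0303"}


def dotted_accented_i(accent, ogonek=False):
    """Build small i (or i with ogonek) + dot above + the named accent."""
    base = "\u012F" if ogonek else "i"  # U+012F is i with ogonek
    return base + COMBINING_DOT_ABOVE + ACCENTS[accent]


def describe(sequence):
    """List the code point and character name of each element."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


for accent in ACCENTS:
    print(describe(dotted_accented_i(accent)))

# The precomposed letter carries no explicit dot in its decomposition:
# it is just i followed by the combining acute accent.
print(describe(unicodedata.normalize("NFD", "\u00ED")))
```

The explicit dot is what distinguishes the sequence from the existing LATIN SMALL LETTER I WITH ACUTE, whose decomposition contains only the base letter and the accent.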


2.4. Samples

In [3, p.350]:

In [12, p.75]:

In [4, p.38]. Note the accented "i":


2.5. References

 

1. M. Daukša, Kathechismas (1595) and Postilla catholicka (1599).

2. Lietuvių kalbos žodynas, I–XVIII t. [Dictionary of Lithuanian Language, I–XVIII volumes], Vilnius, 1956–1997.

3. Dabartinės lietuvių kalbos žodynas, vyr. red. St. Keinys [Dictionary of Modern Lithuanian Language, ed. by St. Keinys], Vilnius, Mokslo ir enciklopedijų leidykla, 1993.

4. Adelė Laigonaitė, Zigmas Zinkevičius, Lietuvių kalba. Mokomoji knyga X klasei [Lithuanian Language. Textbook for X form], Kaunas, Šviesa, 1997.

5. S. Matulaitienė, Skaitiniai. Vadovėlis VI klasei [Lithuanian Texts. Textbook for VI form], Kaunas, Šviesa, 1990.

6. Lithuanian Grammar, ed. by V. Ambrazas, Vilnius, Baltos lankos, 1997.

7. T. Mathiassen, A Short Grammar of Lithuanian, Slavica Publishers, Columbus, Ohio, 1996.

8. M. Ramonienė, I. Press, Colloquial Lithuanian, London and New York, Routledge, 1996.

9. B. Svecevičius, B. Piesarskas, Lietuvių - anglų kalbų žodynas [Lithuanian - English Dictionary], Vilnius, Mokslas, 1979.

10. Vokiečių - lietuvių kalbų žodynas [German - Lithuanian Dictionary], Vilnius, Mokslas, 1989.

11. A. Parenti, Italiano - Lituano, Lituano - Italiano, Garzanti Editore, 1994.

12. Romos Misiolas. Gedulinis Misiolas [Missalis Romani. Missale Parvum], Kaunas - Vilnius, 1982.

13. Tarptautinių žodžių žodynas, ats. red. V. Kvietkauskas [Dictionary of International Words, ed. by V. Kvietkauskas], Vilnius, Vyriausioji enciklopedijų redakcija, 1985.