L2/00-135

ISO/IEC JTC1/SC2/WG2 N_____

DATE: 2000-04-07

DOC TYPE:

Expert contribution

TITLE:

Proposal to Add Urdu Epithet and Abbreviation Diacritics to Arabic Block

SOURCE:

Paul Nelson (Redmond, WA, USA), Ashhar Farhan (Hyderabad, India), Arif Hisam and Kashif Hisam (Pakistan Data Management Services, Karachi, Pakistan), John Clews (UK)

PROJECT:

 

STATUS:

Proposal

ACTION ID:

FYI

DUE DATE:

--

DISTRIBUTION:

Worldwide

MEDIUM:

Paper and web

NO. OF PAGES:

 


A. Administrative

1. Title

Proposal to Add Urdu Epithet and Abbreviation Diacritics to Arabic Block.

2. Requester's name

Paul Nelson (Redmond, WA, USA), Ashhar Farhan (Hyderabad, India), Arif Hisam and Kashif Hisam (Pakistan Data Management Services, Karachi, Pakistan), John Clews (UK)

3. Requester type

Expert request.

4. Submission date

2000-04-07

5. Requester's reference

 

6a. Completion

This is a complete proposal.

6b. More information to be provided?

Only as required for clarification.

 

B. Technical -- General

1a. New script? Name?

No.

1b. Addition of characters to existing block? Name?

Yes. Arabic.

2. Number of characters

10.

3. Proposed category

 

4. Proposed level of implementation and rationale

 

5a. Character names included in proposal?

Yes.

5b. Character names in accordance with guidelines?

Yes.

5c. Character shapes reviewable?

Yes.

6a. Who will provide computerized font?

Paul Nelson.

6b. Font currently available?

Paul Nelson.

6c. Font format?

TrueType.

7a. Are references (to other character sets, dictionaries, descriptive texts, etc.) provided?

Yes.

7b. Are published examples (such as samples from newspapers, magazines, or other sources) of use of proposed characters attached?

Yes.

8. Does the proposal address other aspects of character data processing?

 

 

 

C. Technical -- Justification

1. Contact with the user community?

Yes. Farhan is Director of Computer Corp, an Urdu software company for PCs. Arif and Kashif are principals of Pakistan Data Management Systems, which produces Urdu software systems.

2. Information on the user community?

Native.

3a. The context of use for the proposed characters?

Urdu has commonly used characters that are not included in the Unicode 3.0 specification. The proposal includes characters identified by the National Language Authority of Pakistan.

3b. Reference

 

4a. Proposed characters in current use?

Yes.

4b. Where?

Native speakers in Pakistan, India and worldwide.

5a. Characters should be encoded entirely in BMP?

Already in BMP and in accordance with Roadmap.

5b. Rationale

 

6. Should characters be kept in a continuous range?

N/A.

7a. Can the characters be considered a presentation form of an existing character or character sequence?

No.

7b. Where?

 

7c. Reference

 

8a. Can any of the characters be considered to be similar (in appearance or function) to an existing character?

No.

8b. Where?

 

8c. Reference

 

9a. Combining characters or use of composite sequences included?

N/A.

9b. List of composite sequences and their corresponding glyph images provided?

N/A.

10. Characters with any special properties such as control function, etc. included?

No.

 

 

D. SC2/WG2 Administrative

To be completed by SC2/WG2

1. Relevant SC 2/WG 2 document numbers:

 

2. Status (list of meeting number and corresponding action or disposition)

 

3. Additional contact to user communities, liaison organizations etc.

 

4. Assigned category and assigned priority/time frame

 

Other Comments

 

 

Urdu printed material commonly uses characters that have not yet been encoded in the Unicode Standard. These characters fall into three categories: combining diacritics, epithets, and a ligature.

Urdu typesetting software and devices from Monotype, and Urdu page composition software from companies like Computer Corporation, currently support the input, storage and output of these characters. Adding the proposed characters to the Unicode Standard would make it easier to move data from proprietary encodings to a universally accepted standard.

The characters we are proposing to add to the Unicode Standard are listed below by category type. We have also included a proposed location for the placement of these characters.

Combining Diacritics

These characters are used to assist in pronunciation.

Suggested Unicode   Glyph   Name                            Example
0656                        ARABIC LETTER SUBSCRIPT ALEF
0657                        URDU LETTER JAZM
0658                        URDU LETTER ULTA PESH
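The suggested positions can be checked against the Arabic block mechanically. The following is a minimal sketch, not part of the proposal: the helper names `in_arabic_block` and `attach_mark` are hypothetical, the base letter U+0628 is an arbitrary choice for illustration, and the only facts assumed are that the Arabic block spans U+0600 to U+06FF and that a combining mark is stored after its base letter.

```python
# Arabic block of the Unicode Standard: U+0600..U+06FF.
ARABIC_BLOCK = range(0x0600, 0x0700)

# Suggested positions and proposed names, copied from the table above.
PROPOSED_DIACRITICS = {
    0x0656: "ARABIC LETTER SUBSCRIPT ALEF",
    0x0657: "URDU LETTER JAZM",
    0x0658: "URDU LETTER ULTA PESH",
}


def in_arabic_block(code_point: int) -> bool:
    """Return True if the code point lies inside the Arabic block."""
    return code_point in ARABIC_BLOCK


def attach_mark(base: str, mark_code_point: int) -> str:
    """Store a combining mark directly after its base letter."""
    return base + chr(mark_code_point)


for cp, name in PROPOSED_DIACRITICS.items():
    print(f"U+{cp:04X} {name}: in Arabic block = {in_arabic_block(cp)}")

# U+0628 is an arbitrary base letter chosen only for illustration.
sample = attach_mark("\u0628", 0x0656)
print([f"U+{ord(ch):04X}" for ch in sample])
```

The sketch only confirms that the three suggested positions sit inside the block; it says nothing about whether those positions are free or how a font renders the marks.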

 

Epithets

These are characters that are commonly used as indicators of the standing, usually religious (Islamic), of a person. For example, one would normally place the ARABIC SMALL HIGH AIN over a person's name to indicate that they are a saint.

Suggested Unicode   Glyph   Name                                                                                     Example
                            ARABIC SMALL HIGH REH HAH - Urdu symbol for rahmatullah
                            ARABIC SMALL HIGH REH DAD - Urdu symbol for raziallah
                            ARABIC SMALL HIGH SAD - Urdu symbol for salla
                            ARABIC SMALL HIGH AIN - Urdu symbol for alayhe assalam
                            ARABIC SMALL HIGH TALHALUS - Urdu symbol used above the name of a poet
                            ARABIC SMALL HIGH MISRA - Urdu poetic symbol which comes before starting a poetic verse

 

 

Ligatures

This ligature is commonly used in religious publications. It is much like other Arabic ligatures currently found in the FDFx range.

Suggested Unicode

Glyph

Name

Example

 

ARABIC LIGATURE ALAYHE ASSALAM = <isolated> + 0639 + 0644 + 064a + 0647 + 0020 + 0627 + 0644 + 0633 + 0644 + 0627 + 0645
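The decomposition listed for the ligature can be expanded into the phrase it abbreviates. This is a minimal sketch, assuming the hexadecimal values are Unicode code points in logical order; the `<isolated>` tag is ignored here, and the names `SEQUENCE_HEX` and `decode_sequence` are hypothetical.

```python
import unicodedata

# Code points copied from the decomposition given for the ligature.
SEQUENCE_HEX = [
    "0639", "0644", "064a", "0647", "0020",
    "0627", "0644", "0633", "0644", "0627", "0645",
]


def decode_sequence(hex_code_points):
    """Turn a list of hexadecimal code points into a string."""
    return "".join(chr(int(h, 16)) for h in hex_code_points)


phrase = decode_sequence(SEQUENCE_HEX)

# List each character with the name recorded in Python's Unicode database.
for ch in phrase:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

Printing the names makes it easy to confirm that the sequence is two words separated by U+0020 SPACE, which is how the proposal spells out the phrase.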

 

References

Reading Nastaliq, William L. Hanaway and Brian Spooner, Mazda Publishers, 1995

Kallyat Aqbal, 1995

Diwan Galab, 1995

Urdu Page Composer Software, Computer Corporation, Hyderabad, India.

Urdu 98 Software and other Urdu software solutions by Pakistan Data Management Systems, Karachi, Pakistan