Re: Archaic Pashto letter

From: <>
Date: Mon, 12 Dec 2011 13:24:23 -0600

I think I have found a possible source for U+0682:

Grammar of the Pasto or Language of the Afghans, Compared with the
Iranian and North-Indian Idioms. By Dr. Ernest Trumpp. London and
Tuebingen, 1873. (Available from Google Books)

Page 1 (Page 24 of the PDF download from Google Books):

"Only one consonant has been left indistinct, the media [U+0685] d (=
dz), which is not distinguished from its tenuis [U+0685] t (= ts) by
separate diacritical marks. We have endeavoured to supply this want by
placing two dots above [U+062D], viz. [U+0682], as for a foreigner at
any rate the non-distinction of the two sounds must prove very

Indeed, some other 19th century grammars refer to Pashto [ts] and [dz]
as distinct letters but typeset them identically with three dots above
(that is, like U+0685). Here are two such examples:

A Grammar of the Pukkhto or Pukshto Language on a New and Improved
System, by Henry Walter Bellew, London 1867 (see alphabet table on
page 3, that is page 20 of the PDF download from Google Books).

A Grammar of the Pukhto, Pushto, or Language of the Afghans, by
Lieutenant H. G. Raverty, Calcutta 1855 (see alphabet table on pages
3-4, that is pages 77-78 of the PDF download from Google Books).

So it appears that the character "Hah with two dots vertical above"
was a 19th-century attempt to distinguish Pashto [ts] and [dz] for
didactic purposes. The convention of writing [dz] using Hah with hamza
above (U+0681) appears to have emerged later. There are still some
unanswered questions.

- Why did a character from a 19th-century book get coded in Unicode?
Did it ever receive wider use beyond Trumpp's book?

- Is the present hamza convention a development of the two vertical
dots proposal, or are they unrelated? About a year ago I worked with
several Afghan expatriates living in Southern California, and in
handwriting they would typically join two diacritical dots as a
squiggle rather than a line (which is more common in Arabic). One
could see how two vertical dots might develop into a vertical squiggle
and later into a hamza, especially given the note by Vladimir Ivanov
cited below. But this is only a conjecture at this point.
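
For anyone who wants to check the code points mentioned in this thread
against the character names in the standard, here is a minimal Python
sketch. It relies only on the standard unicodedata module; the helper
name describe() and the list CODE_POINTS are my own illustrative choices,
not anything taken from the sources above.

```python
import unicodedata

# Code points discussed in this message (list chosen for illustration).
CODE_POINTS = [0x062D, 0x0681, 0x0682, 0x0685]


def describe(cp):
    """Return 'U+XXXX NAME' using the Unicode data bundled with Python."""
    # The second argument to unicodedata.name() is a fallback for
    # code points that have no name in the bundled database.
    return "U+%04X %s" % (cp, unicodedata.name(chr(cp), "<unnamed>"))


for cp in CODE_POINTS:
    print(describe(cp))
```

This only reports the formal character names; it says nothing about the
names-list annotations (such as "not used in modern Pashto"), which are
not part of the data that unicodedata exposes.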

Anyway, I hope to have contributed a few pieces towards solving the puzzle :-)


Quoting Ken Whistler <>:

> On 12/9/2011 9:06 AM, Andreas Prilop wrote:
>> Arabic letter U+0682 shows two dots above.
>> It has the cryptic remark "not used in modern Pashto".
>> But was it ever used?
> To understand where the "cryptic" remark came from, you need to know
> more about the history of the character in the standard.
> U+0682 was encoded in Unicode 1.0. I don't have the material in hand
> right at the moment to track down its original source, but for these kinds
> of extensions to Arabic dating back to Unicode 1.0, it most likely
> originated in some poorly resolved handwritten or photocopied source
> labelled "Pashto" but without much analysis.
> However the research into the exact details turns out, in Unicode 1.0
> the character was published with the note "Pashto".
> On February 13, 2003, Roozbeh Pournader sent a note around with a number
> of comments on Arabic character extensions and annotations. Among those
> notes was the statement:
> C6. For 0682: The comment is wrong. This is not used in modern Pashto
> (just rechecked with my Pashto dictionaries). I am back from Kabul
> doing a study of computer requirements of Pashto and didn't see this
> anywhere. I guess we should send a public email and ask if anybody
> knows what this is. [Just an alert. Don't do anything for now.]
> Then on March 19, 2003, Roozbeh followed up with another note:
>> 3. Comment for 0682: Remove 'Pashto'. This is not used in
>> modern Pashto.
>> Never. And not in loanwords. (May possibly be old Pashto.)
> Based on that note, and with no further clarification provided by anyone
> on the issue, I and the other editors modified the annotation in the
> Unicode *4.0* names list, so that it read "not used in modern Pashto".
> It has remained that way in the names list since that date.
> If Andreas (or anyone else) has better information, that can certainly be
> submitted, and the editors can then work to further clarify any
> annotation for the character.
> My own suspicion is that the original form from Unicode 1.0 may have been
> a hard-to-interpret glyph alternative for 0681. Note another note on the
> unicode email list from 2001, from Vladimir Ivanov. This note doesn't address
> 0682 specifically, but does raise questions about the exact nature and shape
> of the diacritic above the hah for dze in Pashto usage:
> ==============================================================
> Date: Fri, 8 Jun 2001 07:27:11 +0400
> My Pashto informants call it "dI paxto alifbe", saying it has 10 extra
> letters.
> Letter "dze" is represented in Unicode by U+0681 "Arabic letter heh with
> hamza above",
> though the sign above heh is not exactly hamza. It is a zigzag-like sign
> of the same height as hamza, but they are well distinguished. My
> informants could not recall any special name for it.
> If you use "heh with hamza above", people usually accept it as a
> substitute, saying that "computer is not able to build a real Pashto
> letter" (?!).
> I could not find such a letter in Unicode. I would be glad to hear some
> comments on it.
> Sincerely,
> Vladimir Ivanov
> ==============================================================
Received on Mon Dec 12 2011 - 13:30:50 CST
