Arabic Requirements of the BMP

From: Rafik Belhadj (R.Belhadj@frcl.bull.fr)
Date: Mon Jan 13 1997 - 09:31:47 EST


 In message <9701101550.AA09086@Unicode.ORG> unicode@unicode.ORG writes:

>Arabic Requirements of the BMP (was: Re: Arabic requirements)
>Given that the Arabic presentation forms have raised their head,
>is it worth raising the question - who _does_intend to use this part of
>the Basic Multilingual Plane?

Definition: to be clear, in the following I am speaking about
ligatures of 2 or 3 alphabetic characters.
A ligature in this context is a concatenation of 2 or 3 characters.
A ligature has no meaning by itself; it is only a presentation variant.
There is only ONE exception, the Lam-Aleph,
which is both a ligature and a character in its own right.
This is true for the ARABIC language.
For this language, the basic characters are those of ISO-8859, no more
and no less.
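
To illustrate the point that a ligature is only a presentation variant
of its component letters, here is a minimal Python sketch. It assumes
the compatibility decompositions published in the Unicode Character
Database, as exposed by Python's standard unicodedata module; the helper
name to_base_letters is hypothetical and used only for this illustration.

```python
import unicodedata

# U+FEFB is ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM, one of the
# presentation forms discussed in this thread.
LAM_ALEF_ISOLATED = "\uFEFB"


def to_base_letters(text: str) -> str:
    """Replace presentation forms with their basic letters (hypothetical helper).

    NFKC normalization applies the compatibility decompositions, so an
    encoded ligature is rewritten as the sequence of basic characters it
    stands for. Text that already uses basic letters is left unchanged.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    for ch in to_base_letters(LAM_ALEF_ISOLATED):
        # Print the code point and character name of each basic letter.
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Under these assumptions the single ligature code point becomes the two
basic letters LAM (U+0644) and ALEF (U+0627); producing the joined shape
on screen or paper is then left to the rendering software and the font.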

As far as I know, the Arabic ligatures were first defined in the ECMA
(European Computer Manufacturers Association) Arabic Task Group (ATG).
In this Group, there were representatives from Bull (myself),
DEC, HP, IBM, Siemens and others.
We had meetings with the standards organisations of Arabic countries
(ASMO, Egypt, Tunisia, Jordan, Syria, Saudi Arabia, etc.).
In these meetings, I raised John's question. The answer was: nobody
who uses computers, because there were no computer fonts for these
ligatures, although there are non-computer fonts for some of them.
Arabic writing is cursive, and people really want written text to be
presented in the same way, whether by hand, by printer or by any other means.

>If there is nobody, this is worth knowing for the developers of
>Unicode and ISO/IEC 10646.

These ligatures are used in books, newspapers, etc.

>John Clews
>In message <9701100206.AA13898@Unicode.ORG> unicode@Unicode.ORG writes:
>> Terry,
>> >
>> > | The ligatures, especially the Arabic ligatures, were encoded
>> > | "for compatibility". That is a polite way of saying they were
>> > | needed to meet some requirement to get the standard approved,
>> > | or were needed for backwards compatibility to some existing
>> > | encoding implementation which had a different model of text
>> > | representation. In the case of the Arabic ligatures, the
>> > | motivation was entirely for standards approval, because there
>> > | was no existing implementation.
>> >
>> > What was the specific requirement? I think the Arabic section
>> > is a mess and I can only imagine that it is the union of several
>> > fonts. Is that so? what were the fonts?

I do not agree with this assertion. The Arabic part of ISO-10646
is not a mess - at least for the Arabic Language part of it.
The members of the ATG are skilled in Arabic;
one of them became the Convenor of SC2.
The ECMA/ATG work was welcomed by ISO.

>> The requirements were *standards* requirements. They can be traced
>> back to a series of JTC1/SC2/WG2 resolutions in 1992/1993 and related
>> input from national standards bodies, comments on balloting of the standard,
>> etc. No doubt this is all recoverable from the WG2 minutes and
>> document archive. However, it was clearly *not* the result of some
>> existing font. The fonts which were used to print 10646 had to be
>> specially created for the Arabic ligatures sections, because no one
>> could locate an existing font which would cover them in sufficient
>> quality for the printing. The original documents requiring the
>> ligatures showed them in hand-drawn form. Unicode got the Arabic
>> ligatures (reluctantly, I might add) from 10646.
>> Regards,
>> --Ken Whistler

There are fonts for some of the ligatures (the most used ones).
These ligatures were not chosen for "standards purposes"; they were
chosen to meet user requirements: users of the Arabic language
need them.

>> > | As the Unicode Standard clearly states, the preferred encoding
>> > | of Arabic does not use the encoded Arabic ligatures from
>> > | U+FB50..U+FDFF--and in fact their inclusion in the standard has
>> > | only made full support of Arabic more complicated, rather than
>> > | easier.

It is also easier to support unaccented Latin characters (i.e. English)
than accented ones. But this was not acceptable for
many Latin-based languages.
The technology must help people to work correctly
in their preferred language.

>> > And beyond that there are full words and a symbol for "place
>> > of prayer" that I've never seen anywhere (rather like the
>> > "hot springs" symbol; perhaps drawn from some guidebook?).
>> >

Some of these symbols came from the requirements of users of the
Arabic language. Others came from Urdu and
many other non-Arabic languages.
Those that came from Arabic-language users
can be seen in many religious books.
 
>John Clews (Character Set Development) tel: +44 (0) 1423 888 432
>SESAME Computer Projects, 8 Avenue Road
>Harrogate, HG2 7PG, United Kingdom email: 10646er@sesame.demon.co.uk

 ==============================================================================
 Rafik Belhadj e-mail : R.Belhadj@frcl.bull.fr
 Bull S.A Tel : (33 1) 30.80.34.01
 Rue Jean Jaures B.P.68 , B2/126
 78340 Les Clayes sous Bois FRANCE Fax : (33 1) 30.80.70.78
 ==============================================================================


