The Unicode Consortium Discussion Forum


All times are UTC - 6 hours [ DST ]




 Post subject: Question regarding Unicode bidi algorithm implementation
Posted: Thu Dec 02, 2010 3:09 pm
Unicode Guru

Joined: Tue Dec 01, 2009 2:49 pm
Posts: 189
I received the following question which I am posting here:
Quote:
I am about to implement support for Arabic script in an embedded system, where I get
text data in Unicode format that will be rendered on a display. After some investigation
the Unicode bidi algorithm seems like a good thing to use, and more specifically I am
thinking of using either the C reference implementation or the ICU
implementation.

Which implementation should I use? (for instance with respect to size, since I have limited memory,
and ease of use / ease of integration). Maybe there is no simple answer to that, especially since
it would require some experience of both implementations, but I would be very happy with any kind
of opinion.

The C implementation seems to give a pretty small footprint, which is really good. Do you see any issues or problems
using the C reference implementation for my Arabic script rendering implementation? I guess the only work
needed is to do some wrapping and additions around the core algorithm code as was described in the
bidi.cpp comments.

The optimizations that were done in the C implementation by using some tables, was that with respect to size or speed (or maybe both)?


 Post subject: Re: Question regarding Unicode bidi algorithm implementation
Posted: Thu Dec 02, 2010 3:20 pm
Unicode Guru

Joined: Tue Dec 01, 2009 2:49 pm
Posts: 189
Quote:
...the Unicode bidi algorithm seems like a good thing to use, and more specifically I am thinking of using either the C reference implementation or the ICU implementation.

Supporting the bidi algorithm is more than a "good thing"; in fact, it is required, with the specific exceptions mentioned
in the rules numbered "HL" in UAX#9.

However, all the bidi algorithm does is determine the order of the elements on the display. For Hebrew, you would be nearly done at that point, but for Arabic in particular you still need to do shaping, that is, to determine which positional variant glyph to use for each character (based on its position in the word), where to substitute ligatures, and how to place combining characters.

(That's for basic text display on a low-end device. For full text layout on a high-end publishing system, there's more.)
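
To make the "positional variant" part of shaping concrete, here is a minimal sketch in C. It is not taken from any of the implementations discussed in this thread; the names (ShapeEntry, shape_table, shape_arabic) are made up for illustration, and the table covers only four letters. It assumes the display font has glyphs for the Arabic Presentation Forms-B code points, and it ignores ligatures such as lam-alef, combining marks, and ZWJ/ZWNJ, all of which a real shaper has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One row per supported letter (hypothetical, illustrative table).
   Forms are code points from Arabic Presentation Forms-B.
   Right-joining letters (dual == false) have only isolated and final forms. */
typedef struct {
    uint16_t base;  /* code point as stored in logical order */
    bool     dual;  /* true: joins on both sides; false: joins to the preceding letter only */
    uint16_t iso, fin, ini, med;
} ShapeEntry;

static const ShapeEntry shape_table[] = {
    { 0x0627, false, 0xFE8D, 0xFE8E, 0,      0      },  /* ALEF */
    { 0x0628, true,  0xFE8F, 0xFE90, 0xFE91, 0xFE92 },  /* BEH  */
    { 0x0644, true,  0xFEDD, 0xFEDE, 0xFEDF, 0xFEE0 },  /* LAM  */
    { 0x0645, true,  0xFEE1, 0xFEE2, 0xFEE3, 0xFEE4 },  /* MEEM */
};

/* Returns the table row for cp, or NULL if cp is treated as non-joining. */
static const ShapeEntry *find_entry(uint16_t cp)
{
    for (size_t i = 0; i < sizeof shape_table / sizeof shape_table[0]; i++)
        if (shape_table[i].base == cp)
            return &shape_table[i];
    return NULL;
}

/* Picks a positional form for each letter. Input is in logical order. */
void shape_arabic(const uint16_t *in, uint16_t *out, size_t len)
{
    assert(len == 0 || (in != NULL && out != NULL));
    for (size_t i = 0; i < len; i++) {
        const ShapeEntry *cur = find_entry(in[i]);
        if (cur == NULL) {          /* not a letter we know: copy through */
            out[i] = in[i];
            continue;
        }
        const ShapeEntry *prev = (i > 0) ? find_entry(in[i - 1]) : NULL;
        const ShapeEntry *next = (i + 1 < len) ? find_entry(in[i + 1]) : NULL;

        /* The previous letter connects forward only if it is dual-joining. */
        bool joins_prev = prev != NULL && prev->dual;
        /* This letter connects forward only if it is dual-joining itself. */
        bool joins_next = cur->dual && next != NULL;

        if (joins_prev && joins_next) out[i] = cur->med;
        else if (joins_prev)          out[i] = cur->fin;
        else if (joins_next)          out[i] = cur->ini;
        else                          out[i] = cur->iso;
    }
}
```

Note that the joining decision is made on the text in logical order, so in a pipeline like this one shaping would run before the reordered string is handed to the display.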

Quote:
Which implementation should I use? (for instance with respect to size, since I have limited memory,
and ease of use / ease of integration). The C implementation seems to give a pretty small footprint, which is really good.

I wouldn't be surprised if the C code ends up pretty small and reasonably fast in comparison. It was not particularly designed to be integrated into anything, but adapting it should be straightforward.

Quote:
Do you see any issues or problems using the C reference implementation for my Arabic script rendering implementation? I guess the only work needed is to do some wrapping and additions around the core algorithm code as was described in the bidi.cpp comments.

The optimizations that were done in the C implementation by using some tables, was that with respect to size or speed (or maybe both)?



The C implementation was written as a reference implementation. However, I wanted to use tables to get more realistic performance than a typical reference implementation, where the code follows the wording of the specification in a transparent manner. Since the Java implementation already did that, the idea was to compare the two and flush out places where the language of the spec wasn't precise enough (at the time), so that programmers following different coding strategies might end up with different interpretations of the rules. We did that and closed the loopholes, and the two versions produce identical results.
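
As an illustration of what "using tables" can look like, here is a small sketch of one rule from UAX#9, W2 (a European number that follows an Arabic letter, with no other strong character in between, becomes an Arabic number), written as a state table instead of a backward search. This is not the code from the reference implementation; the enum and table names are invented for this example, the class set is cut down to six classes, and it handles a single run in isolation.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced set of bidi classes, enough to show rule W2. */
typedef enum { C_L, C_R, C_AL, C_EN, C_AN, C_ON, C_COUNT } BidiClass;

/* State: the most recent strong class seen (initially sos, which is L or R). */
typedef enum { S_L, S_R, S_AL, S_COUNT } LastStrong;

/* next_state[state][input class]: only strong classes change the state. */
static const unsigned char next_state[S_COUNT][C_COUNT] = {
    /*            L     R     AL    EN    AN    ON   */
    /* S_L  */ { S_L,  S_R,  S_AL, S_L,  S_L,  S_L  },
    /* S_R  */ { S_L,  S_R,  S_AL, S_R,  S_R,  S_R  },
    /* S_AL */ { S_L,  S_R,  S_AL, S_AL, S_AL, S_AL },
};

/* resolved[state][input class]: EN becomes AN only while the last strong is AL. */
static const unsigned char resolved[S_COUNT][C_COUNT] = {
    /*            L     R     AL    EN    AN    ON   */
    /* S_L  */ { C_L,  C_R,  C_AL, C_EN, C_AN, C_ON },
    /* S_R  */ { C_L,  C_R,  C_AL, C_EN, C_AN, C_ON },
    /* S_AL */ { C_L,  C_R,  C_AL, C_AN, C_AN, C_ON },
};

/* Applies W2 in place with one forward pass and two table lookups per character. */
void resolve_w2(BidiClass *cls, size_t len, LastStrong sos)
{
    assert(sos == S_L || sos == S_R);
    assert(len == 0 || cls != NULL);
    LastStrong st = sos;
    for (size_t i = 0; i < len; i++) {
        BidiClass in = cls[i];
        cls[i] = (BidiClass)resolved[st][in];
        st = (LastStrong)next_state[st][in];
    }
}
```

A version that follows the wording of the rule would search backward from every European number; the table version gets the same result for this rule in a single forward pass, at the cost of being harder to check against the text of the specification.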

The Java reference code was never intended to be used in real code.

I believe the ICU folks then wrote an optimized version of their own, but I know nothing about its footprint or how much of ICU you would need to drag in to use it. I believe they do have some of the layout functionality I mentioned.


 Post subject: Re: Question regarding Unicode bidi algorithm implementation
Posted: Thu Dec 02, 2010 4:07 pm

Joined: Thu Dec 02, 2010 4:04 pm
Posts: 1
Take a look at GNU FriBidi. It provides a fast and faithful implementation of the bidi algorithm in C, plus the Arabic joining algorithm. For basic Arabic support, you can call fribidi_log2vis() on your text and you're done.
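
For anyone wondering what a logical-to-visual conversion has to do at its last step, here is a library-free sketch of rule L2 of UAX#9: from the highest level down to the lowest odd level, reverse every contiguous run of characters at that level or higher. This is not FriBidi's code or API; reorder_line and reverse_span are names made up for this example, and the embedding levels are assumed to have been resolved already by the earlier rules of the algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverses text[lo..hi) in place. */
static void reverse_span(uint16_t *text, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        uint16_t tmp = text[lo];
        text[lo++] = text[--hi];
        text[hi] = tmp;
    }
}

/* Reorders one line from logical to visual order (rule L2).
   levels[i] is the resolved embedding level of text[i] in logical order. */
void reorder_line(uint16_t *text, const uint8_t *levels, size_t len)
{
    assert(len == 0 || (text != NULL && levels != NULL));

    int max_level = 0;
    int min_odd = 256;  /* sentinel: no odd level seen yet */
    for (size_t i = 0; i < len; i++) {
        if (levels[i] > max_level)
            max_level = levels[i];
        if ((levels[i] & 1) && levels[i] < min_odd)
            min_odd = levels[i];
    }
    if (min_odd == 256)
        return;  /* all levels even: text is already in visual order */

    for (int lv = max_level; lv >= min_odd; lv--) {
        size_t i = 0;
        while (i < len) {
            if (levels[i] >= lv) {
                size_t j = i;
                while (j < len && levels[j] >= lv)
                    j++;
                /* Positions i..j-1 form a maximal run at level lv or higher.
                   Reversing inside the run never changes which positions
                   belong to runs at lower levels, so the logical-order
                   levels array can be reused on every pass. */
                reverse_span(text, i, j);
                i = j;
            } else {
                i++;
            }
        }
    }
}
```

With uppercase ASCII standing in for right-to-left letters, the logical string "AB 12" with levels 1 1 1 2 2 comes out as "12 BA": the digits are reversed once at level 2 and then again as part of the level-1 run, so they end up reading left to right.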

