The Unicode Consortium Discussion Forum (CLOSED)


The forum has been closed, but prior postings are accessible for reading.

All times are UTC - 6 hours [ DST ]




Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 3 posts ] 
Post subject: BiDi question - problems with numbers
Posted: Wed Nov 30, 2011 2:42 pm

Joined: Wed Nov 30, 2011 12:08 pm
Posts: 2
I'm wondering if someone can help me.

I am trying to convert some data coming from an IBM iSeries system, encoded in EBCDIC (Arabic CCSID 420), to UTF-16. The original data consists of a date, some Arabic text and a currency amount, all stored in visual sequence (as is usual for Arabic EBCDIC data on iSeries), e.g.:

01/10/12 TXET CIBARA 3,902.07

I am successfully converting this using ICU to UTF-16 and then applying bidi processing to produce UTF-16 data that has the Arabic text stored in logical sequence:

01/10/12 ARABIC TEXT 3,902.07

However, when this is displayed by Notepad, what the user sees is:

01/10/12 3,902.07 TXET CIBARA

In other words, the Arabic is displayed correctly (RTL), but the currency amount has moved to the left of it. This is not acceptable to my users, who require the order of the items (date, text, amount) on the page to remain unchanged.

I am guessing this is happening because of this logic, described by the Unicode bidi spec http://unicode.org/reports/tr9/#BD1:

Quote:
Examples. A list of numbers separated by neutrals and embedded in a directional run will come out in the run’s order.

Storage: he said "THE VALUES ARE 123, 456, 789, OK".

Display: he said "KO ,789 ,456 ,123 ERA SEULAV EHT".

In this case, both the comma and the space between the numbers take on the direction of the surrounding text (uppercase = right-to-left), ignoring the numbers. The commas are not considered part of the number because they are not surrounded on both sides by digits (see Section 3.3.3, Resolving Weak Types).
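
The last sentence of the quoted passage can be checked with a small sketch. It uses Python's standard `unicodedata` module to find common separators (bidi class CS) that sit directly between two European digits (class EN). The function name is hypothetical, and the check is only a simplified stand-in for rule W4 of the bidi spec, which is really applied after the earlier weak-type rules have run.

```python
import unicodedata


def separators_joined_to_numbers(text):
    """Return indexes of CS characters that have an EN digit on both sides.

    Simplified illustration of rule W4; not a full bidi implementation.
    """
    classes = [unicodedata.bidirectional(ch) for ch in text]
    return [
        i
        for i in range(1, len(text) - 1)
        if classes[i] == "CS" and classes[i - 1] == "EN" and classes[i + 1] == "EN"
    ]


# In the list from the spec, each comma is followed by a space, so none qualifies.
print(separators_joined_to_numbers("123, 456, 789"))  # []

# In the currency amount, the comma and the period both sit between digits.
print(separators_joined_to_numbers("3,902.07"))  # [1, 5]
```

So the separators inside 3,902.07 stay with the number, which suggests the amount moves as a single unit rather than being split up as in the list example.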


My question is: can I override this logic somehow, e.g. through the use of format codes, to ensure that the data items (date, Arabic text, currency amount) appear in the correct sequence, but the Arabic is still displayed correctly? I have so far failed to find any set of format codes that makes this happen.

Thanks in advance for any assistance you can offer.


Post subject: Re: BiDi question - problems with numbers
Posted: Thu Dec 01, 2011 2:15 am

Joined: Sun Aug 22, 2010 5:14 am
Posts: 5
What you are trying to do is transform a visual string into a logical LTR string. However, some visual strings have no equivalent logical string unless a directional mark is added, in this case the LRM (U+200E).
The most frequent such case is when a number appears to the right of RTL text and to the left of LTR text (or to the left of the end-of-string in your case).
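
A rough way to see why this case is special is to look at the closest strong character before the number in the logical string. The sketch below uses only Python's `unicodedata` module; the function name and the Arabic sample text are made up for illustration. Under rule W2 of UAX #9, European digits whose nearest preceding strong character is an Arabic letter (class AL) are treated as Arabic numbers, and a number that follows RTL text is attached to that RTL run.

```python
import unicodedata

STRONG_CLASSES = {"L", "R", "AL"}


def nearest_strong_before(text, index):
    """Return the bidi class of the closest strong character before text[index].

    Returns None when no strong character precedes that position.
    """
    for ch in reversed(text[:index]):
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_CLASSES:
            return bidi_class
    return None


# Logical-order sample: date, Arabic text (placeholder words), amount.
logical = "01/10/12 \u0646\u0635 \u0639\u0631\u0628\u064a 3,902.07"
amount_start = logical.index("3,902.07")

# The amount directly follows Arabic letters, so it joins the RTL run.
print(nearest_strong_before(logical, amount_start))  # AL

# With an LRM in front of the amount, the nearest strong character is L.
with_lrm = logical[:amount_start] + "\u200e" + logical[amount_start:]
print(nearest_strong_before(with_lrm, amount_start + 1))  # L
```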
To fix it, you have to insert an LRM before the number, after converting to UTF-16 and before calling ICU to transform the string to logical order.
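
As a minimal sketch of the manual fix, assuming the amount is always the last item on the line and looks like 3,902.07: the function name and the regular expression below are hypothetical and would need adjusting to the real record layout.

```python
import re

LRM = "\u200e"  # LEFT-TO-RIGHT MARK

# Assumed shape of the trailing amount: digits with optional commas and decimals.
_TRAILING_AMOUNT = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*$")


def mark_trailing_amount(text):
    """Insert an LRM immediately before a trailing amount, if there is one."""
    match = _TRAILING_AMOUNT.search(text)
    if match is None:
        return text  # no trailing amount found; leave the line untouched
    start = match.start(1)
    return text[:start] + LRM + text[start:]


sample = "01/10/12 \u0646\u0635 \u0639\u0631\u0628\u064a 3,902.07"
marked = mark_trailing_amount(sample)
print(marked.index(LRM) == sample.index("3,902.07"))  # True
```

Because the amount sits at the end of the line in both the visual and the logical form of the sample, the same insertion point applies to either.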
You can do this automatically: when calling ICU to transform the string from visual to logical, set the reordering mode to UBIDI_REORDER_INVERSE_LIKE_DIRECT and the reordering option to UBIDI_OPTION_INSERT_MARKS if you use ICU4C, or the equivalent REORDER_INVERSE_LIKE_DIRECT and OPTION_INSERT_MARKS if you use ICU4J.


Post subject: Re: BiDi question - problems with numbers
Posted: Thu Dec 01, 2011 5:27 am

Joined: Wed Nov 30, 2011 12:08 pm
Posts: 2
Thank you for your very helpful response. We will explore some of the options you mention and will let you know if we are able to resolve our issue.

