Re: Unicode & space in programming & l10n

From: Jefsey_Morfin (jefsey@jefsey.com)
Date: Wed Sep 27 2006 - 16:13:32 CST

Next message: Jukka K. Korpela: "Re: (Not really?) Unicode question"

Previous message: Richard Wordingham: "Re: Double Aleph Mark"
Maybe in reply to: Don Osborn: "Unicode & space in programming & l10n"

At 17:09 27/09/2006, ekolehma@kotus.fi wrote:
>I don't claim to have an IQ of 0, but I have quite some difficulty
>in understanding how this relates to any subject worthy of
>discussion on this list.

The point is the memory space and bandwidth used by different
codes. As Mark Davis documented, the typographic level permits some
compression, or better space management, but only to a limited
extent. Conceptual codes and processing can reduce that space
drastically, in several ways. This is metacoms: one transmits the
metainformation needed to give the entered information back in the
form that is intelligible to each reader. A simple example: if you
receive the Bible in Chinese, to be sent to a French and a Spanish
reader, you can use an OPES (open pluggable edge service) that
identifies the text at the entry point, transmits the equivalent URN
in French and in Spanish, and has each version printed from a local
library. If you run a diff, you can even transmit only the changes to
make, keeping the traffic low. You have then dramatically reduced the
load/memory and avoided duplicating an existing file.
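
The "URN plus diff" idea can be sketched roughly as follows. This is
only an illustration under stated assumptions: the URN string, the
LOCAL_LIBRARY dictionary and the placeholder lines are hypothetical,
and Python's standard difflib stands in for whatever diff an actual
edge service would run.

```python
from difflib import SequenceMatcher

# Hypothetical local library held at the receiving edge: URN -> lines of text.
# The lines are placeholders, not real content.
LOCAL_LIBRARY = {
    "urn:example:text:fr": [f"verset {n}" for n in range(1, 1001)],
}

def make_delta(reference, edited):
    """Sender side: keep only the spans that differ from the reference copy."""
    ops = SequenceMatcher(a=reference, b=edited, autojunk=False).get_opcodes()
    return [(i1, i2, edited[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]

def apply_delta(reference, delta):
    """Receiver side: rebuild the edited text from the local copy and the delta."""
    out, pos = [], 0
    for i1, i2, new_lines in delta:
        out.extend(reference[pos:i1])  # unchanged lines come from the local library
        out.extend(new_lines)          # changed lines are the only text transmitted
        pos = i2
    out.extend(reference[pos:])
    return out

urn = "urn:example:text:fr"
edited = list(LOCAL_LIBRARY[urn])
edited[499] = "verset 500 (corrigé)"

# What travels on the network: an identifier and one changed line.
message = {"urn": urn, "delta": make_delta(LOCAL_LIBRARY[urn], edited)}
rebuilt = apply_delta(LOCAL_LIBRARY[message["urn"]], message["delta"])
```

Here the message carries one changed line instead of a thousand; the
rest is recovered from the copy the receiver already holds.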

The problem is to identify the file in the first place as the Bible,
and as being in Chinese (this is a very rigid example; in practice you
will go by quotes or concepts). So you need a language recognition
system that is transparent to the different possible character
encodings. You will have many other problems, such as font
recognition. But one of the interesting basic problems is supporting a
diff between languages that use different upper case management
systems. You cannot go by Unicode tables (you need the full range of
12 grapheme options). You need basic locale elements such as a
grapheme sorting order and a way to describe their usage equivalence
in different cultures/styles.
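
One well-known instance of the case problem is the Turkish dotted and
dotless i. The sketch below is an assumption-laden illustration, not a
general solution: the TURKISH_LOWER table and both function names are
hypothetical, and the table covers only this one tailoring.

```python
def fold_default(text):
    """Locale-independent lowercasing, as given by the default case tables."""
    return text.lower()

# Hypothetical locale element: Turkish pairs I with dotless ı, and İ with i.
TURKISH_LOWER = {"I": "ı", "İ": "i"}

def fold_turkish(text):
    """Apply the Turkish tailoring first, then the default mapping for the rest."""
    return "".join(TURKISH_LOWER.get(ch, ch) for ch in text).lower()

old, new = "ılık", "ILIK"  # the same Turkish word, lower case and upper case

same_by_default = fold_default(old) == fold_default(new)  # False: "ILIK" becomes "ilik"
same_in_turkish = fold_turkish(old) == fold_turkish(new)  # True: both become "ılık"
```

A diff that compares text case-insensitively would report a change
between these two strings under the default mapping and none under
the Turkish one.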

Metacoms extended services are certainly new to most people as an
architectural layer in a network model and in language modes. But
they are used all the time, without designers and developers
noticing that this is a general, fundamental communication process.
This is typically what a CVS or a code is about. RFC 4646 says that
if I want to say "this text is in American English", I just write
"en-us", which represents a compression of 5/32.
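
The 5/32 figure is simply a character count, which can be checked in a
couple of lines (variable names are illustrative only):

```python
description = "this text is in American English"
tag = "en-us"

# Ratio of the tag's length to the length of the phrase it stands for.
ratio = len(tag) / len(description)
print(f"{len(tag)}/{len(description)} = {ratio:.3f}")  # prints "5/32 = 0.156"
```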

jfc


This archive was generated by hypermail 2.1.5 : Wed Sep 27 2006 - 16:15:13 CST