Compression of Unicode Strings

From: Tim Garton
Date: Mon Sep 18 1995 - 10:20:31 EDT

Could someone give me some help with compression of Unicode strings? I am
working on a proposal to ETSI (European Telecommunications Standards Institute)
to support Unicode on GSM (Global System for Mobile Communications) cellular
phones. These are digital cellular phones capable of transmitting short text
messages (approximately 140 bytes). Currently GSM uses a specialized 7-bit
character set to transmit these messages. This has obviously proved inadequate,
as the GSM system has been adopted by over 60 countries (and the number is
growing rapidly), including China, Hong Kong, India, Middle Eastern countries,
all of Europe, and Russia, with limited deployment in Canada, Mexico, and the
U.S.

Currently there is a proposal before ETSI to adopt T.51, which is based on ISO
2022. I am desperately trying to kill this proposal in favor of Unicode, but
there is a lot of political inertia in ETSI. Probably the major concern about
adopting Unicode is that the 16-bit representation of characters is too
space-inefficient. I am trying to explain that a separate compression technique
on top of Unicode is a better solution than the T.51 proposal.

This is where I would like help. The current 7-bit encoding allows 160
characters to be transmitted in 140 bytes. My goal is to suggest a simple
compression technique that will allow at least 160 Unicode characters to be
compressed into 140 bytes. This compression technique must be something that
can be implemented by machines, e.g., handheld cellular phones, that have very
few computing resources. Typically a phone would be able to dedicate no more
than 1K of RAM and approximately 10K of code space to such an algorithm.
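
To make the space budget concrete, here is a rough Python sketch of the
arithmetic. It is an illustration only: the function names are hypothetical,
and the packing shown is generic 7-bit packing (low bits first), not a claim
about the exact bit layout in the GSM specification. It shows that 160 7-bit
codes fill exactly 140 bytes, while 160 uncompressed 16-bit Unicode characters
need 320 bytes, so a compressor must average 7 bits per character or better.

```python
# Illustrative sketch only; names are hypothetical and the bit ordering
# is an assumption, not taken from the GSM specification.

def pack_septets(codes):
    """Pack 7-bit values (0..127) into bytes, low-order bits first."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for c in codes:
        if not 0 <= c < 128:
            raise ValueError("not a 7-bit value")
        acc |= c << nbits
        nbits += 7
        while nbits >= 8:          # flush every complete byte
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:                      # flush a final partial byte, zero-padded
        out.append(acc & 0xFF)
    return bytes(out)

def unpack_septets(data, count):
    """Recover `count` 7-bit values from bytes produced by pack_septets."""
    acc = 0
    nbits = 0
    out = []
    for b in data:
        acc |= b << nbits
        nbits += 8
        while nbits >= 7 and len(out) < count:
            out.append(acc & 0x7F)
            acc >>= 7
            nbits -= 7
    return out

MESSAGE_BYTES = 140   # payload size of one short message
CHARS = 160           # characters the current 7-bit encoding fits

packed = pack_septets([0x41] * CHARS)
raw_unicode_bytes = CHARS * 2                  # 16 bits per character
bits_per_char_budget = MESSAGE_BYTES * 8 / CHARS

print(len(packed))            # 140
print(raw_unicode_bytes)      # 320
print(bits_per_char_budget)   # 7.0
```

In other words, the compressed form has to come in at 140/320, or under 44%,
of the raw 16-bit size for a full-length message.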

If anyone knows of a compression technique that would fit this bill I would
much appreciate them letting me know about it.

Also if anyone has any cogent arguments that clearly demonstrate the benefits
of Unicode over ISO 2022, I would much appreciate having them forwarded to me
as well.

Thanks for your time and help.


Tim Garton


Tim Garton
Engineering Section Manager
Motorola Inc.
GSM Subscriber Engineering
Phone: (708) 523-7790
E-mail:
Fax: (708) 523-2545
Pager: 9404
