Re: Compressing Vietnamese with SCSU

From: Linus Toshihiro Tanaka (ttanaka@us.oracle.com)
Date: Tue Apr 18 2000 - 15:26:49 EDT


> I tested my newly written encoder with a fairly large Vietnamese file
> (59,871 characters) and came up with the following results:
>
> VISCII: 59,871 bytes
> UTF-16: 119,744 bytes (including BOM)
> UTF-8: 79,269 bytes
> SCSU: 67,781 bytes

VISCII uses some code points in the range 0x00 - 0x1F. Have you checked
the size in another Vietnamese encoding that doesn't use that range?
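
One way to see how much of the VISCII file actually falls in that range
is to count the bytes directly. Below is a minimal sketch in Python, not
a tested tool: the helper name count_c0_bytes and the sample bytes are
made up for illustration, and it assumes tab, LF and CR are ordinary
text controls that should not be counted.

```python
# Hypothetical helper: count bytes in the C0 range 0x00-0x1F,
# skipping the usual whitespace controls (assumed: tab, LF, CR).
WHITESPACE_CONTROLS = {0x09, 0x0A, 0x0D}


def count_c0_bytes(data: bytes) -> int:
    """Return how many bytes in data are C0 codes other than tab/LF/CR."""
    return sum(1 for b in data if b <= 0x1F and b not in WHITESPACE_CONTROLS)


# Made-up sample bytes, not taken from the file discussed above.
sample = bytes([0x41, 0x02, 0x0A, 0x1E, 0x62])
print(count_c0_bytes(sample))
```

For a real file, the same function could be applied to the result of
open(path, "rb").read(); a nonzero count would show how often the text
relies on that area.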

+----------------------------------------------------------------+
| Linus Toshihiro Tanaka 500 Oracle Parkway M/S 4op7 |
| NLS Consulting Team Redwood Shores, CA 94065 USA |
| Server Globalization Technology email: ttanaka@us.oracle.com |
| Oracle Corporation |
+----------------------------------------------------------------+


