Unicode, Java package/class names, JAR files

From: Adrian Havill (havill@threeweb.ad.jp)
Date: Wed Nov 12 1997 - 22:24:26 EST

Has anyone noticed the inconsistencies in the Manifest file specification
regarding Unicode, Unicode in filenames, and ASCII?

Example from <URL:http://java.sun.com/security/usingJavakey.html>:
> The manifest is an ascii file, defined by the spec at
> <URL:http://java.sun.com/products/jdk/1.1/docs/guide/jar/manifest.html>

Jump to the spec, and we see:
> The encoding of non-ASCII characters in filenames (if they are
> supported) is defined by the archive format [PKWARE's ZIP]

Then later on in the same document:
> No line may be longer than 72 bytes (not characters), in its
> UTF8-encoded form

and the BNFish spec in the same document has the rule:
> otherchar: any Unicode character except NUL, CR and LF
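
To make the bytes-versus-characters distinction in the 72-byte rule concrete,
here is a minimal sketch in present-day Java. The class and method names are
made up for illustration, and StandardCharsets is a later addition to the
platform than the JDK 1.1 discussed here. The sketch only measures a line; it
does not implement the spec's line-continuation rules.

```java
import java.nio.charset.StandardCharsets;

public class ManifestLineLength {
    // Limit quoted from the manifest spec: 72 bytes in UTF-8 form.
    static final int MAX_LINE_BYTES = 72;

    // Number of bytes the line occupies once encoded as UTF-8.
    static int utf8Length(String line) {
        return line.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the line fits the limit without needing a continuation.
    static boolean fitsInManifestLine(String line) {
        return utf8Length(line) <= MAX_LINE_BYTES;
    }

    public static void main(String[] args) {
        // Hypothetical entry names, one ASCII and one with three kanji.
        String ascii = "Name: com/example/Hello.class";
        String kanji = "Name: com/example/\u65e5\u672c\u8a9e.class";
        System.out.println(ascii.length() + " chars, "
                + utf8Length(ascii) + " bytes");
        System.out.println(kanji.length() + " chars, "
                + utf8Length(kanji) + " bytes");
    }
}
```

For the ASCII name the two counts agree; for the kanji name each of the three
non-ASCII characters takes three bytes, so the byte count runs ahead of the
character count and the limit is reached sooner.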


My questions are:

1) Is the word "ASCII", as applied to the manifest, simply a documentation
error?

2) Can PKWARE's ZIP format, which is used by JAR, support Unicode filenames? (I
assume the answer is "Yes, via UTF-8", but I can't find any documentation on
this.) If so, JAR would provide an answer to the problem of mapping non-ASCII
class filenames on systems that don't support Unicode filenames.

3) Do any actual implementations support this yet? When I force Unicode into a
MANIFEST file, it breaks IE, Netscape, HotJava, and appletviewer.

4) Whatever happened to the method proposed in section 7.2.1 (pg. 117) of the
Java Specification for encoding Unicode package/class names on non-Unicode
filesystems ("@" + 4 hex characters for each non-ASCII character)?

5) Will JavaSoft and Microsoft ever modify the VMs for Windows 95 and NT to take
advantage of the Unicode filename support in NTFS? (I believe the long filenames
in 95 are set up to handle Unicode, but this is not currently implemented.)
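
For reference, the "@" + 4 hex scheme mentioned in question 4 can be sketched
in a few lines of Java. This is only an illustration under assumptions of my
own: the class and method names are hypothetical, "non-ASCII" is taken to mean
any char above 0x7F, the hex digits are emitted in lowercase, and a literal
"@" is passed through unchanged (it cannot appear in a Java identifier, so
package and class names should never contain one).

```java
public class AtEscape {
    // Replace each char outside 7-bit ASCII with '@' plus four hex digits.
    // Works on UTF-16 chars, so a supplementary character would come out
    // as two escapes, one per surrogate.
    static String escape(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append('@').append(String.format("%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical package path containing two accented letters.
        System.out.println(escape("crafts/papierM\u00e2ch\u00e9"));
    }
}
```

A name mapped this way contains only ASCII characters, so it can be stored on
a filesystem that has no Unicode filename support.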
