On 1/5/2016 8:26 AM, Markus Scherer wrote:
> I would specify that UTF-8 must be used, without mapping.
> US-ASCII is a proper subset, so need not be mentioned explicitly, nor
> distinguished in the protocol.
> Mappings would require that all implementations carry relevant data,
> and are up to date to recent versions of Unicode, or else
> previously-unassigned code points will cause failures.
> As long as a user types the same password the same way, or with IMEs
> that produce the same output, they are fine. Strange variants might
> improve password security.


In PRECIS, UTF-8 is enforced. However, as you point out, the issue is
that "strange variants" exist, along with different IMEs and different
keyboard/keystroke combinations. A case in point: 0xFF is never a valid
octet in UTF-8. However, nothing prevents the underlying technology
from using 0xFF, so there should be a way for a user (or process) to
force the use of specific octet strings as inputs. That is why the
"password-mapping" parameter is proposed as a hint rather than a strict
requirement.

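To make the 0xFF point concrete, here is a minimal Python sketch. The
helper names (is_valid_utf8, password_to_octets) are hypothetical and
not part of any proposal; the sketch only assumes that a caller may
supply either text or a raw octet string.

```python
from typing import Union


def is_valid_utf8(octets: bytes) -> bool:
    """Return True if the octet string decodes as strict UTF-8."""
    try:
        octets.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def password_to_octets(password: Union[str, bytes]) -> bytes:
    """Hypothetical helper: text is encoded as UTF-8 without any mapping;
    raw octets are passed through untouched, so a caller can force a
    specific octet string (including one containing 0xFF)."""
    if isinstance(password, bytes):
        return password
    return password.encode("utf-8")


# 0xFF can never appear in well-formed UTF-8.
print(is_valid_utf8("pässword".encode("utf-8")))  # True
print(is_valid_utf8(b"pass\xffword"))             # False
```

Passing bytes straight through is one possible reading of "hint rather
than strict": the mapping describes how the octets were probably
produced, but does not stop a process from supplying arbitrary octets.
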
Also, as pointed out, PKCS #8 encrypted blobs are used within PKCS #12,
which has its own Unicode mapping (based on BMPString, i.e. big-endian
UTF-16).
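
For comparison, a rough Python sketch of the PKCS #12 password
formatting as I understand it from RFC 7292, Appendix B.1: the password
is a BMPString (two octets per character, most significant octet
first) followed by a two-octet NULL terminator. The function name is
made up for illustration.

```python
def pkcs12_password_octets(password: str) -> bytes:
    """Format a password as a NULL-terminated big-endian BMPString
    (sketch based on RFC 7292, Appendix B.1)."""
    # BMPString cannot carry characters outside the Basic Multilingual Plane.
    if any(ord(ch) > 0xFFFF for ch in password):
        raise ValueError("character outside the BMP cannot be a BMPString")
    # Two octets per character, big-endian, then a two-octet terminator.
    return password.encode("utf-16-be") + b"\x00\x00"


# The same password yields different octets under the two conventions.
print("\u00e9".encode("utf-8"))          # b'\xc3\xa9'
print(pkcs12_password_octets("\u00e9"))  # b'\x00\xe9\x00\x00'
```

So the same typed password feeds different octets into key derivation
depending on which convention the container uses.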

