Unicode Frequently Asked Questions

Language Tagging

Q: Do I always need to tag text with the language?

A: No. In most cases it is not necessary. For a more complete discussion, see Section 5.10, Language Information in Plain Text, in The Unicode Standard.

Q: What are language tag characters?

A: The Unicode Standard contains a set of invisible format control characters, also known as "Plane 14 Language Tags", that can be used to spell out language tags embedded in plain text. See Section 23.9, Deprecated Tag Characters, in The Unicode Standard for a complete explanation.
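As a rough illustration of how such a tag is spelled out, the Python sketch below assumes the usual layout of the block: U+E0001 LANGUAGE TAG introduces a tag, and the tag characters U+E0020 through U+E007E mirror printable ASCII at an offset of 0xE0000. The function names are hypothetical and exist only for this example. It is shown to help recognize these characters in existing data, not to encourage producing them.

```python
LANGUAGE_TAG = 0xE0001                   # U+E0001 LANGUAGE TAG
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E   # tag characters mirroring ASCII 0x20-0x7E
TAG_OFFSET = 0xE0000                     # tag character = ASCII code point + offset


def encode_language_tag(tag):
    """Spell out an ASCII language tag such as "en" or "fr-CA" in tag characters."""
    if not all(0x20 <= ord(ch) <= 0x7E for ch in tag):
        raise ValueError("only printable ASCII maps onto tag characters")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_OFFSET + ord(ch)) for ch in tag)


def find_language_tags(text):
    """Return the ASCII form of every embedded language tag found in text."""
    tags = []
    current = None  # None means we are not currently inside a tag
    for ch in text:
        cp = ord(ch)
        if cp == LANGUAGE_TAG:
            # A new LANGUAGE TAG ends any tag already being collected.
            if current is not None:
                tags.append(current)
            current = ""
        elif current is not None and TAG_FIRST <= cp <= TAG_LAST:
            current += chr(cp - TAG_OFFSET)
        elif current is not None:
            # Any other character ends the tag.
            tags.append(current)
            current = None
    if current is not None:
        tags.append(current)
    return tags


tagged = encode_language_tag("fr-CA") + "bonjour"
print(find_language_tags(tagged))
```

The embedded tag is invisible when the string is displayed, which is part of why it is easy to overlook in stored text.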

Q: Should I be using those language tag characters?

A: No. Use of the language tag characters is strongly discouraged. They are encoded in the standard only for limited use by particular protocols that may need to provide language tagging for short strings without full-fledged markup mechanisms. Most other users who need to tag text with its language should use standard markup mechanisms, such as those provided by HTML, XML, or other rich text formats. In database contexts, language should generally be indicated by appropriate data fields, rather than by embedded language tags or markup.
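The two recommended alternatives can be sketched in Python as follows. The `LocalizedText` record and the `to_html` helper are hypothetical names invented for this example. The record stands in for a database row that keeps the language in its own field, and the helper emits the HTML `lang` attribute when the text is rendered.

```python
import html
from dataclasses import dataclass


@dataclass
class LocalizedText:
    """Hypothetical record: the language lives in its own field, not in the text."""
    text: str
    lang: str  # language tag such as "en" or "fr-CA", stored out of band


def to_html(item):
    """Render the record as HTML, carrying the language in the lang attribute."""
    return '<span lang="{}">{}</span>'.format(
        html.escape(item.lang, quote=True),  # escape the attribute value
        html.escape(item.text),              # escape the element content
    )


greeting = LocalizedText(text="Bonjour & bienvenue", lang="fr-CA")
print(to_html(greeting))
# prints: <span lang="fr-CA">Bonjour &amp; bienvenue</span>
```

In XML the same information would normally go in the `xml:lang` attribute. In either case the text itself stays free of invisible tagging characters.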