Character Set Supported in the DOI Name

DOI Names may incorporate any printable characters from the Universal Character Set (UCS) of ISO/IEC 10646, the same character repertoire that Unicode defines. At its core the Handle System uses UTF-8, an encoding of Unicode, so in its pure form it imposes no character set constraints at all: any character can be sent to, stored in, and retrieved from a Handle server.

The character set encompasses most characters used in every major language written today (see also UTF-8 Encoding of Non-ASCII Characters).
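As an illustration of the points above, the following Python sketch encodes a DOI name containing non-ASCII characters as UTF-8 and decodes it again without loss. The DOI name and the helper functions are hypothetical, and Python's `str.isprintable()` is used only as a rough stand-in for "printable character"; it is not a definition taken from the DOI or Handle System specifications.

```python
def is_printable_name(name: str) -> bool:
    """Rough check: non-empty and made up only of printable characters.

    Assumption: str.isprintable() approximates the notion of "printable";
    it is not the normative definition used for DOI names.
    """
    return len(name) > 0 and name.isprintable()


def to_utf8(name: str) -> bytes:
    """Encode a DOI name as UTF-8 bytes."""
    return name.encode("utf-8")


def from_utf8(raw: bytes) -> str:
    """Decode UTF-8 bytes back into a DOI name."""
    return raw.decode("utf-8")


# Hypothetical DOI name mixing ASCII, Latin-1, and CJK characters.
example = "10.1234/grüße-日本語"

encoded = to_utf8(example)
decoded = from_utf8(encoded)

print(is_printable_name(example))   # printable-character check
print(len(example), len(encoded))   # characters vs. UTF-8 bytes
print(decoded == example)           # round trip preserves the name
```

Non-ASCII characters occupy more than one byte in UTF-8, so the byte length of an encoded name can exceed its character count. A name that passes this check may still contain characters that are best avoided in particular contexts.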

NOTE  Some characters should be avoided: see Constraints on DOI Name Syntax in Specific Contexts.