DOI Name Encoding Rules for URL Presentation

Hex-encoding must be used when presenting a DOI name as a Uniform Resource Locator (URL) if the DOI name contains characters that are not allowed, or that have other meanings, in the URL application context. Hex-encoding replaces the character with a percent sign followed by the character's hexadecimal value. For example, # becomes %23, and https://doi.org/10.1000/456#789 is encoded as https://doi.org/10.1000/456%23789. The browser then no longer encounters a bare #, which it would normally treat as the end of the URL and the start of a fragment, so it sends the entire string to the DOI network of servers for resolution instead of stopping at the #.
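As a minimal sketch of the substitution described above, the following Python snippet hex-encodes a single character and applies it to the example DOI name. The function name hex_encode_char is hypothetical, and encoding the character as UTF-8 bytes before converting to hexadecimal is an assumption made here for illustration (it makes no difference for ASCII characters such as #).

```python
def hex_encode_char(ch: str) -> str:
    """Return '%' plus the two-digit hexadecimal value of each byte of ch.

    Assumption: the character is encoded as UTF-8 before conversion.
    """
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))


doi = "10.1000/456#789"
# Replace the bare '#' so the browser does not treat it as a fragment marker.
url = "https://doi.org/" + doi.replace("#", hex_encode_char("#"))
print(url)  # https://doi.org/10.1000/456%23789
```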

The table below lists the mandatory and recommended hex-encoding rules. The recommendations are based on practical experience with current web browsers.

 

Character    Encoding

Mandatory Rules

%            %25
"            %22
#            %23
SPACE        %20
?            %3F

Recommended Rules

<            %3C
>            %3E
{            %7B
}            %7D
^            %5E
[            %5B
]            %5D
`            %60
|            %7C
\            %5C
+            %2B
,            %2C   (only necessary in a Which RA service request context)
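The rules in the table can be sketched as a lookup applied one character at a time, so that a % introduced by an earlier substitution is never encoded a second time. This is an illustrative sketch only: the names MANDATORY, RECOMMENDED, and encode_doi_for_url, and the recommended and which_ra parameters, are hypothetical and not part of any DOI software.

```python
# Mandatory rules from the table.
MANDATORY = {"%": "%25", '"': "%22", "#": "%23", " ": "%20", "?": "%3F"}

# Recommended rules from the table (the comma is handled separately below).
RECOMMENDED = {
    "<": "%3C", ">": "%3E", "{": "%7B", "}": "%7D", "^": "%5E",
    "[": "%5B", "]": "%5D", "`": "%60", "|": "%7C", "\\": "%5C",
    "+": "%2B",
}


def encode_doi_for_url(doi: str, recommended: bool = True,
                       which_ra: bool = False) -> str:
    """Hex-encode a DOI name for use in a URL (hypothetical helper).

    recommended: also apply the recommended rules.
    which_ra:    also encode ',' (only needed in a Which RA request context).
    """
    table = dict(MANDATORY)
    if recommended:
        table.update(RECOMMENDED)
    if which_ra:
        table[","] = "%2C"
    # Single pass over the input: each original character is looked up once.
    return "".join(table.get(ch, ch) for ch in doi)


print("https://doi.org/" + encode_doi_for_url("10.1000/456#789"))
```

Building the result in one pass over the original characters avoids the double-encoding that chained string replacements could cause (for example, turning %23 into %2523).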

 

NOTE  Web browsers can treat /./ and /../ inconsistently. It is recommended that one of the slashes be percent-encoded: for example, /./ is changed to /.%2F and /../ to /..%2F.
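One possible way to apply this recommendation is a regular-expression substitution that encodes the slash following a "." or ".." path segment. The function name protect_dot_segments is hypothetical, and the sketch only covers the two patterns named in the note; a "." or ".." at the very end of a DOI name is not handled here.

```python
import re


def protect_dot_segments(doi: str) -> str:
    """Rewrite '/./' as '/.%2F' and '/../' as '/..%2F' (hypothetical helper)."""
    # Group 1 captures one or two dots between slashes; the trailing slash
    # is replaced by its percent-encoded form.
    return re.sub(r"/(\.{1,2})/", r"/\1%2F", doi)


print(protect_dot_segments("10.1000/a/./b"))   # 10.1000/a/.%2Fb
print(protect_dot_segments("10.1000/a/../b"))  # 10.1000/a/..%2Fb
```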

NOTE  To enable the use of DOI names in workflows that have already standardized on URNs, the DOI proxy servers understand the substitution of a colon in place of the initial slash in a DOI name. DOI names may therefore be expressed as URNs in the doi.org domain by writing, for example, the DOI name 10.123/456 in the form https://doi.org/urn:doi:10.123:456. However, a DOI suffix is allowed to contain other slashes, and where these occur they must be hex-encoded rather than replaced with a colon: for example, the DOI name 10.123/456ABC/zyz would become https://doi.org/urn:doi:10.123:456ABC%2Fzyz, with the final slash character encoded as %2F.
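The URN form described in this note can be sketched as follows: the first slash (between prefix and suffix) becomes a colon, and any further slashes in the suffix are hex-encoded as %2F. The function name doi_to_urn_url and the proxy parameter are hypothetical; other characters that require hex-encoding are not treated in this sketch.

```python
def doi_to_urn_url(doi: str, proxy: str = "https://doi.org/") -> str:
    """Express a DOI name as a URN-style URL on the proxy (hypothetical helper)."""
    # Split only at the first slash, which separates prefix from suffix.
    prefix, sep, suffix = doi.partition("/")
    if not sep:
        raise ValueError("a DOI name needs a prefix and a suffix separated by '/'")
    # Remaining slashes in the suffix are hex-encoded, not replaced by colons.
    encoded_suffix = suffix.replace("/", "%2F")
    return f"{proxy}urn:doi:{prefix}:{encoded_suffix}"


print(doi_to_urn_url("10.123/456"))         # https://doi.org/urn:doi:10.123:456
print(doi_to_urn_url("10.123/456ABC/zyz"))  # https://doi.org/urn:doi:10.123:456ABC%2Fzyz
```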