normalize-unicode

Normalizes a string according to a Unicode normalization form

Description

The fn:normalize-unicode function performs Unicode normalization, which allows text to be compared without regard to subtle variations in character representation. It replaces certain characters with equivalent representations. Two normalized values can then be compared to determine whether they are the same. Unicode normalization is also useful for allowing character strings to be sorted appropriately.

The $normalizationForm argument controls which normalization form is used, and hence which characters are replaced.

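All implementations support the value NFC for $normalizationForm, and NFC is also the form used when the argument is omitted. Some implementations may support other values, such as NFD, NFKC, NFKD, and FULLY-NORMALIZED. If $normalizationForm is a zero-length string, no normalization is performed and the string is returned unchanged.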
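The comparison use case described above can be sketched outside XPath. The following Python snippet is only an illustration that uses the standard unicodedata module rather than an XPath processor; the helper name same_text is an assumption made for this example.

```python
import unicodedata


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form.

    Hypothetical helper: it mirrors comparing
    normalize-unicode($a) with normalize-unicode($b) in XPath.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# "leçon" written with a precomposed c-cedilla (U+00E7)
precomposed = "le\u00e7on"
# "leçon" written as "c" followed by a combining cedilla (U+0327)
decomposed = "lec\u0327on"

print(precomposed == decomposed)           # False: the code points differ
print(same_text(precomposed, decomposed))  # True: equal once both are in NFC
```

Both strings render identically, yet a plain comparison treats them as different. Normalizing both sides to the same form first removes that difference.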

Parameters	Description
arg:string()	the string to normalize
normalizationForm:string()	the normalization form

Examples

XPath	Results	Explanation
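normalize-unicode('query')	query	Plain ASCII text is already normalized, so it is returned unchanged.
normalize-unicode('query', '')	query	A zero-length $normalizationForm means that no normalization is performed.
normalize-unicode('￡', 'NFKC')	£	Converts the full-width pound sign ￡ (U+FFE1) to £ (U+00A3).
normalize-unicode('leçon', 'NFKD')	leçon	Converts the precomposed ç (U+00E7) to a "c" followed by a separate combining cedilla (U+0327). The output therefore has six characters instead of five, but the "c" and the cedilla are probably not visible separately in your browser.
normalize-unicode('15 ㎗')	15 ㎗	The default form, NFC, leaves the compatibility character ㎗ (U+3397) unchanged.
normalize-unicode('15 ㎗', 'NFKC')	15 dl	Converts ㎗ to the letters "dl".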
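
Because several of these results look identical to their inputs on screen, it can help to inspect the code points directly. The sketch below reproduces the rows of the table with Python's standard unicodedata module. It is an approximation of fn:normalize-unicode, not the XPath function itself: normalize_unicode and code_points are hypothetical helpers, and the set of supported forms is the one Python offers, which may differ from a given XPath processor.

```python
import unicodedata

# Forms offered by Python's unicodedata module (an XPath processor may differ).
SUPPORTED_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


def normalize_unicode(arg: str, form: str = "NFC") -> str:
    """Hypothetical stand-in for fn:normalize-unicode."""
    if form == "":
        # A zero-length form means no normalization is performed.
        return arg
    if form not in SUPPORTED_FORMS:
        raise ValueError(f"unsupported normalization form: {form!r}")
    return unicodedata.normalize(form, arg)


def code_points(s: str) -> str:
    """List the code points of a string, e.g. 'U+0063 U+0327'."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


# Full-width pound sign (U+FFE1) becomes the ordinary pound sign.
print(code_points(normalize_unicode("\uffe1", "NFKC")))      # U+00A3

# Precomposed c-cedilla is split into "c" plus a combining cedilla.
print(code_points(normalize_unicode("le\u00e7on", "NFKD")))
# U+006C U+0065 U+0063 U+0327 U+006F U+006E

# The default form NFC leaves the square "dl" sign (U+3397) alone,
# while NFKC replaces it with the two letters.
print(normalize_unicode("15 \u3397") == "15 \u3397")         # True
print(normalize_unicode("15 \u3397", "NFKC"))                # 15 dl
```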