normalize-unicode

Normalizes a string according to a Unicode normalization form.

Description

The fn:normalize-unicode function performs Unicode normalization, which allows text to be compared without regard to subtle variations in character representation. It replaces certain characters with equivalent representations. Two normalized values can then be compared to determine whether they are the same. Unicode normalization is also useful for allowing character strings to be sorted appropriately.
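The comparison use case can be sketched outside XPath. The snippet below is only an analogue: it uses Python's standard-library `unicodedata.normalize`, which applies the same Unicode normalization forms, and the sample strings are illustrative rather than taken from this page.

```python
import unicodedata

# Two representations of the same visible text:
# precomposed "é" (U+00E9) versus "e" + U+0301 COMBINING ACUTE ACCENT.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

# The raw strings differ, so a plain comparison fails.
raw_equal = composed == decomposed

# Once both are normalized to the same form (NFC here), they compare equal.
nfc_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

print(raw_equal, nfc_equal)
```

Both sides must be normalized to the same form before comparing; normalizing only one of them gives no guarantee.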

The $normalizationForm argument controls which normalization form is used, and hence which characters are replaced.

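Every conforming implementation supports the value NFC for $normalizationForm, which is also the default when the argument is omitted. Support for other values, such as NFD, NFKC, NFKD, and FULLY-NORMALIZED, varies by implementation. A zero-length string means that no normalization is performed and the input is returned unchanged.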
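How the form argument is interpreted can be sketched as follows. This is a hypothetical Python analogue, not the XPath implementation: the name `normalize_unicode`, the `SUPPORTED_FORMS` set, and the use of `ValueError` for an unsupported form are assumptions made for illustration. It assumes the form is whitespace-normalized and upper-cased before use, that a zero-length form skips normalization, and that an absent input yields an empty string.

```python
import unicodedata

# Forms this sketch accepts (the ones Python's unicodedata provides).
SUPPORTED_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


def normalize_unicode(arg, normalization_form="NFC"):
    """Hypothetical analogue of fn:normalize-unicode, for illustration only."""
    if arg is None:
        # Models an empty-sequence argument: the result is a zero-length string.
        return ""
    # Effective form: collapse/trim whitespace, then upper-case.
    form = " ".join(normalization_form.split()).upper()
    if form == "":
        # Zero-length form: no normalization is performed.
        return arg
    if form not in SUPPORTED_FORMS:
        # An XPath processor would raise a dynamic error here instead.
        raise ValueError(f"unsupported normalization form: {form}")
    return unicodedata.normalize(form, arg)
```

For instance, `normalize_unicode("cafe\u0301")` returns the precomposed `"caf\u00e9"`, while `normalize_unicode("cafe\u0301", "")` returns its input untouched.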

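Parameters

| Parameter | Description |
| --- | --- |
| arg:string() | The string to normalize. |
| normalizationForm:string() | The normalization form to apply, for example NFC or NFKC. Optional; defaults to NFC. |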

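Examples

| XPath | Results | Explanation |
| --- | --- | --- |
| normalize-unicode('query') | query | Plain ASCII text is already in NFC, the default form, so it is returned unchanged. |
| normalize-unicode('query', '') | query | A zero-length normalization form means no normalization is performed. |
| normalize-unicode('￡', 'NFKC') | £ | Converts the fullwidth pound sign ￡ (U+FFE1) to the ordinary pound sign £ (U+00A3). |
| normalize-unicode('leçon', 'NFKD') | leçon | Converts the precomposed ç (U+00E7) to a "c" followed by a separate combining cedilla (U+0327). These are two characters in the output, but they usually render as a single glyph. |
| normalize-unicode('15 ㎗') | 15 ㎗ | NFC, the default form, leaves compatibility characters such as ㎗ (U+3397) unchanged. |
| normalize-unicode('15 ㎗', 'NFKC') | 15 dl | Converts ㎗ to the letters 'dl'. |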
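Because several of these characters look alike when rendered, it can help to inspect code points directly. The sketch below reproduces the examples with Python's `unicodedata.normalize` as a stand-in for the XPath function. The helper `code_points` is hypothetical, and the fullwidth pound sign (U+FFE1) is an assumed input for the pound example.

```python
import unicodedata


def code_points(text):
    """Render a string as a list of U+XXXX code points."""
    return [f"U+{ord(ch):04X}" for ch in text]


# (input, form) pairs mirroring the examples above.
cases = [
    ("query", "NFC"),
    ("\uffe1", "NFKC"),      # fullwidth pound sign (assumed input)
    ("le\u00e7on", "NFKD"),  # precomposed c with cedilla
    ("15 \u3397", "NFC"),    # U+3397 SQUARE DL, left alone by NFC
    ("15 \u3397", "NFKC"),   # compatibility mapping to plain letters
]

results = [unicodedata.normalize(form, text) for text, form in cases]

for (text, form), out in zip(cases, results):
    print(form, code_points(text), "->", code_points(out))
```

Printing code points rather than the strings themselves makes the decomposed cedilla and the compatibility mappings visible even when the rendered glyphs are identical.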