Description
Arbitrary HTML present after sanitization because of unicode normalization
Impact
With keep_typographic_whitespace=False (the default), the sanitizer normalizes Unicode to the NFKC form as its final step, after the markup has already been sanitized. Some Unicode characters normalize to chevrons (< and >) under NFKC, so specially crafted input can turn into real HTML tags at that point and escape sanitization.
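The normalization step described above can be reproduced with Python's standard unicodedata module alone. The sketch below does not call html-sanitizer; the code points listed and the payload string are illustrative choices, not taken from the advisory.

```python
import unicodedata

# Code points whose NFKC compatibility mapping is an ASCII chevron.
# (Illustrative selection; the advisory does not name specific characters.)
LOOKALIKES = {
    "\uFE64": "<",  # SMALL LESS-THAN SIGN
    "\uFE65": ">",  # SMALL GREATER-THAN SIGN
    "\uFF1C": "<",  # FULLWIDTH LESS-THAN SIGN
    "\uFF1E": ">",  # FULLWIDTH GREATER-THAN SIGN
}

# Hypothetical payload: it contains no ASCII chevrons, so an HTML parser
# would treat it as plain text rather than as a tag.
payload = "\uFF1Cscript\uFF1Ealert(1)\uFF1C/script\uFF1E"

# Normalizing after sanitization turns the lookalikes into real markup.
normalized = unicodedata.normalize("NFKC", payload)

print(repr(payload))
print(repr(normalized))
```

Because the payload only becomes markup after normalization, any check that runs before the NFKC step has nothing to strip.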
Patches
The problem has been fixed in 2.4.2.
Workarounds
Set keep_typographic_whitespace=True explicitly, or normalize the input to NFKC yourself before passing it to the sanitizer.
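A minimal sketch of the second workaround follows, assuming the sanitizer is available as a callable that takes and returns a string. The helper name sanitize_prenormalized is hypothetical. With html-sanitizer, the callable would presumably be the sanitize method of a Sanitizer instance, which is an assumption about how the library is wired in, not something the advisory states.

```python
import unicodedata
from typing import Callable


def sanitize_prenormalized(html: str, sanitize: Callable[[str], str]) -> str:
    """Apply NFKC before sanitizing, so chevron lookalikes are already
    real '<' and '>' when the sanitizer inspects the markup.

    `sanitize` is any str -> str sanitizer (assumed, for html-sanitizer,
    to be something like Sanitizer().sanitize).
    """
    return sanitize(unicodedata.normalize("NFKC", html))
```

NFKC is idempotent, so a second normalization pass over text that was already normalized does not change it again.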
Packages
Package: html-sanitizer
Affected versions: < 2.4.2
Fixed version: 2.4.2
Related vulnerabilities
html-sanitizer is an allowlist-based HTML cleaner. If using `keep_typographic_whitespace=False` (which is the default), the sanitizer normalizes unicode to the NFKC form at the end. Some unicode characters normalize to chevrons; this allows specially crafted HTML to escape sanitization. The problem has been fixed in 2.4.2.