Description
lxml-html-clean has <base> tag injection through default Cleaner configuration
Summary
The <base> tag passes through the default Cleaner configuration. While page_structure=True removes html, head, and title tags, there is no specific handling for <base>, allowing an attacker to inject it and hijack relative links on the page.
Details
The <base> tag is not currently in the page_structure kill set. Even though the specification says <base> must be inside <head>, browsers accept <base> tags outside of the head.
If an attacker injects a <base> tag, it changes the base URL for all relative URLs on the page (links, images, scripts) to a domain controlled by the attacker.
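The effect on URL resolution can be sketched with the standard library alone. This is only an approximation of browser behavior: `urllib.parse.urljoin` follows RFC 3986 resolution, which matches what browsers do for simple cases like these. The page URL, the injected base URL, and the relative references are hypothetical values chosen for illustration.

```python
from urllib.parse import urljoin

# Hypothetical URLs, chosen only for illustration.
PAGE_URL = "https://app.example/account/settings"
INJECTED_BASE = "https://attacker.example/"

# Relative references of the kind a page typically contains.
RELATIVE_REFS = ["/login", "assets/app.js", "img/logo.png"]


def resolve_all(base_url, refs):
    """Resolve each relative reference against base_url."""
    return {ref: urljoin(base_url, ref) for ref in refs}


# Without <base>, the document URL is the base for resolution.
without_base = resolve_all(PAGE_URL, RELATIVE_REFS)
# With an injected <base href=...>, that href becomes the base instead.
with_base = resolve_all(INJECTED_BASE, RELATIVE_REFS)

for ref in RELATIVE_REFS:
    print(f"{ref!r}: {without_base[ref]} -> {with_base[ref]}")
```

Every reference that previously resolved to the application's origin resolves to the attacker's origin once the injected base URL is in effect.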
PoC
Impact
The injection of a <base> tag allows an attacker to hijack the resolution of all relative URLs on the page. This results in three critical attack vectors:
- Phishing & Redirection: Attackers can redirect user navigation (e.g., <a href="/login">) and form submissions (e.g., <form action="/auth">) to an attacker-controlled domain, effectively stealing credentials or sensitive data without the user realizing they have left the legitimate site.
- Cross-Site Scripting (XSS): If the victim application loads JavaScript files using relative paths (e.g., <script src="assets/app.js">), the browser will attempt to fetch the script from the attacker's domain. This upgrades the vulnerability from HTML injection to full Stored XSS.
- Defacement: Relative references to images (<img>) and stylesheets (<link>) will be loaded from the attacker's server, allowing for UI redressing or defacement.
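Given these consequences, an application that cannot upgrade immediately might want an independent check on sanitized output. The following is a minimal sketch, not part of lxml-html-clean, that scans markup for `<base>` elements using the standard library's `html.parser`. The class and function names are hypothetical, and the sample markup is illustrative.

```python
from html.parser import HTMLParser


class BaseTagFinder(HTMLParser):
    """Collects the href of every <base> element seen anywhere in the markup."""

    def __init__(self):
        super().__init__()
        self.base_hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this,
        # so <BASE HREF=...> is caught as well.
        if tag == "base":
            self.base_hrefs.append(dict(attrs).get("href"))


def find_base_hrefs(markup):
    """Return the href values of all <base> tags found in markup."""
    finder = BaseTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.base_hrefs


if __name__ == "__main__":
    sample = '<p>hi</p><base href="https://attacker.example/"><a href="/login">x</a>'
    print(find_base_hrefs(sample))
```

A non-empty result on already-sanitized content would indicate that a `<base>` tag survived cleaning and that the output should be rejected or cleaned again.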
Packages
lxml-html-clean
Affected versions: <= 0.4.3
Patched version: 0.4.4
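A deployment can check whether its installed release falls in the affected range. The sketch below assumes the distribution is discoverable under the name `lxml_html_clean` and that the version string is plain dotted numbers. The helper names are hypothetical.

```python
from importlib import metadata

# First fixed release, per the version range listed above.
PATCHED = (0, 4, 4)


def parse_version(text):
    """Turn '0.4.3' into (0, 4, 3); assumes a plain dotted numeric version."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version_text):
    """True if the given version is older than the patched release."""
    return parse_version(version_text) < PATCHED


def installed_is_affected(dist_name="lxml_html_clean"):
    """Return True/False for the installed release, or None if not installed."""
    try:
        return is_affected(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_is_affected())
```

Version strings with suffixes such as pre-release or post-release tags would need a fuller parser than this sketch provides.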
Related vulnerabilities
lxml_html_clean is a project for HTML cleaning functionalities copied from `lxml.html.clean`. Prior to version 0.4.4, the <base> tag passes through the default Cleaner configuration. While page_structure=True removes html, head, and title tags, there is no specific handling for <base>, allowing an attacker to inject it and hijack relative links on the page. This issue has been patched in version 0.4.4.