Description
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.1.0 up to, but not including, 0.10.1.1, a denial of service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTTP endpoint. This results in server memory exhaustion, potentially leading to a crash or unresponsiveness. The attack does not require authentication, making it exploitable by any remote user. This vulnerability is fixed in 0.10.1.1.
A flaw was found in vLLM. A denial of service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large X-Forwarded-For header to an HTTP endpoint. This results in server memory exhaustion, potentially leading to a crash or unresponsiveness. The attack does not require authentication, making it exploitable by any remote user.
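Since the fix is available in 0.10.1.1, the quickest exposure check is to compare the installed vLLM version against that release. The snippet below is a minimal sketch, assuming vLLM is installed in the current Python environment and that the packaging library is available; it is an illustration, not part of the advisory itself.

```python
# Minimal sketch: report whether the installed vLLM version falls in the
# affected range (>= 0.1.0, < 0.10.1.1). Assumes `vllm` and `packaging`
# are installed in the current Python environment.
from importlib.metadata import PackageNotFoundError, version
from packaging.version import Version

FIRST_AFFECTED = Version("0.1.0")
FIXED = Version("0.10.1.1")

try:
    installed = Version(version("vllm"))
except PackageNotFoundError:
    print("vLLM is not installed in this environment.")
else:
    if FIRST_AFFECTED <= installed < FIXED:
        print(f"vLLM {installed} is in the affected range; upgrade to {FIXED} or later.")
    else:
        print(f"vLLM {installed} is outside the affected range.")
```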
Statement
This vulnerability is considered Important rather than Moderate because it enables a complete denial of service with minimal effort from a remote, unauthenticated attacker. Unlike Moderate flaws, which may require specific conditions, partial access, or complex exploitation chains, a single oversized HTTP request is enough to exhaust server memory and crash the vLLM service. Since vLLM is often deployed as a backend for high-availability inference workloads, the impact is severe: availability is entirely compromised, all running workloads are disrupted, and recovery may require manual intervention. Because no authentication is required, the attack surface is fully exposed over the network, which elevates the severity from Moderate to Important.
Mitigation
Until the patched release (0.10.1.1) can be deployed, the risk can be reduced by running vLLM behind a reverse proxy such as Nginx, Envoy, or HAProxy with strict header size limits, so that oversized requests are dropped before they reach the service. Additional safeguards such as container or VM resource limits and traffic monitoring can help contain the impact, but upgrading to the patched release remains the definitive solution.
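As an illustration of the reverse-proxy approach, the sketch below shows an Nginx configuration that caps request header sizes before traffic is forwarded to vLLM. The upstream address (127.0.0.1:8000), the buffer sizes, and the body-size limit are assumptions to be adapted to the actual deployment; they are not values mandated by vLLM or this advisory.

```nginx
# Minimal sketch: Nginx in front of vLLM with strict header size limits.
# Assumption: the vLLM HTTP server listens on 127.0.0.1:8000.
server {
    listen 80;

    # Requests whose headers exceed these buffers are rejected by Nginx
    # and never reach the vLLM process.
    client_header_buffer_size 4k;
    large_client_header_buffers 4 8k;

    # Optional: also cap the request body size.
    client_max_body_size 10m;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Envoy and HAProxy offer equivalent header-size limits; the key design point is that the limit is enforced at the edge, so oversized requests are rejected before they can consume memory in the inference service.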
Affected packages
| Platform | Package | State | Errata | Release |
|---|---|---|---|---|
| Red Hat AI Inference Server | rhaiis/vllm-cuda-rhel9 | Affected | | |
| Red Hat AI Inference Server | rhaiis/vllm-rocm-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-amd-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-aws-nvidia-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-azure-amd-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-azure-nvidia-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-gcp-nvidia-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-intel-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-nvidia-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/disk-image-nvidia-rhel9 | Affected | | |
Additional information
Status:
CVSS3 base score: 7.5 (High)
Related vulnerabilities
vllm API endpoints vulnerable to Denial of Service Attacks (CVSS3 base score: 7.5 High)