CVE-2025-62426

Published: 21 Nov 2025

Source: redhat

CVSS3: 6.5

EPSS: Low

Description

vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat template. With the right chat_template_kwargs parameters, it is possible to block processing of the API server for long periods of time, delaying all other requests. This issue has been patched in version 0.11.1.
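The affected range stated above (0.5.5 up to, but not including, 0.11.1) can be checked mechanically. The following is a minimal sketch, not an official tool: the helper names `parse_version` and `is_affected` are hypothetical, and the parser deliberately ignores pre-release and local suffixes (such as `rc1` or `+cu121`), so release candidates of 0.11.1 should be reviewed by hand.

```python
import re

# Bounds taken from the advisory text: affected from 0.5.5, patched in 0.11.1.
AFFECTED_FROM = (0, 5, 5)
FIXED_IN = (0, 11, 1)


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse the leading 'X.Y' or 'X.Y.Z' of a version string.

    Anything after the numeric part (e.g. 'rc1', '+cu121') is ignored.
    This is an illustrative simplification, not a full PEP 440 parser.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_affected(version: str) -> bool:
    """Return True if the version lies in the half-open range [0.5.5, 0.11.1)."""
    return AFFECTED_FROM <= parse_version(version) < FIXED_IN
```

For an installed package, the version string could be obtained with `importlib.metadata.version("vllm")` and passed to `is_affected`.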

A vulnerability in vLLM allows an authenticated user to trigger unintended tokenization during chat template processing by supplying crafted chat_template_kwargs to the /v1/chat/completions or /tokenize endpoints. By forcing the server to tokenize very large inputs, an attacker can block the API server’s event loop for extended periods, causing a denial of service and delaying all other requests.
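To illustrate why a single blocked event loop delays every other request, here is a minimal sketch using Python's `asyncio`. It is not vLLM code: `simulated_tokenize`, `slow_request`, `fast_request`, and `run_demo` are hypothetical names, and `time.sleep` merely stands in for expensive tokenization work.

```python
import asyncio
import time


def simulated_tokenize(seconds: float) -> None:
    """Hypothetical stand-in for tokenizing a very large input."""
    time.sleep(seconds)


async def slow_request(log: list[str], offload: bool) -> None:
    log.append("slow: start")
    if offload:
        # Run the blocking work in a worker thread; the event loop stays free.
        await asyncio.to_thread(simulated_tokenize, 0.2)
    else:
        # Run the blocking work on the event loop thread; nothing else can run.
        simulated_tokenize(0.2)
    log.append("slow: end")


async def fast_request(log: list[str]) -> None:
    log.append("fast: done")


async def serve(offload: bool) -> list[str]:
    log: list[str] = []
    # Both "requests" arrive at the same time and share one event loop.
    await asyncio.gather(slow_request(log, offload), fast_request(log))
    return log


def run_demo(offload: bool) -> list[str]:
    """Return the order in which the two requests made progress."""
    return asyncio.run(serve(offload))
```

With `run_demo(False)` the fast request is recorded only after the slow one has finished, which mirrors the delay described above. With `run_demo(True)` the fast request completes while the slow one is still working. Note that `time.sleep` releases the interpreter lock, so this sketch shows the scheduling effect only; real CPU-bound work may need a separate process to avoid contention.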

Statement

The flaw is limited to a denial-of-service vector that requires an authenticated user and relies on abusing an optional, non-security-critical parameter (chat_template_kwargs) to force unexpected tokenization during template application, which is computationally expensive but not indicative of data corruption, privilege escalation, or code execution. The attacker cannot break isolation boundaries or execute arbitrary logic—they can only cause the server’s event loop to stall through large crafted inputs, and only if they already have access to the vLLM API. Moreover, the DoS condition is resource-intensive, depends heavily on model size and server configuration, and does not persist once the malicious request completes. Because the impact is bounded to temporary availability degradation without confidentiality or integrity loss, and because exploitation requires legitimate API access and large payloads, this issue aligns with a Moderate severity rather than an Important/High flaw.

Mitigation

No mitigation is currently available that meets Red Hat Product Security’s standards for usability, deployment, applicability, or stability.

Affected packages

Platform	Package	State
Red Hat AI Inference Server	rhaiis/vllm-spyre-rhel9	Fix deferred
Red Hat AI Inference Server	rhaiis/vllm-tpu-rhel9	Fix deferred
Red Hat Enterprise Linux AI (RHEL AI)	rhelai1/bootc-amd-rhel9	Fix deferred
Red Hat Enterprise Linux AI (RHEL AI)	rhelai1/bootc-aws-nvidia-rhel9	Fix deferred
Red Hat Enterprise Linux AI (RHEL AI)	rhelai1/bootc-azure-amd-rhel9	Fix deferred
Red Hat Enterprise Linux AI (RHEL AI)	rhelai1/bootc-azure-nvidia-rhel9	Fix deferred
Red Hat Enterprise Linux AI (RHEL AI)	rhelai1/bootc-gcp-nvidia-rhel9	Fix deferred
Red Hat Enterprise Linux AI (RHEL AI)	rhelai1/bootc-intel-rhel9	Fix deferred
Red Hat Enterprise Linux AI (RHEL AI)	rhelai1/bootc-nvidia-rhel9	Fix deferred
Red Hat Enterprise Linux AI (RHEL AI)	rhelai1/instructlab-amd-rhel9	Fix deferred

Additional information

Status:

Moderate

Weakness:

CWE-770

https://bugzilla.redhat.com/show_bug.cgi?id=2416278
vllm: vLLM vulnerable to DoS via large Chat Completion or Tokenization requests with specially crafted `chat_template_kwargs`

EPSS

Percentile: 25%

0.00087

Low

6.5 Medium

CVSS3

Related vulnerabilities

CVE-2025-62426

CVSS3: 6.5

nvd

5 months ago

CVE-2025-62426

CVSS3: 6.5

debian

5 months ago

vLLM is an inference and serving engine for large language models (LLM ...

GHSA-69j4-grxj-j64p

CVSS3: 6.5

github

5 months ago

vLLM vulnerable to DoS via large Chat Completion or Tokenization requests with specially crafted `chat_template_kwargs`

BDU:2025-14680

CVSS3: 6.5

fstec

5 months ago

A vulnerability in vLLM, a library for working with large language models (LLMs), caused by unrestricted resource allocation, which allows an attacker to cause a denial of service
