Description
vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, users can crash the vLLM engine serving multimodal models by passing multimodal embedding inputs with the correct ndim but an incorrect shape (e.g., the hidden dimension is wrong), regardless of whether the model is intended to support such inputs (as defined in the Supported Models page). This issue has been patched in version 0.11.1.
A denial-of-service vulnerability in vLLM allows an attacker with API access to crash the engine by submitting multimodal embedding tensors that have the correct number of dimensions but an invalid internal shape. Because vLLM validates only the tensor’s ndim and not the full expected shape, malformed embeddings trigger shape mismatches or validation failures during processing, causing the inference engine to terminate.
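The validation gap described above can be sketched as follows, using NumPy arrays as stand-ins for the embedding tensors. This is an illustrative example, not vLLM's actual code: the constant `EXPECTED_HIDDEN` and both function names are hypothetical.

```python
# Hypothetical sketch of the validation gap (not vLLM's actual code).
import numpy as np

EXPECTED_HIDDEN = 4096  # assumed model hidden size


def validate_ndim_only(emb) -> bool:
    # Flawed check: accepts any 2-D tensor, regardless of its hidden
    # dimension, so a malformed embedding passes validation and crashes
    # the engine later during processing.
    return emb.ndim == 2


def validate_full_shape(emb) -> bool:
    # Stricter check: also require the last dimension to match the
    # model's hidden size, rejecting malformed input up front.
    return emb.ndim == 2 and emb.shape[-1] == EXPECTED_HIDDEN


bad = np.zeros((16, 1024))  # correct ndim, wrong hidden dimension
print(validate_ndim_only(bad))   # passes the flawed check
print(validate_full_shape(bad))  # rejected by the full-shape check
```

The fix in 0.11.1 follows this general pattern: validating the full expected shape of the embedding tensor rather than only its number of dimensions, so malformed input is rejected with an error instead of crashing the engine.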
Statement
This flaw is rated Moderate rather than Important because its impact is limited strictly to availability and exploitation requires low, but existing, privileges. The issue arises from incomplete shape validation of multimodal embedding tensors, which can cause deterministic crashes in the inference engine, but it does not enable memory corruption, data leakage, integrity compromise, or execution of arbitrary code. Exploitation requires an authenticated or API-key-holding user to submit malformed multimodal inputs, meaning it cannot be triggered by an unauthenticated attacker on an exposed endpoint. Additionally, the failure mode is a clean crash rather than undefined behavior, so the blast radius is constrained to service interruption rather than broader systemic compromise. These factors (the PR:L requirement, no confidentiality or integrity impact, a deterministic failure mode, and a DoS-only scope) align the issue with Moderate rather than Important severity.
Mitigation
No mitigation is currently available that meets Red Hat Product Security’s standards for usability, deployment, applicability, or stability.
Affected Packages
| Platform | Package | State | Recommendation | Release |
|---|---|---|---|---|
| Red Hat AI Inference Server | rhaiis/vllm-spyre-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-amd-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-aws-nvidia-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-azure-amd-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-azure-nvidia-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-gcp-nvidia-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-intel-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-nvidia-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/instructlab-amd-rhel9 | Fix deferred | | |
| Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/instructlab-intel-rhel9 | Fix deferred | | |
References
Additional information
CVSS3: 6.5 Medium
Related vulnerabilities
vLLM vulnerable to DoS with incorrect shape of multimodal embedding inputs
A vulnerability in the vLLM library for working with large language models (LLMs), related to unchecked array indexing, allowing an attacker to cause a denial of service