Description
vLLM is an inference and serving engine for large language models (LLMs). Starting in version 0.10.1 and prior to version 0.14.0, vLLM loads Hugging Face `auto_map` dynamic modules during model resolution without gating on `trust_remote_code`, allowing attacker-controlled Python code in a model repo/path to execute at server startup. An attacker who can influence the model repo/path (local directory or remote Hugging Face repo) can achieve arbitrary code execution on the vLLM host during model load. This happens before any request handling and does not require API access. Version 0.14.0 fixes the issue.
A flaw was found in vLLM, an inference and serving engine for large language models (LLMs). This vulnerability allows a remote attacker to achieve arbitrary code execution on the vLLM host during model loading. This occurs because vLLM loads Hugging Face `auto_map` dynamic modules without properly validating the `trust_remote_code` setting. By influencing the model repository or path, an attacker can execute malicious Python code at server startup, even before any API requests are handled.
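The class of fix described above can be illustrated with a minimal sketch (not vLLM's actual code): when a model's config declares an `auto_map` (custom Python classes shipped inside the repo), the loader must refuse to import that code unless the operator has explicitly opted in via `trust_remote_code`. Function and key names below are illustrative assumptions.

```python
# Minimal sketch of gating repo-supplied dynamic code on trust_remote_code.
# This is NOT vLLM's implementation; names are hypothetical.

def resolve_model_class(config: dict, trust_remote_code: bool = False) -> str:
    """Return the model class to load, refusing repo-supplied code
    unless the operator explicitly opted in."""
    auto_map = config.get("auto_map")
    if auto_map:
        if not trust_remote_code:
            # The vulnerable behavior skipped this check and imported
            # the repo's Python module unconditionally.
            raise ValueError(
                "config declares auto_map (custom code shipped in the "
                "model repo); pass trust_remote_code=True to allow it"
            )
        # Only reached after explicit opt-in; a real loader would
        # import the repo's module here.
        return auto_map["AutoModel"]
    # Built-in architectures require no code from the repo.
    return config["architectures"][0]
```

With this gate in place, a repo that ships custom code fails fast at load time instead of silently executing attacker-controlled Python.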
Statement
This vulnerability is rated Important for Red Hat as vLLM, an inference and serving engine for large language models, is vulnerable to arbitrary code execution. An attacker influencing the model repository or path can execute malicious Python code during server startup, affecting vLLM versions 0.10.1 through 0.13.x.
Mitigation
To mitigate this issue, ensure that vLLM instances are configured to load models only from trusted and verified repositories. Restrict access to the model repository path to prevent unauthorized modification or introduction of malicious code. Implement strict access controls and integrity checks for all model sources.
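One way to implement the integrity checks mentioned above is to pin a SHA-256 manifest of a trusted model snapshot and verify a local model directory against it before pointing vLLM at it. The helper below is a hypothetical sketch, not a vLLM or Red Hat tool.

```python
# Hypothetical integrity-check helper: verify every file in a local
# model directory against a pinned SHA-256 manifest, so injected or
# tampered files (e.g. a malicious modeling_*.py) are caught before
# the model is ever loaded.
import hashlib
from pathlib import Path

def verify_model_dir(model_dir: str, manifest: dict) -> bool:
    """Return True iff the directory's files exactly match the manifest
    (relative path -> sha256 hex digest)."""
    root = Path(model_dir)
    seen = {}
    for path in root.rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            seen[str(path.relative_to(root))] = digest
    # Extra, missing, or modified files all cause a mismatch.
    return seen == manifest
```

Refusing to start the server when verification fails ensures that even a compromised model source cannot slip executable code into the load path.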
Affected Packages
| Platform | Package | State | Advisory | Release |
|---|---|---|---|---|
| Red Hat AI Inference Server | rhaiis/vllm-spyre-rhel9 | Affected | | |
| Red Hat AI Inference Server | rhaiis/vllm-tpu-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) 3 | rhelai3/bootc-aws-cuda-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) 3 | rhelai3/bootc-azure-cuda-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) 3 | rhelai3/bootc-cuda-rhel9 | Affected | | |
| Red Hat Enterprise Linux AI (RHEL AI) 3 | rhelai3/bootc-gcp-cuda-rhel9 | Affected | | |
| Red Hat OpenShift AI (RHOAI) | rhoai/odh-kserve-agent-rhel9 | Not affected | | |
| Red Hat OpenShift AI (RHOAI) | rhoai/odh-kserve-controller-rhel9 | Not affected | | |
| Red Hat OpenShift AI (RHOAI) | rhoai/odh-kserve-router-rhel9 | Not affected | | |
| Red Hat OpenShift AI (RHOAI) | rhoai/odh-kserve-storage-initializer-rhel9 | Not affected | | |
CVSS3 Base Score: 8.8 (High)
vLLM affected by RCE via auto_map dynamic module loading during model initialization