Description
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts. Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker. By potentially connecting to this socket many times and not reading data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher. This issue has been patched in version 0.8.5.
A flaw was found in vLLM's multi-node setup, which exposes sensitive data over a ZeroMQ XPUB socket bound to all interfaces. This vulnerability allows unauthorized clients to intercept and read internal communications if they can access the network.
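To make the exposure pattern concrete, below is a minimal, self-contained pyzmq sketch of the behavior described above. It is not vLLM's actual code: the port number, addresses, and payload are hypothetical stand-ins for vLLM's internal multi-node coordination traffic. It shows how an XPUB socket bound to all interfaces delivers every broadcast message to any subscriber that can reach the port.

```python
import time
import zmq

ctx = zmq.Context()

# Primary host side: an XPUB socket bound to ALL interfaces, as described above.
# The port is hypothetical; any peer that can reach it may connect.
pub = ctx.socket(zmq.XPUB)
pub.bind("tcp://*:5555")

# "Attacker" side: a plain SUB socket on the same network. It connects to
# localhost here so the sketch is self-contained; in a real multi-node setup
# it would use the primary vLLM host's address.
sub = ctx.socket(zmq.SUB)
sub.connect("tcp://127.0.0.1:5555")
sub.setsockopt_string(zmq.SUBSCRIBE, "")  # subscribe to every topic

time.sleep(0.5)  # allow the subscription to propagate before publishing

# Whatever the publisher broadcasts to its secondary hosts is now also
# delivered to this unauthenticated subscriber.
pub.send(b"internal vLLM state update")
print(sub.recv())  # -> b'internal vLLM state update'

# Note: many such subscribers that never call recv() can exhaust the
# publisher-side queues, which is the denial-of-service vector noted above.
```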
Statement
No Red Hat products are affected by this vulnerability. It is rated Important severity due to its direct impact on security and data confidentiality in multi-node vLLM deployments. The exposure of sensitive request data through an XPUB ZeroMQ socket bound to all network interfaces creates a risk of data leaks. Unauthorized clients with network access can passively listen to internal inference request traffic, revealing private model interactions, queries, and potentially sensitive data. Given the widespread impact on data confidentiality, this issue is classified as Important rather than Moderate.
Mitigation
Mitigation for this issue is either not available, or the currently available options do not meet the Red Hat Product Security criteria of ease of use and deployment, applicability to a widespread installation base, or stability.
Affected Packages
Platform | Package | State | Advisory | Release |
---|---|---|---|---|
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-amd-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-aws-nvidia-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-azure-amd-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-azure-nvidia-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-gcp-nvidia-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-ibm-nvidia-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-intel-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/bootc-nvidia-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/instructlab-amd-rhel9 | Not affected | | |
Red Hat Enterprise Linux AI (RHEL AI) | rhelai1/instructlab-intel-rhel9 | Not affected | | |
Additional Information
CVSS3 base score: 7.5 (High)