Описание
In the Linux kernel, the following vulnerability has been resolved:
drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb
The Mesa issue referenced below pointed out a possible deadlock:
[ 1231.611031] Possible interrupt unsafe locking scenario:
[ 1231.611033] CPU0 CPU1
[ 1231.611034] ---- ----
[ 1231.611035] lock(&xa->xa_lock#17);
[ 1231.611038] local_irq_disable();
[ 1231.611039] lock(&fence->lock);
[ 1231.611041] lock(&xa->xa_lock#17);
[ 1231.611044]
[ 1231.611045] lock(&fence->lock);
[ 1231.611047]
*** DEADLOCK ***
In this example, CPU0 would be any function accessing job->dependencies
through the xa_* functions that don't disable interrupts (eg:
drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()).
CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling
callback so in an interrupt context. It will deadlock when trying to
grab the xa_lock which is already held by CPU0.
Replacing all xa_* usage by their xa_*_irq counterparts would fix
this issue, but Christian pointed out another issue: dma_fence_signal
takes fence.lock and so does dma_fence_add_callback.
dma_fence_signal() // locks f1.lock
-> drm_sched_entity_kill_jobs_cb()
-> foreach dependencies
-> dma_fence_add_callback() // locks f2.lock
This will deadlock if f1 and f2 share the same spinlock.
To fix both issues, the code iterating on dependencies and re-arming them
is moved out to drm_sched_entity_kill_jobs_work().
[phasta: commit message nits]
A deadlock flaw was found in the Linux kernel's DRM GPU scheduler. In drm_sched_entity_kill_jobs_cb(), improper locking order between the xarray lock (xa_lock) and fence lock can cause a deadlock. When CPU0 holds xa_lock and an interrupt triggers on CPU1 that requires both fence->lock and xa_lock, the system deadlocks. Additionally, if two fences share the same spinlock, calling dma_fence_add_callback() from within dma_fence_signal() creates another deadlock scenario. A local user with access to GPU resources could trigger this condition, causing a system hang.
Отчет
This vulnerability affects systems with DRM-based GPU drivers that use the kernel's GPU scheduler. Exploitation requires local access and the ability to submit GPU workloads. The deadlock results in a denial of service requiring a system reboot to recover.
Затронутые пакеты
| Платформа | Пакет | Состояние | Рекомендация | Релиз |
|---|---|---|---|---|
| Red Hat Enterprise Linux 10 | kernel | Fix deferred | ||
| Red Hat Enterprise Linux 6 | kernel | Not affected | ||
| Red Hat Enterprise Linux 7 | kernel | Not affected | ||
| Red Hat Enterprise Linux 7 | kernel-rt | Not affected | ||
| Red Hat Enterprise Linux 8 | kernel | Fix deferred | ||
| Red Hat Enterprise Linux 8 | kernel-rt | Fix deferred | ||
| Red Hat Enterprise Linux 9 | kernel | Fix deferred | ||
| Red Hat Enterprise Linux 9 | kernel-rt | Fix deferred |
Показывать по
Дополнительная информация
Статус:
EPSS
5.8 Medium
CVSS3
Связанные уязвимости
In the Linux kernel, the following vulnerability has been resolved: drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb The Mesa issue referenced below pointed out a possible deadlock: [ 1231.611031] Possible interrupt unsafe locking scenario: [ 1231.611033] CPU0 CPU1 [ 1231.611034] ---- ---- [ 1231.611035] lock(&xa->xa_lock#17); [ 1231.611038] local_irq_disable(); [ 1231.611039] lock(&fence->lock); [ 1231.611041] lock(&xa->xa_lock#17); [ 1231.611044] <Interrupt> [ 1231.611045] lock(&fence->lock); [ 1231.611047] *** DEADLOCK *** In this example, CPU0 would be any function accessing job->dependencies through the xa_* functions that don't disable interrupts (eg: drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()). CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling callback so in an interrupt con...
In the Linux kernel, the following vulnerability has been resolved: drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb The Mesa issue referenced below pointed out a possible deadlock: [ 1231.611031] Possible interrupt unsafe locking scenario: [ 1231.611033] CPU0 CPU1 [ 1231.611034] ---- ---- [ 1231.611035] lock(&xa->xa_lock#17); [ 1231.611038] local_irq_disable(); [ 1231.611039] lock(&fence->lock); [ 1231.611041] lock(&xa->xa_lock#17); [ 1231.611044] <Interrupt> [ 1231.611045] lock(&fence->lock); [ 1231.611047] *** DEADLOCK *** In this example, CPU0 would be any function accessing job->dependencies through the xa_* functions that don't disable interrupts (eg: drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()). CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling callback so
In the Linux kernel, the following vulnerability has been resolved: d ...
In the Linux kernel, the following vulnerability has been resolved: drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb The Mesa issue referenced below pointed out a possible deadlock: [ 1231.611031] Possible interrupt unsafe locking scenario: [ 1231.611033] CPU0 CPU1 [ 1231.611034] ---- ---- [ 1231.611035] lock(&xa->xa_lock#17); [ 1231.611038] local_irq_disable(); [ 1231.611039] lock(&fence->lock); [ 1231.611041] lock(&xa->xa_lock#17); [ 1231.611044] <Interrupt> [ 1231.611045] lock(&fence->lock); [ 1231.611047] *** DEADLOCK *** In this example, CPU0 would be any function accessing job->dependencies through the xa_* functions that don't disable interrupts (eg: drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()). CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling callback ...
EPSS
5.8 Medium
CVSS3