Description
Security update for slurm_24_11
This update for slurm_24_11 fixes the following issues:
Update to version 24.11.5.
Security issues fixed:
- CVE-2025-43904: an issue with permission handling for Coordinators within the accounting system allowed Coordinators to promote a user to Administrator (bsc#1243666).
Other changes and issues fixed:
Changes from version 24.11.5:
- Return error to scontrol reboot on bad nodelists.
- slurmrestd - Report an error when QOS resolution fails for v0.0.40 endpoints.
- slurmrestd - Report an error when QOS resolution fails for v0.0.41 endpoints.
- slurmrestd - Report an error when QOS resolution fails for v0.0.42 endpoints.
- data_parser/v0.0.42 - Added +inline_enums flag which modifies the output when generating OpenAPI specification. It causes enum arrays to not be defined in their own schema with references ($ref) to them. Instead they will be dumped inline.
- Fix binding error with tres-bind map/mask on partial node allocations.
- Fix stepmgr enabled steps being able to request features.
- Reject step creation if requested feature is not available in job.
- slurmd - Restrict listening for new incoming RPC requests further into startup.
- slurmd - Avoid auth/slurm related hangs of CLI commands during startup and shutdown.
- slurmctld - Restrict processing new incoming RPC requests further into startup. Stop processing requests sooner during shutdown.
- slurmctld - Avoid auth/slurm related hangs of CLI commands during startup and shutdown.
- slurmctld - Avoid race condition during shutdown or reconfigure that could result in a crash due to delayed processing of a connection while plugins are unloaded.
- Fix small memleak when getting the job list from the database.
- Fix incorrect printing of % escape characters when printing stdio fields for jobs.
- Fix padding parsing when printing stdio fields for jobs.
- Fix printing %A array job id when expanding patterns.
- Fix reservations causing jobs to be held for Bad Constraints.
- switch/hpe_slingshot - Prevent potential segfault on failed curl request to the fabric manager.
- Fix printing incorrect array job id when expanding stdio file names. The %A will now be substituted by the correct value.
- switch/hpe_slingshot - Fix VNI range not updating on slurmctld restart or reconfigure.
- Fix steps not being created when using certain combinations of -c and -n lower than the job's requested resources, when using stepmgr and nodes are configured with CPUs == Sockets*CoresPerSocket.
- Permit configuring the number of retry attempts to destroy CXI service via the new destroy_retries SwitchParameter.
- Do not reset memory.high and memory.swap.max in slurmd startup or reconfigure as we are never really touching this in slurmd.
- Fix reconfigure failure of slurmd when it has been started manually and the CoreSpecLimits have been removed from slurm.conf.
- Set or reset CoreSpec limits when slurmd is reconfigured and it was started with systemd.
- switch/hpe_slingshot - Make sure the slurmctld can free step VNIs after the controller restarts or reconfigures while the job is running.
- Fix backup slurmctld failure on 2nd takeover.
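Several of the 24.11.5 fixes above concern how stdio filename patterns are expanded (the %A array job id, the %% escape, and zero-padding). As a rough illustration only, the sketch below expands a small subset of such patterns. The function name expand_pattern and its parameters are hypothetical, and this is not Slurm's implementation; it only assumes the documented behaviour that %% yields a literal percent sign and that a number between % and the specifier zero-pads the value.

```python
import re

# Hypothetical helper for illustration; NOT Slurm's actual implementation.
# Matches "%%" or "%<digits><specifier>" for the specifiers A, a and j.
_SPEC = re.compile(r"%(%|(\d*)([Aaj]))")


def expand_pattern(pattern, job_id, array_job_id, array_task_id):
    """Expand %%, %j, %A and %a, honouring an optional zero-padding width."""
    values = {"j": job_id, "A": array_job_id, "a": array_task_id}

    def substitute(match):
        if match.group(1) == "%":
            # "%%" is the escape for a literal percent sign.
            return "%"
        width = int(match.group(2) or 0)
        return str(values[match.group(3)]).zfill(width)

    # Unknown specifiers are left untouched in this sketch.
    return _SPEC.sub(substitute, pattern)


if __name__ == "__main__":
    print(expand_pattern("job_%A_%a.out", job_id=1001,
                         array_job_id=1000, array_task_id=3))
```

Because the regular expression consumes "%%" before looking at the next character, a pattern such as "%%A" keeps a literal "%A" instead of being expanded, which is the kind of escape handling the fixes refer to.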
Changes from version 24.11.4:
- slurmctld,slurmrestd - Avoid possible race condition that could have caused process to crash when listener socket was closed while accepting a new connection.
- slurmrestd - Avoid race condition that could have resulted in address logged for a UNIX socket to be incorrect.
- slurmrestd - Fix parameters in OpenAPI specification for the following endpoints to have job_id field:
  - GET /slurm/v0.0.40/jobs/state/
  - GET /slurm/v0.0.41/jobs/state/
  - GET /slurm/v0.0.42/jobs/state/
  - GET /slurm/v0.0.43/jobs/state/
- slurmd - Fix tracking of thread counts that could cause incoming connections to be ignored after burst of simultaneous incoming connections that trigger delayed response logic.
- Avoid unnecessary SRUN_TIMEOUT forwarding to stepmgr.
- Fix jobs being scheduled on higher weighted powered down nodes.
- Fix how backfill scheduler filters nodes from the available nodes based on exclusive user and mcs_label requirements.
- acct_gather_energy/{gpu,ipmi} - Fix potential energy consumption adjustment calculation underflow.
- acct_gather_energy/ipmi - Fix regression introduced in 24.05.5 (which introduced the new way of preserving energy measurements through slurmd restarts) when EnergyIPMICalcAdjustment=yes.
- Prevent slurmctld deadlock in the assoc mgr.
- Fix memory leak when RestrictedCoresPerGPU is enabled.
- Fix preemptor jobs not entering execution due to wrong calculation of accounting policy limits.
- Fix certain job requests that were incorrectly denied with node configuration unavailable error.
- slurmd - Avoid crash when slurmd has a communications failure with slurmstepd.
- Fix memory leak when parsing yaml input.
- Prevent slurmctld from showing error message about PreemptMode=GANG being a cluster-wide option for scontrol update part calls that don't attempt to modify partition PreemptMode.
- Fix setting GANG preemption on partition when updating PreemptMode with scontrol.
- Fix CoreSpec and MemSpec limits not being removed from previously configured slurmd.
- Avoid race condition that could lead to a deadlock when slurmd, slurmstepd, slurmctld, slurmrestd or sackd have a fatal event.
- Fix jobs using --ntasks-per-node and --mem staying pending forever when the requested memory divided by the number of CPUs exceeds the configured MaxMemPerCPU.
- slurmd - Fix address logged upon new incoming RPC connection from INVALID to IP address.
- Fix memory leak when retrieving reservations. This affects scontrol, sinfo, sview, and the following slurmrestd endpoints:
  - GET /slurm/{any_data_parser}/reservation/{reservation_name}
  - GET /slurm/{any_data_parser}/reservations
- Log warning instead of debugflags=conmgr gated log when deferring new incoming connections when number of active connections exceeds conmgr_max_connections.
- Avoid race condition that could result in worker thread pool not activating all threads at once after a reconfigure, resulting in lower utilization of available CPU threads until enough internal activity wakes up all threads in the worker pool.
- Avoid theoretical race condition that could result in new incoming RPC socket connections being ignored after reconfigure.
- slurmd - Avoid race condition that could result in a state where new incoming RPC connections will always be ignored.
- Add ReconfigFlags=KeepNodeStateFuture to restore saved FUTURE node state on restart and reconfig instead of reverting to FUTURE state. This will be made the default in 25.05.
- Fix case where hetjob submit would cause slurmctld to crash.
- Fix jobs using --cpus-per-gpu and --mem staying pending forever when the requested memory divided by the number of CPUs exceeds the configured MaxMemPerCPU.
- Enforce that jobs using --mem and several --*-per-* options do not violate the MaxMemPerCPU in place.
- slurmctld - Fix use-cases of jobs incorrectly pending held when --prefer features are not initially satisfied.
- slurmctld - Fix jobs incorrectly held when --prefer not satisfied in some use-cases.
- Ensure RestrictedCoresPerGPU and CoreSpecCount don't overlap.
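The MaxMemPerCPU entries above describe a condition where the per-node memory request divided by the number of CPUs exceeds the configured limit. The arithmetic can be sketched as follows. The function names and the MB units are assumptions made for illustration; the sketch does not model how Slurm derives the CPU count from options such as --ntasks-per-node or --cpus-per-gpu, nor what the scheduler does once the limit is exceeded.

```python
def mem_per_cpu_mb(mem_per_node_mb, cpus_per_node):
    """Effective memory per CPU (MB) for a per-node --mem request."""
    if cpus_per_node <= 0:
        raise ValueError("cpus_per_node must be positive")
    return mem_per_node_mb / cpus_per_node


def violates_max_mem_per_cpu(mem_per_node_mb, cpus_per_node, max_mem_per_cpu_mb):
    """True when the request would surpass the configured per-CPU limit."""
    return mem_per_cpu_mb(mem_per_node_mb, cpus_per_node) > max_mem_per_cpu_mb


if __name__ == "__main__":
    # Hypothetical numbers: 16000 MB per node with a 2048 MB per-CPU limit.
    for cpus in (4, 8):
        print(cpus, mem_per_cpu_mb(16000, cpus),
              violates_max_mem_per_cpu(16000, cpus, 2048))
```

With these hypothetical numbers, 4 CPUs give 4000 MB per CPU and exceed the limit, while 8 CPUs give 2000 MB per CPU and stay within it.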
Changes from version 24.11.3:
- Fix database cluster ID generation not being random.
- Fix a regression in which slurmd -G gave no output.
- Fix a long-standing crash in slurmctld after updating a reservation with an empty nodelist. The crash could occur after restarting slurmctld, or if downing/draining a node in the reservation with the REPLACE or REPLACE_DOWN flag.
- Avoid changing process name to 'watch' from original daemon name. This could potentially break some monitoring scripts.
- Avoid slurmctld being killed by SIGALRM due to race condition at startup.
- Fix race condition in slurmrestd that resulted in 'Requested data_parser plugin does not support OpenAPI plugin' error being returned for valid endpoints.
- Fix race between task/cgroup CPUset and jobacctgather/cgroup. The first was removing the pid from task_X cgroup directory, causing memory limits to not be applied.
- If multiple partitions are requested, set the SLURM_JOB_PARTITION output environment variable to the partition in which the job is running for salloc and srun in order to match the documentation and the behavior of sbatch.
- srun - Fixed wrongly constructed SLURM_CPU_BIND env variable that could get propagated to downward srun calls in certain mpi environments, causing launch failures.
- Don't print misleading errors for stepmgr enabled steps.
- slurmrestd - Avoid connection to slurmdbd for the following endpoints:
  - GET /slurm/v0.0.41/jobs
  - GET /slurm/v0.0.41/job/{job_id}
- slurmrestd - Avoid connection to slurmdbd for the following endpoints:
  - GET /slurm/v0.0.40/jobs
  - GET /slurm/v0.0.40/job/{job_id}
- slurmrestd - Fix possible memory leak when parsing arrays with data_parser/v0.0.40.
- slurmrestd - Fix possible memory leak when parsing arrays with data_parser/v0.0.41.
- slurmrestd - Fix possible memory leak when parsing arrays with data_parser/v0.0.42.
Changes from version 24.11.2:
- Fix segfault when submitting --test-only jobs that can preempt.
- Fix regression introduced in 23.11 that prevented the following flags from being added to a reservation on an update: DAILY, HOURLY, WEEKLY, WEEKDAY, and WEEKEND.
- Fix crash and issues evaluating job's suitability for running in nodes with already suspended job(s) there.
- slurmctld will ensure that healthy nodes are not reported as UnavailableNodes in job reason codes.
- Fix handling of jobs submitted to a current reservation with flags OVERLAP,FLEX or OVERLAP,ANY_NODES when it overlaps nodes with a future maintenance reservation. When a job submission had a time limit that overlapped with the future maintenance reservation, it was rejected. Now the job is accepted but stays pending with the reason 'ReqNodeNotAvail, Reserved for maintenance'.
- pam_slurm_adopt - Avoid errors when explicitly setting some arguments to the default value.
- Fix QOS preemption with PreemptMode=SUSPEND.
- slurmdbd - When changing a user's name, update lineage at the same time.
- Fix regression in 24.11 in which burst_buffer.lua does not inherit the SLURM_CONF environment variable from slurmctld and fails to run if slurm.conf is in a non-standard location.
- Fix memory leak in slurmctld if select/linear and the PreemptParameters=reclaim_licenses options are both set in slurm.conf. Regression in 24.11.1.
- Fix running jobs that requested multiple partitions from potentially being set to the wrong partition on restart.
- switch/hpe_slingshot - Fix compatibility with newer cxi drivers, specifically when specifying disable_rdzv_get.
- Add ABORT_ON_FATAL environment variable to capture a backtrace from any fatal() message.
- Fix printing invalid address in rate limiting log statement.
- sched/backfill - Fix node state PLANNED not being cleared from fully allocated nodes during a backfill cycle.
- select/cons_tres - Fix future planning of jobs with bf_licenses.
- Prevent redundant 'on_data returned rc: Rate limit exceeded, please retry momentarily' error message from being printed in slurmctld logs.
- Fix loading non-default QOS on pending jobs from pre-24.11 state.
- Fix pending jobs displaying QOS=(null) when not explicitly requesting a QOS.
- Fix segfault issue from job record with no job_resrcs.
- Fix failing sacctmgr delete/modify/show account operations with where clauses.
- Fix regression in 24.11 in which Slurm daemons started catching the SIGTSTP, SIGTTIN and SIGUSR1 signals and ignoring them, whereas previously they did not ignore them. This also left slurmctld unable to shut down after a SIGTSTP because slurmscriptd caught the signal and stopped while slurmctld ignored it. Unify and fix these situations and get back to the previous behavior for these signals.
- Document that SIGQUIT is no longer ignored by slurmctld, slurmdbd, and slurmd in 24.11. As of 24.11.0rc1, SIGQUIT is identical to SIGINT and SIGTERM for these daemons, but this change was not documented.
- Fix not considering nodes marked for reboot without ASAP in the scheduler.
- Remove the boot^ state on unexpected node reboot after return to service.
- Do not allow new jobs to start on a node which is being rebooted with the flag nextstate=resume.
- Prevent lower priority job running after cancelling an ASAP reboot.
- Fix srun jobs starting on nextstate=resume rebooting nodes.
Package list
SUSE Linux Enterprise High Performance Computing 15 SP3-LTSS
libnss_slurm2_24_11-24.11.5-150300.7.8.1
libpmi0_24_11-24.11.5-150300.7.8.1
libslurm42-24.11.5-150300.7.8.1
perl-slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-auth-none-24.11.5-150300.7.8.1
slurm_24_11-config-24.11.5-150300.7.8.1
slurm_24_11-config-man-24.11.5-150300.7.8.1
slurm_24_11-cray-24.11.5-150300.7.8.1
slurm_24_11-devel-24.11.5-150300.7.8.1
slurm_24_11-doc-24.11.5-150300.7.8.1
slurm_24_11-lua-24.11.5-150300.7.8.1
slurm_24_11-munge-24.11.5-150300.7.8.1
slurm_24_11-node-24.11.5-150300.7.8.1
slurm_24_11-pam_slurm-24.11.5-150300.7.8.1
slurm_24_11-plugins-24.11.5-150300.7.8.1
slurm_24_11-rest-24.11.5-150300.7.8.1
slurm_24_11-slurmdbd-24.11.5-150300.7.8.1
slurm_24_11-sql-24.11.5-150300.7.8.1
slurm_24_11-sview-24.11.5-150300.7.8.1
slurm_24_11-torque-24.11.5-150300.7.8.1
slurm_24_11-webdoc-24.11.5-150300.7.8.1
SUSE Linux Enterprise High Performance Computing 15 SP4-ESPOS
libnss_slurm2_24_11-24.11.5-150300.7.8.1
libpmi0_24_11-24.11.5-150300.7.8.1
libslurm42-24.11.5-150300.7.8.1
perl-slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-auth-none-24.11.5-150300.7.8.1
slurm_24_11-config-24.11.5-150300.7.8.1
slurm_24_11-config-man-24.11.5-150300.7.8.1
slurm_24_11-cray-24.11.5-150300.7.8.1
slurm_24_11-devel-24.11.5-150300.7.8.1
slurm_24_11-doc-24.11.5-150300.7.8.1
slurm_24_11-lua-24.11.5-150300.7.8.1
slurm_24_11-munge-24.11.5-150300.7.8.1
slurm_24_11-node-24.11.5-150300.7.8.1
slurm_24_11-pam_slurm-24.11.5-150300.7.8.1
slurm_24_11-plugins-24.11.5-150300.7.8.1
slurm_24_11-rest-24.11.5-150300.7.8.1
slurm_24_11-slurmdbd-24.11.5-150300.7.8.1
slurm_24_11-sql-24.11.5-150300.7.8.1
slurm_24_11-sview-24.11.5-150300.7.8.1
slurm_24_11-torque-24.11.5-150300.7.8.1
slurm_24_11-webdoc-24.11.5-150300.7.8.1
SUSE Linux Enterprise High Performance Computing 15 SP4-LTSS
libnss_slurm2_24_11-24.11.5-150300.7.8.1
libpmi0_24_11-24.11.5-150300.7.8.1
libslurm42-24.11.5-150300.7.8.1
perl-slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-auth-none-24.11.5-150300.7.8.1
slurm_24_11-config-24.11.5-150300.7.8.1
slurm_24_11-config-man-24.11.5-150300.7.8.1
slurm_24_11-cray-24.11.5-150300.7.8.1
slurm_24_11-devel-24.11.5-150300.7.8.1
slurm_24_11-doc-24.11.5-150300.7.8.1
slurm_24_11-lua-24.11.5-150300.7.8.1
slurm_24_11-munge-24.11.5-150300.7.8.1
slurm_24_11-node-24.11.5-150300.7.8.1
slurm_24_11-pam_slurm-24.11.5-150300.7.8.1
slurm_24_11-plugins-24.11.5-150300.7.8.1
slurm_24_11-rest-24.11.5-150300.7.8.1
slurm_24_11-slurmdbd-24.11.5-150300.7.8.1
slurm_24_11-sql-24.11.5-150300.7.8.1
slurm_24_11-sview-24.11.5-150300.7.8.1
slurm_24_11-torque-24.11.5-150300.7.8.1
slurm_24_11-webdoc-24.11.5-150300.7.8.1
SUSE Linux Enterprise High Performance Computing 15 SP5-ESPOS
libnss_slurm2_24_11-24.11.5-150300.7.8.1
libpmi0_24_11-24.11.5-150300.7.8.1
libslurm42-24.11.5-150300.7.8.1
perl-slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-auth-none-24.11.5-150300.7.8.1
slurm_24_11-config-24.11.5-150300.7.8.1
slurm_24_11-config-man-24.11.5-150300.7.8.1
slurm_24_11-cray-24.11.5-150300.7.8.1
slurm_24_11-devel-24.11.5-150300.7.8.1
slurm_24_11-doc-24.11.5-150300.7.8.1
slurm_24_11-lua-24.11.5-150300.7.8.1
slurm_24_11-munge-24.11.5-150300.7.8.1
slurm_24_11-node-24.11.5-150300.7.8.1
slurm_24_11-pam_slurm-24.11.5-150300.7.8.1
slurm_24_11-plugins-24.11.5-150300.7.8.1
slurm_24_11-rest-24.11.5-150300.7.8.1
slurm_24_11-slurmdbd-24.11.5-150300.7.8.1
slurm_24_11-sql-24.11.5-150300.7.8.1
slurm_24_11-sview-24.11.5-150300.7.8.1
slurm_24_11-torque-24.11.5-150300.7.8.1
slurm_24_11-webdoc-24.11.5-150300.7.8.1
SUSE Linux Enterprise High Performance Computing 15 SP5-LTSS
libnss_slurm2_24_11-24.11.5-150300.7.8.1
libpmi0_24_11-24.11.5-150300.7.8.1
libslurm42-24.11.5-150300.7.8.1
perl-slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-auth-none-24.11.5-150300.7.8.1
slurm_24_11-config-24.11.5-150300.7.8.1
slurm_24_11-config-man-24.11.5-150300.7.8.1
slurm_24_11-cray-24.11.5-150300.7.8.1
slurm_24_11-devel-24.11.5-150300.7.8.1
slurm_24_11-doc-24.11.5-150300.7.8.1
slurm_24_11-lua-24.11.5-150300.7.8.1
slurm_24_11-munge-24.11.5-150300.7.8.1
slurm_24_11-node-24.11.5-150300.7.8.1
slurm_24_11-pam_slurm-24.11.5-150300.7.8.1
slurm_24_11-plugins-24.11.5-150300.7.8.1
slurm_24_11-rest-24.11.5-150300.7.8.1
slurm_24_11-slurmdbd-24.11.5-150300.7.8.1
slurm_24_11-sql-24.11.5-150300.7.8.1
slurm_24_11-sview-24.11.5-150300.7.8.1
slurm_24_11-torque-24.11.5-150300.7.8.1
slurm_24_11-webdoc-24.11.5-150300.7.8.1
SUSE Linux Enterprise Module for HPC 15 SP6
libnss_slurm2_24_11-24.11.5-150300.7.8.1
libpmi0_24_11-24.11.5-150300.7.8.1
libslurm42-24.11.5-150300.7.8.1
perl-slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-auth-none-24.11.5-150300.7.8.1
slurm_24_11-config-24.11.5-150300.7.8.1
slurm_24_11-config-man-24.11.5-150300.7.8.1
slurm_24_11-cray-24.11.5-150300.7.8.1
slurm_24_11-devel-24.11.5-150300.7.8.1
slurm_24_11-doc-24.11.5-150300.7.8.1
slurm_24_11-lua-24.11.5-150300.7.8.1
slurm_24_11-munge-24.11.5-150300.7.8.1
slurm_24_11-node-24.11.5-150300.7.8.1
slurm_24_11-pam_slurm-24.11.5-150300.7.8.1
slurm_24_11-plugins-24.11.5-150300.7.8.1
slurm_24_11-rest-24.11.5-150300.7.8.1
slurm_24_11-slurmdbd-24.11.5-150300.7.8.1
slurm_24_11-sql-24.11.5-150300.7.8.1
slurm_24_11-sview-24.11.5-150300.7.8.1
slurm_24_11-torque-24.11.5-150300.7.8.1
slurm_24_11-webdoc-24.11.5-150300.7.8.1
openSUSE Leap 15.6
libnss_slurm2_24_11-24.11.5-150300.7.8.1
libpmi0_24_11-24.11.5-150300.7.8.1
libslurm42-24.11.5-150300.7.8.1
perl-slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-24.11.5-150300.7.8.1
slurm_24_11-auth-none-24.11.5-150300.7.8.1
slurm_24_11-config-24.11.5-150300.7.8.1
slurm_24_11-config-man-24.11.5-150300.7.8.1
slurm_24_11-cray-24.11.5-150300.7.8.1
slurm_24_11-devel-24.11.5-150300.7.8.1
slurm_24_11-doc-24.11.5-150300.7.8.1
slurm_24_11-hdf5-24.11.5-150300.7.8.1
slurm_24_11-lua-24.11.5-150300.7.8.1
slurm_24_11-munge-24.11.5-150300.7.8.1
slurm_24_11-node-24.11.5-150300.7.8.1
slurm_24_11-openlava-24.11.5-150300.7.8.1
slurm_24_11-pam_slurm-24.11.5-150300.7.8.1
slurm_24_11-plugins-24.11.5-150300.7.8.1
slurm_24_11-rest-24.11.5-150300.7.8.1
slurm_24_11-seff-24.11.5-150300.7.8.1
slurm_24_11-sjstat-24.11.5-150300.7.8.1
slurm_24_11-slurmdbd-24.11.5-150300.7.8.1
slurm_24_11-sql-24.11.5-150300.7.8.1
slurm_24_11-sview-24.11.5-150300.7.8.1
slurm_24_11-testsuite-24.11.5-150300.7.8.1
slurm_24_11-torque-24.11.5-150300.7.8.1
slurm_24_11-webdoc-24.11.5-150300.7.8.1
References
- Link for SUSE-SU-2025:01761-1
- E-Mail link for SUSE-SU-2025:01761-1
- SUSE Security Ratings
- SUSE Bug 1243666
- SUSE CVE CVE-2025-43904 page
Description
** RESERVED ** This candidate has been reserved by an organization or individual that will use it when announcing a new security problem. When the candidate has been publicized, the details for this candidate will be provided.
Affected products
SUSE Linux Enterprise High Performance Computing 15 SP3-LTSS:libnss_slurm2_24_11-24.11.5-150300.7.8.1
SUSE Linux Enterprise High Performance Computing 15 SP3-LTSS:libpmi0_24_11-24.11.5-150300.7.8.1
SUSE Linux Enterprise High Performance Computing 15 SP3-LTSS:libslurm42-24.11.5-150300.7.8.1
SUSE Linux Enterprise High Performance Computing 15 SP3-LTSS:perl-slurm_24_11-24.11.5-150300.7.8.1
References
- CVE-2025-43904
- SUSE Bug 1243666