Description
Unstructured has Path Traversal via Malicious MSG Attachment that Allows Arbitrary File Write
A Path Traversal vulnerability in the partition_msg function allows an attacker to write or overwrite arbitrary files on the filesystem when processing malicious MSG files with attachments.
Impact
An attacker can craft a malicious .msg file with attachment filenames containing path traversal sequences (e.g.,
../../../etc/cron.d/malicious). When processed with process_attachments=True, the library writes the attachment to an
attacker-controlled path, potentially leading to:
- Arbitrary file overwrite
- Remote code execution (via overwriting configuration files, cron jobs, or Python packages)
- Data corruption
- Denial of service
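To illustrate the mechanism, the following minimal sketch shows how joining an output directory with an unvalidated attachment name resolves to a path outside that directory. The function name `resolve_naive` and the directory `/tmp/attachments` are hypothetical and are not taken from the library's code. `posixpath` is used so the result is the same on any platform.

```python
import posixpath


def resolve_naive(output_dir: str, attachment_name: str) -> str:
    """Join a directory and a name with no validation (illustrative only)."""
    # Each "../" component climbs one directory level, so enough of them
    # lets the attacker-chosen name escape output_dir entirely.
    return posixpath.normpath(posixpath.join(output_dir, attachment_name))


target = resolve_naive("/tmp/attachments", "../../../etc/cron.d/malicious")
print(target)  # /etc/cron.d/malicious
```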
Affected Functionality
The vulnerability affects the MSG file partitioning functionality when process_attachments=True is enabled.
Vulnerability Details
The library does not sanitize attachment filenames in MSG files before using them in file write operations, allowing directory traversal sequences to escape the intended output directory.
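One common way to close this class of bug is to discard any directory components from the filename and then confirm that the resolved path still lies inside the output directory. The sketch below is an assumption about what such a check could look like, not the fix shipped in the library; `safe_attachment_path` is a hypothetical helper name.

```python
import os
import tempfile


def safe_attachment_path(output_dir: str, filename: str) -> str:
    """Return a write path for filename that is guaranteed to sit in output_dir."""
    # Treat Windows separators like POSIX ones, then keep only the last component.
    name = os.path.basename(filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError(f"unusable attachment filename: {filename!r}")

    base = os.path.realpath(output_dir)
    candidate = os.path.realpath(os.path.join(base, name))

    # Defence in depth: reject anything that still resolves outside base,
    # for example through a symlink that already exists in output_dir.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"attachment path escapes output directory: {filename!r}")
    return candidate


out_dir = tempfile.mkdtemp()
resolved = safe_attachment_path(out_dir, "../../../etc/cron.d/malicious")
print(resolved)  # ends with "/malicious" inside out_dir
```

With this approach the traversal sequences are dropped rather than honored, so the attachment is written as `malicious` inside the output directory. Rejecting such filenames outright is an equally reasonable policy.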
Workarounds
Until patched, users can:
- Set process_attachments=False when processing untrusted MSG files
- Avoid processing MSG files from untrusted sources
- Implement additional filename validation before processing
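The first workaround can be sketched as a small wrapper that always disables attachment processing. This assumes `partition_msg` is importable from `unstructured.partition.msg` and accepts the `filename` and `process_attachments` keyword arguments. The wrapper name `partition_untrusted_msg` is hypothetical.

```python
def partition_untrusted_msg(path: str):
    """Partition an MSG file without extracting its attachments."""
    # Imported lazily so this module loads even where unstructured is absent.
    from unstructured.partition.msg import partition_msg

    # With attachment processing disabled, attachment filenames are never
    # used to build write paths, which avoids the vulnerable code path.
    return partition_msg(filename=path, process_attachments=False)
```

Routing all untrusted input through one such wrapper makes it harder for a caller to re-enable attachment processing by accident.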
Packages
unstructured
Affected versions: <= 0.18.17
Patched version: 0.18.18
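A quick way to check whether an environment falls in the affected range (0.18.17 and earlier, fixed in 0.18.18) is to compare the installed version against the patched release. The sketch below uses only the standard library. The helper names are hypothetical, and the version parsing is a simplification that looks only at the first three numeric components, so pre-release or local version suffixes are ignored.

```python
import re
from importlib import metadata

PATCHED = (0, 18, 18)  # first release containing the fix, per the advisory


def parse_version(version: str) -> tuple:
    """Extract up to three leading numeric components, e.g. '0.18.17' -> (0, 18, 17)."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def is_affected(version: str) -> bool:
    """True when the version sorts before the patched release."""
    return parse_version(version) < PATCHED


def installed_unstructured_version():
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version("unstructured")
    except metadata.PackageNotFoundError:
        return None


installed = installed_unstructured_version()
if installed is None:
    print("unstructured is not installed")
elif is_affected(installed):
    print(f"unstructured {installed} is affected; upgrade to 0.18.18 or later")
else:
    print(f"unstructured {installed} includes the fix")
```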
Related vulnerabilities
The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. Prior to version 0.18.18, a path traversal vulnerability in the partition_msg function allows an attacker to write or overwrite arbitrary files on the filesystem when processing malicious MSG files with attachments. This issue has been patched in version 0.18.18.