Flash-attention's checkpoint loading functions call torch.load() without the weights_only=True safeguard, so pickle can deserialize arbitrary Python objects, a well-documented attack primitive that leads directly to code execution on the victim's machine. Any team that loads external or shared checkpoints during model warmstarting or evaluation is exposed: an attacker who can supply a malicious checkpoint file through a compromised model registry, poisoned shared storage, or targeted file delivery gains arbitrary code execution in the context of the training job. No public exploit or CISA KEV entry exists yet, but pickle deserialization payloads take minimal skill to craft and public tooling automates the process. Update to a commit past e724e2588c, enforce weights_only=True on all torch.load() calls, migrate to the safetensors format where feasible, and scan ingested checkpoints with fickling or modelscan in CI before loading.
What is the risk?
Risk is HIGH. Pickle deserialization exploits are trivial to construct, and public tooling built around this class, such as fickling and modelscan, is freely available, so the attacker skill bar is low. The vulnerability sits in the training and evaluation path, a context that is typically highly privileged: GPU workers frequently carry attached cloud IAM roles, access to internal APIs, and raw training data. Collaborative ML environments with shared NFS mounts or model registries widen the blast radius considerably. The primary mitigating factor is that exploitation requires control of a checkpoint file the victim loads, which limits opportunistic mass exploitation but leaves targeted attacks highly practical.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| PyTorch | pip | <= 2.8.3 | No patch |
If you load checkpoints through flash-attention's training or evaluation code on PyTorch, you are affected.
What should I do?
7 steps
1. Update flash-attention to a commit past e724e2588cbe754beb97cf7c011b5e7e34119e62.
2. Audit all torch.load() calls across training and evaluation scripts and enforce weights_only=True on every call.
3. Migrate checkpoint serialization to safetensors format where possible; safetensors is structurally immune to pickle-based deserialization attacks.
4. Implement integrity verification for all externally sourced checkpoints using SHA-256 checksums with out-of-band signature distribution.
5. Integrate fickling or modelscan checkpoint scanning into CI/CD pipelines before any checkpoint is loaded in training or evaluation.
6. Restrict training job network access and file system permissions to limit post-exploitation lateral movement.
7. Treat any checkpoint from an unverified source as untrusted until cryptographically validated.
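The audit of torch.load() calls can be partly automated. The following is a minimal sketch that uses Python's standard ast module to list calls spelled literally as torch.load(...) that do not pass weights_only=True. The function name find_unsafe_torch_load and the SAMPLE snippet are hypothetical, and the check will miss aliased imports such as `from torch import load`, so treat it as a first pass rather than a complete audit.

```python
import ast


def find_unsafe_torch_load(source: str, filename: str = "<string>") -> list[int]:
    """Return line numbers of torch.load(...) calls lacking weights_only=True.

    Hypothetical helper: it only matches the literal spelling `torch.load`.
    """
    unsafe = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # Only an explicit literal True counts as safe in this sketch.
        has_safe_flag = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not has_safe_flag:
            unsafe.append(node.lineno)
    return sorted(unsafe)


# Illustrative input only; the file names are made up.
SAMPLE = '''
import torch
a = torch.load("a.pt")
b = torch.load("b.pt", map_location="cpu", weights_only=True)
c = torch.load("c.pt", weights_only=False)
'''

print(find_unsafe_torch_load(SAMPLE))
```

Run over each training and evaluation script, the reported line numbers give a list of call sites to review by hand.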
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2026-31253?
CVE-2026-31253 is an insecure deserialization vulnerability (CWE-502) in flash-attention's checkpoint loading code. The loaders call torch.load() without weights_only=True, so a maliciously crafted checkpoint file can execute arbitrary code when a victim loads it during model warmstarting or evaluation.
Is CVE-2026-31253 actively exploited?
No confirmed active exploitation of CVE-2026-31253 has been reported, but organizations should still patch proactively.
How to fix CVE-2026-31253?
1. Update flash-attention to a commit past e724e2588cbe754beb97cf7c011b5e7e34119e62.
2. Audit all torch.load() calls across training and evaluation scripts and enforce weights_only=True on every call.
3. Migrate checkpoint serialization to safetensors format where possible; safetensors is structurally immune to pickle-based deserialization attacks.
4. Implement integrity verification for all externally sourced checkpoints using SHA-256 checksums with out-of-band signature distribution.
5. Integrate fickling or modelscan checkpoint scanning into CI/CD pipelines before any checkpoint is loaded in training or evaluation.
6. Restrict training job network access and file system permissions to limit post-exploitation lateral movement.
7. Treat any checkpoint from an unverified source as untrusted until cryptographically validated.
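The SHA-256 integrity check can be sketched with the standard library alone. The helper names sha256_of_file and verify_checkpoint are hypothetical, the temporary file stands in for a real checkpoint, and the pinned digest is assumed to arrive through a separate trusted channel. A matching digest only shows the file is the one the publisher hashed; it does not show the contents are safe.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected_hex):
    """Return True only if the file's SHA-256 matches the pinned digest."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Demonstration with a stand-in file; real digests would be published out of band.
with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "model.pt"
    ckpt.write_bytes(b"stand-in checkpoint bytes")
    pinned = hashlib.sha256(b"stand-in checkpoint bytes").hexdigest()
    print(verify_checkpoint(ckpt, pinned))   # file unchanged since hashing

    ckpt.write_bytes(b"tampered bytes")
    print(verify_checkpoint(ckpt, pinned))   # file modified after hashing
```

A loader would call verify_checkpoint before handing the path to any deserializer and refuse to proceed on a mismatch.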
What systems are affected by CVE-2026-31253?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model evaluation, distributed training, fine-tuning workflows, model serving.
What is the CVSS score for CVE-2026-31253?
CVE-2026-31253 has a CVSS v3.1 base score of 7.3 (HIGH). The EPSS exploitation probability is 0.22%.
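The 7.3 figure can be reproduced from the vector string listed under CVSS Vector. The sketch below applies the CVSS v3.1 base-score equations for the Scope: Unchanged case only; the function names are hypothetical and the metric weights are taken from the published v3.1 specification.

```python
import math

# CVSS v3.1 metric weights (Privileges Required values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L"))  # 7.3
```

For this vector the impact sub-score is about 3.37 and the exploitability sub-score about 3.89, which sum to roughly 7.26 and round up to 7.3.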
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010.003 Model
- AML.T0011.000 Unsafe AI Artifacts
- AML.T0018.002 Embed Malware
- AML.T0025 Exfiltration via Cyber Means
Compliance Controls Affected
What are the technical details?
Original Advisory
The flash-attention training framework thru commit e724e2588cbe754beb97cf7c011b5e7e34119e62 (2025-13-04) contains an insecure deserialization vulnerability (CWE-502) in its checkpoint loading mechanism. The load_checkpoint() function in checkpoint.py and the checkpoint loading code in eval.py use torch.load() without enabling the security-restrictive weights_only=True parameter. This allows the deserialization of arbitrary Python objects via the pickle module. An attacker can exploit this by providing a maliciously crafted checkpoint file. When a victim loads this checkpoint during model warmstarting or evaluation, arbitrary code is executed on the victim's system.
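Why unrestricted pickle deserialization amounts to code execution can be shown with the standard library alone. In this deliberately harmless sketch, a hypothetical Payload class tells pickle, through `__reduce__`, to call eval on a fixed arithmetic expression when the bytes are loaded. It does not involve torch or flash-attention code; it only illustrates the underlying pickle behavior the advisory refers to.

```python
import pickle


class Payload:
    """Hypothetical class whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" instead of the object's state.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading the bytes invokes eval with the stored argument; no Payload
# instance is ever reconstructed.
result = pickle.loads(blob)

print(result)                  # 42
print(type(result).__name__)   # int
```

The expression here is inert, but the callable and its arguments are chosen entirely by whoever produced the bytes, which is what makes loading an untrusted checkpoint dangerous.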
Exploitation Scenario
An adversary targets an ML engineering team by contributing a 'pre-trained checkpoint' to a shared internal model registry or delivering it via a spearphish posing as a collaborative researcher. The checkpoint is a structurally valid PyTorch file containing an embedded pickle payload that establishes a reverse shell or exfiltrates cloud IAM credentials to an attacker-controlled endpoint. A team member runs evaluation using flash-attention's eval.py — torch.load() deserializes the pickle payload without restriction and executes the adversary's code in the context of the GPU training job. On a cloud instance with an attached IAM role, this immediately yields cloud provider credentials. On a shared compute cluster, the adversary pivots to other active jobs via shared NFS storage, broadening the compromise to the entire team's infrastructure.
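A payload of the kind described in this scenario has to reference a callable, and such references are visible in the pickle opcode stream without executing anything. The sketch below uses the standard pickletools module to list the globals a pickle would import and compares them with an allowlist. It is a rough illustration of the idea behind checkpoint scanners, not a reimplementation of fickling or modelscan; the names imported_globals, disallowed_globals, and ALLOWED are hypothetical, and the STACK_GLOBAL handling is a heuristic that assumes the module and attribute names are the two most recent string arguments.

```python
import pickle
import pickletools
from collections import OrderedDict

# Hypothetical allowlist; a real one would be tailored to the checkpoint format.
ALLOWED = {"collections.OrderedDict"}


def imported_globals(data: bytes) -> set[str]:
    """List module.attribute references in a pickle without loading it."""
    found = set()
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: names are normally the last two strings pushed.
            if len(recent_strings) >= 2:
                found.add(".".join(recent_strings[-2:]))
            else:
                found.add("<unresolved>")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


def disallowed_globals(data: bytes) -> set[str]:
    """Return references that fall outside the allowlist."""
    return imported_globals(data) - ALLOWED


class Payload:
    """Harmless stand-in for a malicious object; it is never unpickled here."""

    def __reduce__(self):
        return (print, ("payload executed",))


benign = pickle.dumps(OrderedDict(step=10, lr=0.1))
malicious = pickle.dumps(Payload())

print(disallowed_globals(benign))
print(disallowed_globals(malicious))
```

Because the bytes are only parsed, never loaded, a check like this can run in CI on files that are not yet trusted. A purpose-built scanner handles many cases this sketch ignores, such as archive-wrapped checkpoints and obfuscated opcode sequences.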
Weaknesses (CWE)
CWE-94 — Improper Control of Generation of Code ('Code Injection'): The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.
- [Architecture and Design] Refactor your program so that you do not have to dynamically generate code.
- [Architecture and Design] Run your code in a "jail" or similar sandbox environment that enforces strict boundaries between the process and the operating system. This may effectively restrict which code can be executed by your product. Examples include the Unix chroot jail and AppArmor. In general, managed code may provide some protection. This may not be a feasible solution, and it only limits the impact to the operating system; the rest of your application may still be subject to compromise. Be careful to avoid CWE-243 and other weaknesses related to jails.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L
References
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2024-5452 | 9.8 | pytorch-lightning: RCE via deepdiff Delta deserialization | Same package: torch |
| CVE-2023-43654 | 9.8 | TorchServe: SSRF + RCE via unrestricted model URL loading | Same package: torch |
| CVE-2022-45907 | 9.8 | PyTorch: RCE via unsafe eval in JIT annotations | Same package: torch |
| CVE-2022-0845 | 9.8 | pytorch-lightning: code injection enables full RCE | Same package: torch |
| CVE-2024-35198 | 9.8 | TorchServe: URL bypass enables arbitrary model loading | Same package: torch |