In-Context Representation Hijacking
Itay Yona, Amir Sarid, Michael Karasik +1 more
We introduce $\textbf{Doublespeak}$, a simple in-context representation hijacking attack against large language models (LLMs). The attack works by...
AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.
Showing 281–300 of 388 papers
Clear filtersItay Yona, Amir Sarid, Michael Karasik +1 more
We introduce $\textbf{Doublespeak}$, a simple in-context representation hijacking attack against large language models (LLMs). The attack works by...
Hanxiu Zhang, Yue Zheng
The protection of Intellectual Property (IP) in Large Language Models (LLMs) represents a critical challenge in contemporary AI research. While...
Thomas Rivasseau
Current research on operator control of Large Language Models improves model robustness against adversarial attacks and misbehavior by training on...
Adel Chehade, Edoardo Ragusa, Paolo Gastaldo +1 more
Traffic classification (TC) plays a critical role in cybersecurity, particularly in IoT and embedded contexts, where inspection must often occur...
Zixia Wang, Gaojie Jin, Jia Hu +1 more
Recent advancements in Large Language Models (LLMs) have led to their widespread adoption in daily applications. Despite their impressive...
Alexander Boyd, Franz Nowak, David Hyland +2 more
World models have been recently proposed as sandbox environments in which AI agents can be trained and evaluated before deployment. Although...
Aaron Sandoval, Cody Rushing
The field of AI Control seeks to develop robust control protocols, deployment safeguards for untrusted AI which may be intentionally subversive....
Adeela Bashir, The Anh han, Zia Ush Shamszaman
The integration of large language models (LLMs) into healthcare IoT systems promises faster decisions and improved medical support. LLMs are also...
K. J. Kevin Feng, Tae Soo Kim, Rock Yuren Pang +3 more
AI agents that take actions in their environment autonomously over extended time horizons require robust governance interventions to curb their...
Tong Wu, Weibin Wu, Zibin Zheng
Equipped with various tools and knowledge, GPTs, one kind of customized AI agents based on OpenAI's large language models, have illustrated great...
Zeng Wang, Minghao Shao, Akashdeep Saha +4 more
Graph neural networks (GNNs) have shown promise in hardware security by learning structural motifs from netlist graphs. However, this reliance on...
Herman Errico, Jiquan Ngiam, Shanita Sojan
The Model Context Protocol (MCP) replaces static, developer-controlled API integrations with more dynamic, user-driven agent systems, which also...
Sidahmed Benabderrahmane, James Cheney, Talal Rahwan
Advanced Persistent Threats (APTs) pose a significant challenge in cybersecurity due to their stealthy and long-term nature. Modern supervised...
Steven Peh
Large Language Models (LLMs) remain vulnerable to prompt injection attacks, representing the most significant security threat in production...
Adarsh Kumarappan, Ayushi Mehrotra
The SmoothLLM defense provides a certification guarantee against jailbreaking attacks, but it relies on a strict "k-unstable" assumption that rarely...
Itay Hazan, Yael Mathov, Guy Shtar +2 more
Securing AI agents powered by Large Language Models (LLMs) represents one of the most critical challenges in AI security today. Unlike traditional...
Atharv Singh Patlan, Peiyao Sheng, S. Ashwin Hebbar +2 more
Language agents are rapidly expanding from single-user assistants to multi-user collaborators in shared workspaces and groups. However, today's...
Tom Perel
The recent boom and rapid integration of Large Language Models (LLMs) into a wide range of applications warrants a deeper understanding of their...
Huseein Jawad, Nicolas Brunel
System prompts are critical for guiding the behavior of Large Language Models (LLMs), yet they often contain proprietary logic or sensitive...
Fuyao Zhang, Jiaming Zhang, Che Wang +6 more
The reliance of mobile GUI agents on Multimodal Large Language Models (MLLMs) introduces a severe privacy vulnerability: screenshots containing...
AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.
AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.
Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.
Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.
Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.
Start 14-Day Free Trial