ShadowLogic: Backdoors in Any Whitebox LLM
Kasimir Schulz, Amelia Kawasaki, Leo Ring
Large language models (LLMs) are widely deployed across various applications, often with safeguards to prevent the generation of harmful or...
AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.
Showing 321–340 of 388 papers
Clear filtersKasimir Schulz, Amelia Kawasaki, Leo Ring
Large language models (LLMs) are widely deployed across various applications, often with safeguards to prevent the generation of harmful or...
David Lüdke, Tom Wollschläger, Paul Ungermann +2 more
We introduce a novel framework that transforms the resource-intensive (adversarial) prompt optimization problem into an \emph{efficient, amortized...
Chenghao Du, Quanfeng Huang, Tingxuan Tang +3 more
Large Language Models (LLMs) have transformed software development, enabling AI-powered applications known as LLM-based agents that promise to...
Haohua Duan, Liyao Xiang, Xin Zhang
Watermarking schemes for large language models (LLMs) have been proposed to identify the source of the generated text, mitigating the potential...
Lisha Shuai, Jiuling Dong, Nan Zhang +5 more
Local Differential Privacy (LDP) is a widely adopted privacy-protection model in the Internet of Things (IoT) due to its lightweight, decentralized,...
Guangzhi Su, Shuchang Huang, Yutong Ke +3 more
Multimodal large language models (MLLMs) have achieved impressive performance across diverse tasks by jointly reasoning over textual and visual...
Elizabeth Lin, Jonah Ghebremichael, William Enck +5 more
Software supply chains, while providing immense economic and software development value, are only as strong as their weakest link. Over the past...
Myeongseob Ko, Nikhil Reddy Billa, Adam Nguyen +3 more
The memorization of training data in large language models (LLMs) poses significant privacy and copyright concerns. Existing data extraction methods,...
Bin Wang, YiLu Zhong, MiDi Wan +4 more
Large language models (LLMs) have become indispensable for automated code generation, yet the quality and security of their outputs remain a critical...
Jiaxiang Liu, Jiawei Du, Xiao Liu +2 more
Pre-trained vision-language models (VLMs) such as CLIP have demonstrated strong zero-shot capabilities across diverse domains, yet remain highly...
Devon A. Kelly, Christiana Chamon
Wide-bandgap (WBG) technologies offer unprecedented improvements in power system efficiency, size, and performance, but also introduce unique sensor...
Sarah Ball, Niki Hasrati, Alexander Robey +4 more
Discrete optimization-based jailbreaking attacks on large language models aim to generate short, nonsensical suffixes that, when appended onto input...
Soham Hans, Stacy Marsella, Sophia Hirschmann +1 more
Understanding adversarial behavior in cybersecurity has traditionally relied on high-level intelligence reports and manual interpretation of attack...
Austin Jia, Avaneesh Ramesh, Zain Shamsi +2 more
Retrieval-Augmented Generation (RAG) has emerged as the dominant architectural pattern to operationalize Large Language Model (LLM) usage in Cyber...
Daniel Gilkarov, Ran Dubin
Pretrained deep learning model sharing holds tremendous value for researchers and enterprises alike. It allows them to apply deep learning by...
Tushar Nayan, Ziqi Zhang, Ruimin Sun
With the increasing deployment of Large Language Models (LLMs) on mobile and edge platforms, securing them against model extraction attacks has...
Petar Radanliev
Problem Space: AI Vulnerabilities and Quantum Threats Generative AI vulnerabilities: model inversion, data poisoning, adversarial inputs. Quantum...
Yushi Yang, Shreyansh Padarha, Andrew Lee +1 more
Agentic reinforcement learning (RL) trains large language models to autonomously call tools during reasoning, with search as the most common...
Elias Hossain, Swayamjit Saha, Somshubhra Roy +1 more
Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference...
Jie Zhang, Meng Ding, Yang Liu +2 more
We present a novel approach for attacking black-box large language models (LLMs) by exploiting their ability to express confidence in natural...
AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.
AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.
Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.
Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.
Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.
Start 14-Day Free Trial