AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total

2,529

Attack

969

Benchmark

729

Defense

345

Tool

272

Survey

142

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 261–280 of 295 papers

Clear filters

Benchmark MEDIUM

Parallax: Why AI Agents That Think Must Never Act

Joel Fokou

Autonomous AI agents are rapidly transitioning from experimental tools to operational infrastructure, with projections that 80% of enterprise...

4 weeks ago cs.CR cs.AI PDF

Attack HIGH

Challenging Vision-Language Models with Physically Deployable Multimodal Semantic Lighting Attacks

Yingying Zhao, Chengyin Hu, Qike Zhang +7 more

Vision-Language Models (VLMs) have shown remarkable performance, yet their security remains insufficiently understood. Existing adversarial studies...

4 weeks ago cs.CV PDF

Attack MEDIUM

Understanding and Improving Continuous Adversarial Training for LLMs via In-context Learning Theory

Shaopeng Fu, Di Wang

Adversarial training (AT) is an effective defense for large language models (LLMs) against jailbreak attacks, but performing AT on LLMs is costly. To...

4 weeks ago cs.LG cs.CR stat.ML PDF

Attack MEDIUM

Robust Semi-Supervised Temporal Intrusion Detection for Adversarial Cloud Networks

Anasuya Chattopadhyay, Daniel Reti, Hans D. Schotten

Cloud networks increasingly rely on machine learning based Network Intrusion Detection Systems to defend against evolving cyber threats. However,...

4 weeks ago cs.LG cs.CR PDF

Attack HIGH

Every Picture Tells a Dangerous Story: Memory-Augmented Multi-Agent Jailbreak Attacks on VLMs

Jianhao Chen, Haoyang Chen, Hanjie Zhao +2 more

The rapid evolution of Vision-Language Models (VLMs) has catalyzed unprecedented capabilities in artificial intelligence; however, this continuous...

4 weeks ago cs.AI cs.MM PDF

Attack MEDIUM

LLM-Guided Prompt Evolution for Password Guessing

Vladimir A. Mazin, Mikhail A. Zorin, Dmitrii S. Korzh +3 more

Passwords still remain a dominant authentication method, yet their security is routinely subverted by predictable user choices and large-scale...

4 weeks ago cs.CR cs.AI PDF

Attack HIGH

DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection

Junyu Ren, Xingjian Pan, Wensheng Gan +1 more

Prompt injection has emerged as a critical security threat to large language models (LLMs), yet existing studies predominantly focus on...

4 weeks ago cs.CR PDF

Benchmark MEDIUM

VeriX-Anon: A Multi-Layered Framework for Mathematically Verifiable Outsourced Target-Driven Data Anonymization

Miit Daga, Swarna Priya Ramu

Organisations increasingly outsource privacy-sensitive data transformations to cloud providers, yet no practical mechanism lets the data owner verify...

4 weeks ago cs.CR cs.DB cs.LG PDF

Attack HIGH

Reading Between the Pixels: Linking Text-Image Embedding Alignment to Typographic Attack Success on Vision-Language Models

Ravikumar Balakrishnan, Sanket Mendapara, Ankit Garg

We study typographic prompt injection attacks on vision-language models (VLMs), where adversarial text is rendered as images to bypass safety...

4 weeks ago cs.CV PDF

Benchmark MEDIUM

Compiling Activation Steering into Weights via Null-Space Constraints for Stealthy Backdoors

Rui Yin, Tianxu Han, Naen Xu +8 more

Safety-aligned large language models (LLMs) are increasingly deployed in real-world pipelines, yet this deployment also enlarges the supply-chain...

4 weeks ago cs.CR cs.CL PDF

Other LOW

GAM: Hierarchical Graph-based Agentic Memory for LLM Agents

Zhaofen Wu, Hanrong Zhang, Fulin Lin +9 more

To sustain coherent long-term interactions, Large Language Model (LLM) agents must navigate the tension between acquiring new information and...

4 weeks ago cs.AI PDF

Attack HIGH

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents

Yulin Chen, Tri Cao, Haoran Li +7 more

Web agents powered by vision-language models (VLMs) enable autonomous interaction with web environments by perceiving and acting on both visual and...

4 weeks ago cs.CR PDF

Attack HIGH

TEMPLATEFUZZ: Fine-Grained Chat Template Fuzzing for Jailbreaking and Red Teaming LLMs

Qingchao Shen, Zibo Xiao, Lili Huang +3 more

Large Language Models (LLMs) are increasingly deployed across diverse domains, yet their vulnerability to jailbreak attacks, where adversarial inputs...

4 weeks ago cs.CR cs.AI cs.SE PDF

Benchmark MEDIUM

From IOCs to Regex: Automating CTI Operationalization for SOC with LLMs

Pei-Yu Tseng, Lan Zhang, ZihDwo Yeh +3 more

Cyber Threat Intelligence (CTI) reports contain Indicators of Compromise (IOCs) that are critical for security operations. To operationalize these...

4 weeks ago cs.CR PDF

Tool MEDIUM

TimeMark: A Trustworthy Time Watermarking Framework for Exact Generation-Time Recovery from AIGC

Shangkun Che, Silin Du, Ge Gao

The widespread use of Large Language Models (LLMs) in text generation has raised increasing concerns about intellectual property disputes....

4 weeks ago cs.CR cs.CL PDF

Attack MEDIUM

AdversarialCoT: Single-Document Retrieval Poisoning for LLM Reasoning

Hongru Song, Yu-An Liu, Ruqing Zhang +4 more

Retrieval-augmented generation (RAG) enhances large language model (LLM) reasoning by retrieving external documents, but also opens up new attack...

4 weeks ago cs.IR PDF

Attack HIGH

COBALT-TLA: A Neuro-Symbolic Verification Loop for Cross-Chain Bridge Vulnerability Discovery

Dominik Blain

We present COBALT-TLA, a neuro-symbolic verification loop that pairs an LLM with TLC, the TLA+ model checker, in an automated REPL. The LLM generates...

4 weeks ago cs.CR cs.LO PDF

Attack MEDIUM

Fully Homomorphic Encryption on Llama 3 model for privacy preserving LLM inference

Anes Abdennebi, Nadjia Kara, Laaziz Lahlou

The applications of Generative Artificial Intelligence (GenAI) and their intersections with data-driven fields, such as healthcare, finance,...

4 weeks ago cs.CR cs.AI PDF

Defense MEDIUM

Towards Platonic Representation for Table Reasoning: A Foundation for Permutation-Invariant Retrieval

Willy Carlos Tchuitcheu, Tan Lu, Ann Dooms

Historical approaches to Table Representation Learning (TRL) have largely adopted the sequential paradigms of Natural Language Processing (NLP). We...

4 weeks ago cs.AI PDF

Defense LOW

A longitudinal health agent framework

Georgianna, Lin, Rencong Jiang +2 more

Although artificial intelligence (AI) agents are increasingly proposed to support potentially longitudinal health tasks, such as symptom management,...

4 weeks ago cs.AI cs.HC PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial