AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total: 2,529
Attack: 969
Benchmark: 729
Defense: 345
Tool: 272
Survey: 142

Showing 281–296 of 296 papers

Attack HIGH

INTARG: Informed Real-Time Adversarial Attack Generation for Time-Series Regression

Gamze Kirman Tokgoz, Onat Gungor, Tajana Rosing +1 more

Time-series forecasting aims to predict future values by modeling temporal dependencies in historical observations. It is a critical component of...

4 weeks ago cs.LG cs.CR PDF

Defense MEDIUM

Detecting Safety Violations Across Many Agent Traces

Adam Stein, Davis Brown, Hamed Hassani +2 more

To identify safety violations, auditors often search over large sets of agent traces. This search is difficult because failures are often rare,...

4 weeks ago cs.AI cs.CL PDF

Tool HIGH

ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection

Wei Zhao, Zhe Li, Peixin Zhang +1 more

Tool-augmented Large Language Model (LLM) agents have demonstrated impressive capabilities in automating complex, multi-step real-world tasks, yet...

4 weeks ago cs.CR cs.AI PDF

Benchmark MEDIUM

Towards Automated Pentesting with Large Language Models

Ricardo Bessa, Rui Claro, João Trindade +1 more

Large Language Models (LLMs) are redefining offensive cybersecurity by allowing the generation of harmful machine code with minimal human...

4 weeks ago cs.CR PDF

Benchmark LOW

DreamKG: A KG-Augmented Conversational System for People Experiencing Homelessness

Javad M Alizadeh, Genhui Zheng, Chiu C Tan +7 more

People experiencing homelessness (PEH) face substantial barriers to accessing timely, accurate information about community services. DreamKG...

4 weeks ago cs.AI PDF

Defense MEDIUM

LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety

Junxiao Yang, Haoran Liu, Jinzhe Tu +9 more

Large language models (LLMs) often demonstrate strong safety performance in high-resource languages, yet exhibit severe vulnerabilities when queried...

4 weeks ago cs.LG cs.AI cs.CL PDF

Defense LOW

SemaClaw: A Step Towards General-Purpose Personal AI Agents through Harness Engineering

Ningyan Zhu, Huacan Wang, Jie Zhou +8 more

The rise of OpenClaw in early 2026 marks the moment when millions of users began deploying personal AI agents into their daily lives, delegating...

4 weeks ago cs.AI PDF

Benchmark MEDIUM

RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience

Hanbo Huang, Xuan Gong, Yiran Zhang +2 more

Large language model (LLM) watermarking has emerged as a promising approach for detecting and attributing AI-generated text, yet its robustness to...

4 weeks ago cs.CR PDF

Benchmark LOW

From Translation to Superset: Benchmark-Driven Evolution of a Production AI Agent from Rust to Python

Jinhua Wang, Biswa Sengupta

Cross-language migration of large software systems is a persistent engineering challenge, particularly when the source codebase evolves rapidly. We...

4 weeks ago cs.SE cs.AI PDF

Benchmark MEDIUM

RedShell: A Generative AI-Based Approach to Ethical Hacking

Ricardo Bessa, Rui Claro, João Trindade +1 more

The application of Machine Learning techniques in code generation is now a common practice for most developers. Tools such as ChatGPT from OpenAI...

4 weeks ago cs.CR PDF

Benchmark LOW

Beyond RAG for Cyber Threat Intelligence: A Systematic Evaluation of Graph-Based and Agentic Retrieval

Dzenan Hamzic, Florian Skopik, Max Landauer +2 more

Cyber threat intelligence (CTI) analysts must answer complex questions over large collections of narrative security reports. Retrieval-augmented...

4 weeks ago cs.AI cs.CR PDF

Other MEDIUM

CLASP: Closed-loop Asynchronous Spatial Perception for Open-vocabulary Desktop Object Grasping

Yiran Ling, Wenxuan Li, Siying Dong +5 more

Robot grasping of desktop objects is widely used in intelligent manufacturing, logistics, and agriculture. Although vision-language models (VLMs) show...

4 weeks ago cs.RO PDF

Tool HIGH

The Salami Slicing Threat: Exploiting Cumulative Risks in LLM Systems

Yihao Zhang, Kai Wang, Jiangrong Wu +7 more

Large Language Models (LLMs) face prominent security risks from jailbreaking, a practice that manipulates models to bypass built-in security...

4 weeks ago cs.CR cs.AI cs.CL PDF

Attack LOW

Dialectic-Med: Mitigating Diagnostic Hallucinations via Counterfactual Adversarial Multi-Agent Debate

Zhixiang Lu, Jionglong Su

Multimodal Large Language Models (MLLMs) in healthcare suffer from severe confirmation bias, often hallucinating visual details to support initial,...

4 weeks ago cs.CL PDF

Attack HIGH

QShield: Securing Neural Networks Against Adversarial Attacks using Quantum Circuits

Navid Azimi, Aditya Prakash, Yao Wang +1 more

Deep neural networks remain highly vulnerable to adversarial perturbations, limiting their reliability in security- and safety-critical applications....

4 weeks ago cs.CR cs.AI cs.CV PDF

Attack MEDIUM

Beyond A Fixed Seal: Adaptive Stealing Watermark in Large Language Models

Shuhao Zhang, Yuli Chen, Jiale Han +2 more

Watermarking provides a critical safeguard for large language model (LLM) services by facilitating the detection of LLM-generated text....

4 weeks ago cs.CR cs.AI PDF
