AI Security Research

2,589+ academic papers on AI security, attacks, and defenses

Total: 2,589
Attack: 998
Benchmark: 740
Defense: 355
Tool: 276
Survey: 147

Showing 861–880 of 1,931 papers

Attack MEDIUM

Efficient Multi-Party Secure Comparison over Different Domains with Preprocessing Assistance

Kaiwen Wang, Xiaolin Chang, Yuehan Dong +1 more

Secure comparison is a fundamental primitive in multi-party computation, supporting privacy-preserving applications such as machine learning and data...

2 months ago cs.CR PDF

Attack HIGH

VALD: Multi-Stage Vision Attack Detection for Efficient LVLM Defense

Nadav Kadvil, Malak Fares, Ayellet Tal

Large Vision-Language Models (LVLMs) can be vulnerable to adversarial images that subtly bias their outputs toward plausible yet incorrect responses....

2 months ago cs.CV PDF

Attack HIGH

Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains

Xiaochong Jiang, Shiqi Yang, Wenting Yang +2 more

Agentic systems built on large language models (LLMs) extend beyond text generation to autonomously retrieve information and invoke tools. This...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

CIBER: A Comprehensive Benchmark for Security Evaluation of Code Interpreter Agents

Lei Ba, Qinbin Li, Songze Li

LLM-based code interpreter agents are increasingly deployed in critical workflows, yet their robustness against risks introduced by their code...

2 months ago cs.CR PDF

Tool HIGH

Can a Teenager Fool an AI? Evaluating Low-Cost Cosmetic Attacks on Age Estimation Systems

Xingyu Shen, Tommy Duong, Xiaodong An +6 more

Age estimation systems are increasingly deployed as gatekeepers for age-restricted online content, yet their robustness to cosmetic modifications has...

2 months ago cs.CV cs.CR cs.LG PDF

Benchmark MEDIUM

CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions

Jingwei Shi, Xinxiang Yin, Jing Huang +2 more

The evaluation of Large Language Models (LLMs) for code generation relies heavily on the quality and robustness of test cases. However, existing...

2 months ago cs.SE cs.AI cs.CR PDF

Survey HIGH

Red-Teaming Claude Opus and ChatGPT-based Security Advisors for Trusted Execution Environments

Kunal Mukherjee

Trusted Execution Environments (TEEs) (e.g., Intel SGX and Arm TrustZone) aim to protect sensitive computation from a compromised operating system,...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Hiding in Plain Text: Detecting Concealed Jailbreaks via Activation Disentanglement

Amirhossein Farzam, Majid Behabahani, Mani Malek +2 more

Large language models (LLMs) remain vulnerable to jailbreak prompts that are fluent and semantically coherent, and therefore difficult to detect with...

2 months ago cs.AI PDF

Attack HIGH

Prompt Injection as Role Confusion

Charles Ye, Jasmine Cui, Dylan Hadfield-Menell

Language models remain vulnerable to prompt injection attacks despite extensive safety training. We trace this failure to role confusion: models...

2 months ago cs.CL cs.AI cs.CR PDF

Tool MEDIUM

ILION: Deterministic Pre-Execution Safety Gates for Agentic AI Systems

Florin Adrian Chitan

The proliferation of autonomous AI agents capable of executing real-world actions - filesystem operations, API calls, database modifications,...

2 months ago cs.AI cs.CR PDF

Attack HIGH

Dark and Bright Side of Participatory Red-Teaming with Targets of Stereotyping for Eliciting Harmful Behaviors from Large Language Models

Sieun Kim, Yeeun Jo, Sungmin Na +5 more

Red-teaming, where adversarial prompts are crafted to expose harmful behaviors and assess risks, offers a dynamic approach to surfacing underlying...

2 months ago cs.HC PDF

Survey MEDIUM

LLM Scalability Risk for Agentic-AI and Model Supply Chain Security

Kiarash Ahi, Vaibhav Agrawal, Saeed Valizadeh

Large Language Models (LLMs) & Generative AI are transforming cybersecurity, enabling both advanced defenses and new attacks. Organizations now use...

2 months ago cs.CR PDF

Tool MEDIUM

AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

Emmanuel Bamidele

Long-running LLM agents require persistent memory to preserve state across interactions, yet most deployed systems manage memory with age-based...

2 months ago cs.DC cs.AI cs.LG PDF

Attack HIGH

When Backdoors Go Beyond Triggers: Semantic Drift in Diffusion Models Under Encoder Attacks

Shenyang Chen, Liuwan Zhu

Standard evaluations of backdoor attacks on text-to-image (T2I) models primarily measure trigger activation and visual fidelity. We challenge this...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

LoMime: Query-Efficient Membership Inference using Model Extraction in Label-Only Settings

Abdullah Caglar Oksuz, Anisa Halimi, Erman Ayday

Membership inference attacks (MIAs) threaten the privacy of machine learning models by revealing whether a specific data point was used during...

2 months ago cs.LG cs.CR PDF

Defense MEDIUM

MANATEE: Inference-Time Lightweight Diffusion Based Safety Defense for LLMs

Chun Yan Ryan Kan, Tommy Tran, Vedant Yadav +4 more

Defending LLMs against adversarial jailbreak attacks remains an open challenge. Existing defenses rely on binary classifiers that fail when...

2 months ago cs.CR cs.AI cs.CL PDF

Attack HIGH

When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models

Zafir Shamsi, Nikhil Chekuru, Zachary Guzman +1 more

Large Language Models (LLMs) are increasingly integrated into high-stakes applications, making robust safety guarantees a central practical and...

2 months ago cs.CL cs.AI PDF

Benchmark LOW

Many AI Analysts, One Dataset: Navigating the Agentic Data Science Multiverse

Martin Bertran, Riccardo Fogliato, Zhiwei Steven Wu

Empirical conclusions depend not only on data but on analytic decisions made throughout the research process. Many-analyst studies have quantified...

2 months ago cs.AI cs.LG PDF

Benchmark HIGH

FENCE: A Financial and Multimodal Jailbreak Detection Dataset

Mirae Kim, Seonghun Jeong, Youngjun Kwak

Jailbreaking poses a significant risk to the deployment of Large Language Models (LLMs) and Vision Language Models (VLMs). VLMs are particularly...

2 months ago cs.CL cs.AI cs.DB PDF
