Benchmark MEDIUM
Yixuan Liu, Xinlei Li, Yi Li
Phishing attacks in Web3 ecosystems are increasingly sophisticated, exploiting deceptive contract logic, malicious frontend scripts, and token...
Attack MEDIUM
Yushi Yang, Shreyansh Padarha, Andrew Lee +1 more
Agentic reinforcement learning (RL) trains large language models to autonomously call tools during reasoning, with search as the most common...
Tool MEDIUM
Rishi Jha, Harold Triedman, Justin Wagle +1 more
Control-flow hijacking attacks manipulate the orchestration mechanisms of multi-agent systems into performing unsafe actions that compromise the system...
6 months ago cs.LG cs.CR eess.SY
Defense MEDIUM
Runlin Lei, Lu Yi, Mingguo He +4 more
While Graph Neural Networks (GNNs) and Large Language Models (LLMs) are powerful approaches for learning on Text-Attributed Graphs (TAGs), a...
Attack MEDIUM
Elias Hossain, Swayamjit Saha, Somshubhra Roy +1 more
Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference...
6 months ago cs.CR cs.AI
Defense MEDIUM
Qiusi Zhan, Angeline Budiman-Chan, Abdelrahman Zayed +3 more
Large language model (LLM) based search agents iteratively generate queries, retrieve external information, and reason to answer open-domain...
Defense MEDIUM
Bo-Han Feng, Chien-Feng Liu, Yu-Hsuan Li Liang +9 more
Large audio-language models (LALMs) extend text-based LLMs with auditory understanding, offering new opportunities for multimodal applications. While...
6 months ago cs.SD cs.AI cs.CL
Tool MEDIUM
Yue Liu, Zhenchang Xing, Shidong Pan +1 more
In recent years, the AI wave has swept rapidly through software development. Even novice developers can now design and generate complex...
6 months ago cs.SE cs.CR
Attack MEDIUM
Jie Zhang, Meng Ding, Yang Liu +2 more
We present a novel approach for attacking black-box large language models (LLMs) by exploiting their ability to express confidence in natural...
6 months ago cs.CR cs.LG
Attack MEDIUM
Asmita Mohanty, Gezheng Kang, Lei Gao +1 more
Large Language Models (LLMs) have demonstrated strong performance across diverse tasks, but fine-tuning them typically relies on cloud-based,...
6 months ago cs.CR cs.LG
Benchmark MEDIUM
Shivam Ratnakar, Sanjay Raghavendra
The integration of Large Language Models with search/retrieval engines has become ubiquitous, yet these systems harbor a critical vulnerability that...
6 months ago cs.CL cs.AI
Tool MEDIUM
Xiaofan Li, Xing Gao
The Model Context Protocol (MCP) is an emerging open standard that enables AI-powered applications to interact with external tools through structured...
6 months ago cs.CR cs.AI
Benchmark MEDIUM
David Peer, Sebastian Stabinger
Large Language Models (LLMs) have demonstrated impressive capabilities, yet their deployment in high-stakes domains is hindered by inherent...
6 months ago cs.CL cs.AI
Benchmark MEDIUM
Shuai Li, Kejiang Chen, Jun Jiang +5 more
Large Language Models (LLMs) have demonstrated remarkable capabilities, but their training requires extensive data and computational resources,...
Attack MEDIUM
Sarah Egler, John Schulman, Nicholas Carlini
Large Language Model (LLM) providers expose fine-tuning APIs that let end users fine-tune their frontier LLMs. Unfortunately, it has been shown that...
6 months ago cs.CR cs.AI
Defense MEDIUM
Yang Feng, Xudong Pan
Malicious agents pose significant threats to the reliability and decision-making capabilities of Multi-Agent Systems (MAS) powered by Large Language...
6 months ago cs.CR cs.AI
Defense MEDIUM
Eduard Andrei Cristea, Petter Molnes, Jingyue Li
Malicious software attacks are having an increasingly significant economic impact. Commercial malware detection software can be costly, and tools...
6 months ago cs.CR cs.SE
Defense MEDIUM
Yuexiao Liu, Lijun Li, Xingjun Wang +1 more
Recent advancements in Reinforcement Learning with Verifiable Rewards (RLVR) have gained significant attention due to their objective and verifiable...
Survey MEDIUM
Hanbin Hong, Shuya Feng, Nima Naderloui +6 more
Large Language Models (LLMs) have rapidly become integral to real-world applications, powering services across diverse sectors. However, their...
6 months ago cs.CR cs.AI