AI Security Research

2,589+ academic papers on AI security, attacks, and defenses

Total: 2,589
Attack: 998
Benchmark: 740
Defense: 355
Tool: 276
Survey: 147

Showing 1161–1180 of 2,589 papers

Defense LOW

From Detection to Prevention: Explaining Security-Critical Code to Avoid Vulnerabilities

Ranjith Krishnamurthy, Oshando Johnson, Goran Piskachev +1 more

Security vulnerabilities often arise unintentionally during development due to a lack of security expertise and code complexity. Traditional tools,...

3 months ago cs.CR cs.AI cs.SE PDF

Attack HIGH

Jailbreaking LLMs via Calibration

Yuxuan Lu, Yongkang Guo, Yuqing Kong

Safety alignment in Large Language Models (LLMs) often creates a systematic discrepancy between a model's aligned output and the underlying...

3 months ago cs.CL cs.AI cs.CR PDF

Tool HIGH

DECEIVE-AFC: Adversarial Claim Attacks against Search-Enabled LLM-based Fact-Checking Systems

Haoran Ou, Kangjie Chen, Gelei Deng +4 more

Fact-checking systems with search-enabled large language models (LLMs) have shown strong potential for verifying claims by dynamically retrieving...

3 months ago cs.CR cs.AI PDF

Tool MEDIUM

When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems

Naen Xu, Hengyu An, Shuo Shi +7 more

Recent advancements in large language models (LLMs) have significantly enhanced the capabilities of collaborative multi-agent systems, enabling them...

3 months ago cs.CL cs.AI cs.CR PDF

Attack HIGH

Text is All You Need for Vision-Language Model Jailbreaking

Yihang Chen, Zhao Xu, Youyuan Jiang +2 more

Large Vision-Language Models (LVLMs) are increasingly equipped with robust safety safeguards to prevent responses to harmful or disallowed prompts....

3 months ago cs.CV cs.AI cs.CR PDF

Defense MEDIUM

A Fragile Guardrail: Diffusion LLM's Safety Blessing and Its Failure Mode

Zeyuan He, Yupeng Chen, Lang Lin +7 more

Diffusion large language models (D-LLMs) offer an alternative to autoregressive LLMs (AR-LLMs) and have demonstrated advantages in generation...

3 months ago cs.LG PDF

Attack HIGH

"Someone Hid It": Query-Agnostic Black-Box Attacks on LLM-Based Retrieval

Jiate Li, Defu Cao, Li Li +8 more

Large language models (LLMs) have been serving as effective backbones for retrieval systems, including Retrieval-Augmentation-Generation (RAG), Dense...

3 months ago cs.CR PDF

Attack HIGH

Optimal Transport-Guided Adversarial Attacks on Graph Neural Network-Based Bot Detection

Kunal Mukherjee, Zulfikar Alom, Tran Gia Bao Ngo +2 more

The rise of bot accounts on social media poses significant risks to public discourse. To address this threat, modern bot detectors increasingly rely...

3 months ago cs.LG cs.AI cs.CR PDF

Survey HIGH

Semantics-Preserving Evasion of LLM Vulnerability Detectors

Luze Sun, Alina Oprea, Eric Wong

LLM-based vulnerability detectors are increasingly deployed in security-critical code review, yet their resilience to evasion under...

3 months ago cs.CR cs.AI cs.LG PDF

Defense MEDIUM

Assessing Domain-Level Susceptibility to Emergent Misalignment from Narrow Finetuning

Abhishek Mishra, Mugilan Arulvanan, Reshma Ashok +3 more

Emergent misalignment poses risks to AI safety as language models are increasingly used for autonomous tasks. In this paper, we present a population...

3 months ago cs.AI PDF

Benchmark LOW

LogicGaze: Benchmarking Causal Consistency in Visual Narratives via Counterfactual Verification

Rory Driscoll, Alexandros Christoforos, Chadbourne Davis

While sequential reasoning enhances the capability of Vision-Language Models (VLMs) to execute complex multimodal tasks, their reliability in...

3 months ago cs.CV cs.AI PDF

Attack HIGH

Now You Hear Me: Audio Narrative Attacks Against Large Audio-Language Models

Ye Yu, Haibo Jin, Yaoning Yu +2 more

Large audio-language models increasingly operate on raw speech inputs, enabling more seamless integration across domains such as voice assistants,...

3 months ago cs.CL cs.AI cs.CR PDF

Defense MEDIUM

Tri-LLM Cooperative Federated Zero-Shot Intrusion Detection with Semantic Disagreement and Trust-Aware Aggregation

Saeid Jamshidi, Omar Abdul Wahab, Foutse Khomh +1 more

Federated learning (FL) has become an effective paradigm for privacy-preserving, distributed Intrusion Detection Systems (IDS) in cyber-physical and...

3 months ago cs.CR cs.AI PDF

Defense MEDIUM

RAudit: A Blind Auditing Protocol for Large Language Model Reasoning

Edward Y. Chang, Longling Geng

Inference-time scaling can amplify reasoning pathologies: sycophancy, rung collapse, and premature certainty. We present RAudit, a diagnostic...

3 months ago cs.AI PDF

Tool LOW

Secure Tool Manifest and Digital Signing Solution for Verifiable MCP and LLM Pipelines

Saeid Jamshidi, Kawser Wazed Nafi, Arghavan Moradi Dakhel +3 more

Large Language Models (LLMs) are increasingly adopted in sensitive domains such as healthcare and financial institutions' data analytics; however,...

3 months ago cs.CR cs.AI PDF

Attack MEDIUM

WiFiPenTester: Advancing Wireless Ethical Hacking with Governed GenAI

Haitham S. Al-Sinani, Chris J. Mitchell

Wireless ethical hacking relies heavily on skilled practitioners manually interpreting reconnaissance results and executing complex, time-sensitive...

3 months ago cs.CR cs.AI PDF

Attack HIGH

From Similarity to Vulnerability: Key Collision Attack on LLM Semantic Caching

Zhixiang Zhang, Zesen Liu, Yuchong Xie +2 more

Semantic caching has emerged as a pivotal technique for scaling LLM applications, widely adopted by major providers including AWS and Microsoft. By...

3 months ago cs.CR cs.AI PDF

Defense MEDIUM

Character as a Latent Variable in Large Language Models: A Mechanistic Account of Emergent Misalignment and Conditional Safety Failures

Yanghao Su, Wenbo Zhou, Tianwei Zhang +4 more

Emergent Misalignment refers to a failure mode in which fine-tuning large language models (LLMs) on narrowly scoped data induces broadly misaligned...

3 months ago cs.CL cs.AI cs.CR PDF

Benchmark LOW

SolAgent: A Specialized Multi-Agent Framework for Solidity Code Generation

Wei Chen, Zhiyuan Peng, Xin Yin +4 more

Smart contracts are the backbone of the decentralized web, yet ensuring their functional correctness and security remains a critical challenge. While...

3 months ago cs.SE PDF

Benchmark HIGH

Sifting the Noise: A Comparative Study of LLM Agents in Vulnerability False Positive Filtering

Yunpeng Xiong, Ting Zhang

Static Application Security Testing (SAST) tools are essential for identifying software vulnerabilities, but they often produce a high volume of...

3 months ago cs.SE PDF
