AI Security Research

2,545+ academic papers on AI security, attacks, and defenses

Total: 2,545
Attack: 978
Benchmark: 730
Defense: 348
Tool: 274
Survey: 143

Showing 161–180 of 348 papers

Defense LOW

Engineering AI Agents for Clinical Workflows: A Case Study in Architecture, MLOps, and Governance

Cláudio Lúcio do Val Lopes, João Marcus Pitta, Fabiano Belém +2 more

The integration of Artificial Intelligence (AI) into clinical settings presents a software engineering challenge, demanding a shift from isolated...

3 months ago cs.AI cs.SE PDF

Defense LOW

From Detection to Prevention: Explaining Security-Critical Code to Avoid Vulnerabilities

Ranjith Krishnamurthy, Oshando Johnson, Goran Piskachev +1 more

Security vulnerabilities often arise unintentionally during development due to a lack of security expertise and code complexity. Traditional tools,...

3 months ago cs.CR cs.AI cs.SE PDF

Defense MEDIUM

A Fragile Guardrail: Diffusion LLM's Safety Blessing and Its Failure Mode

Zeyuan He, Yupeng Chen, Lang Lin +7 more

Diffusion large language models (D-LLMs) offer an alternative to autoregressive LLMs (AR-LLMs) and have demonstrated advantages in generation...

3 months ago cs.LG PDF

Defense MEDIUM

Assessing Domain-Level Susceptibility to Emergent Misalignment from Narrow Finetuning

Abhishek Mishra, Mugilan Arulvanan, Reshma Ashok +3 more

Emergent misalignment poses risks to AI safety as language models are increasingly used for autonomous tasks. In this paper, we present a population...

3 months ago cs.AI PDF

Defense MEDIUM

Tri-LLM Cooperative Federated Zero-Shot Intrusion Detection with Semantic Disagreement and Trust-Aware Aggregation

Saeid Jamshidi, Omar Abdul Wahab, Foutse Khomh +1 more

Federated learning (FL) has become an effective paradigm for privacy-preserving, distributed Intrusion Detection Systems (IDS) in cyber-physical and...

3 months ago cs.CR cs.AI PDF

Defense MEDIUM

RAudit: A Blind Auditing Protocol for Large Language Model Reasoning

Edward Y. Chang, Longling Geng

Inference-time scaling can amplify reasoning pathologies: sycophancy, rung collapse, and premature certainty. We present RAudit, a diagnostic...

3 months ago cs.AI PDF

Defense MEDIUM

Character as a Latent Variable in Large Language Models: A Mechanistic Account of Emergent Misalignment and Conditional Safety Failures

Yanghao Su, Wenbo Zhou, Tianwei Zhang +4 more

Emergent Misalignment refers to a failure mode in which fine-tuning large language models (LLMs) on narrowly scoped data induces broadly misaligned...

3 months ago cs.CL cs.AI cs.CR PDF

Defense MEDIUM

Hide and Seek in Embedding Space: Geometry-based Steganography and Detection in Large Language Models

Charles Westphal, Keivan Navaie, Fernando E. Rosas

Fine-tuned LLMs can covertly encode prompt secrets into outputs via steganographic channels. Prior work demonstrated this threat but relied on...

3 months ago cs.CR cs.AI PDF

Defense MEDIUM

Okara: Detection and Attribution of TLS Man-in-the-Middle Vulnerabilities in Android Apps with Foundation Models

Haoyun Yang, Ronghong Huang, Yong Fang +4 more

Transport Layer Security (TLS) is fundamental to secure online communication, yet vulnerabilities in certificate validation that enable...

3 months ago cs.CR PDF

Defense MEDIUM

Eliciting Least-to-Most Reasoning for Phishing URL Detection

Holly Trikilis, Pasindu Marasinghe, Fariza Rashid +1 more

Phishing continues to be one of the most prevalent attack vectors, making accurate classification of phishing URLs essential. Recently, large...

3 months ago cs.CR cs.AI PDF

Defense LOW

Semantic Uncertainty Quantification of Hallucinations in LLMs: A Quantum Tensor Network Based Method

Pragatheeswaran Vipulanandan, Kamal Premaratne, Dilip Sarkar

Large language models (LLMs) exhibit strong generative capabilities but remain vulnerable to confabulations, fluent yet unreliable outputs that vary...

3 months ago cs.CL PDF

Defense MEDIUM

From Internal Diagnosis to External Auditing: A VLM-Driven Paradigm for Online Test-Time Backdoor Defense

Binyan Xu, Fan Yang, Xilin Dai +2 more

Deep Neural Networks remain inherently vulnerable to backdoor attacks. Traditional test-time defenses largely operate under the paradigm of internal...

3 months ago cs.LG cs.CR PDF

Defense LOW

An Agentic AI Control Plane for 6G Network Slice Orchestration, Monitoring, and Trading

Eranga Bandara, Ross Gore, Sachin Shetty +9 more

6G networks are expected to be AI-native, intent-driven, and economically programmable, requiring fundamentally new approaches to network slice...

3 months ago cs.NI cs.AI PDF

Defense LOW

Epistemic Traps: Rational Misalignment Driven by Model Misspecification

Xingcheng Xu, Jingjing Qu, Qiaosheng Zhang +4 more

The rapid deployment of Large Language Models and AI agents across critical societal and technical domains is hindered by persistent behavioral...

3 months ago cs.AI cs.CL cs.LG PDF

Defense MEDIUM

Proactive Hardening of LLM Defenses with HASTE

Henry Chen, Victor Aranda, Samarth Keshari +2 more

Prompt-based attack techniques are one of the primary challenges in securely deploying and protecting LLM-based AI systems. LLM inputs are an...

3 months ago cs.CR PDF

Defense HIGH

MulVul: Retrieval-augmented Multi-Agent Code Vulnerability Detection via Cross-Model Prompt Evolution

Zihan Wu, Jie Xu, Yun Peng +2 more

Large Language Models (LLMs) struggle to automate real-world vulnerability detection due to two key limitations: the heterogeneity of vulnerability...

3 months ago cs.SE cs.AI PDF

Defense LOW

V-Loop: Visual Logical Loop Verification for Hallucination Detection in Medical Visual Question Answering

Mengyuan Jin, Zehui Liao, Yong Xia

Multimodal Large Language Models (MLLMs) have shown remarkable capability in assisting disease diagnosis in medical visual question answering (VQA)....

3 months ago cs.CV PDF

Defense MEDIUM

When Personalization Legitimizes Risks: Uncovering Safety Vulnerabilities in Personalized Dialogue Agents

Jiahe Guo, Xiangran Guo, Yulin Hu +8 more

Long-term memory enables large language model (LLM) agents to support personalized and sustained interactions. However, most work on personalized...

3 months ago cs.AI PDF

Defense MEDIUM

SafeThinker: Reasoning about Risk to Deepen Safety Beyond Shallow Alignment

Xianya Fang, Xianying Luo, Yadong Wang +8 more

Despite the intrinsic risk-awareness of Large Language Models (LLMs), current defenses often result in shallow safety alignment, rendering models...

3 months ago cs.CR cs.AI PDF

Defense LOW

Do VLMs Have a Moral Backbone? A Study on the Fragile Morality of Vision-Language Models

Zhining Liu, Tianyi Wang, Xiao Lin +9 more

Despite substantial efforts toward improving the moral alignment of Vision-Language Models (VLMs), it remains unclear whether their ethical judgments...

3 months ago cs.CY cs.AI cs.CL PDF
