AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total: 2,560
Attack: 982
Benchmark: 736
Defense: 350
Tool: 275
Survey: 144

Showing 201–220 of 1,220 papers

Benchmark MEDIUM

Bounded by Risk, Not Capability: Quantifying AI Occupational Substitution Rates via a Tech-Risk Dual-Factor Model

Shuyao Gao, Minghao Huang

The deployment of Large Language Models (LLMs) has ignited concerns about technological unemployment. Existing task-based evaluations predominantly...

1 month ago cs.CY econ.GN PDF

Tool MEDIUM

LLM-Enabled Open-Source Systems in the Wild: An Empirical Study of Vulnerabilities in GitHub Security Advisories

Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou +2 more

Large language models (LLMs) are increasingly embedded in open-source software (OSS) ecosystems, creating complex interactions among natural language...

1 month ago cs.CR cs.SE PDF

Attack MEDIUM

Semantics Over Syntax: Uncovering Pre-Authentication 5G Baseband Vulnerabilities

Qiqing Huang, Xingyu Wang, Wanda Guo +2 more

Modern 5G user equipment (UE) processes Radio Resource Control (RRC) configuration messages during early control-plane exchanges, before...

1 month ago cs.CR PDF

Attack MEDIUM

Towards Unveiling Vulnerabilities of Large Reasoning Models in Machine Unlearning

Aobo Chen, Chenxu Zhao, Chenglin Miao +1 more

Large language models (LLMs) possess strong semantic understanding, driving significant progress in data mining applications. This is further...

1 month ago cs.LG cs.CR PDF

Benchmark MEDIUM

Quantifying Self-Preservation Bias in Large Language Models

Matteo Migliarini, Joaquin Pereira Pizzini, Luca Moresca +3 more

Instrumental convergence predicts that sufficiently advanced AI agents will resist shutdown, yet current safety training (RLHF) may obscure this risk...

1 month ago cs.AI PDF

Attack MEDIUM

AEGIS: Adversarial Entropy-Guided Immune System -- Thermodynamic State Space Models for Zero-Day Network Evasion Detection

Vickson Ferrel

As TLS 1.3 encryption limits traditional Deep Packet Inspection (DPI), the security community has pivoted to Euclidean Transformer-based classifiers...

1 month ago cs.CR cs.LG PDF

Tool MEDIUM

MTI: A Behavior-Based Temperament Profiling System for AI Agents

Jihoon Jeong

AI models of equivalent capability can exhibit fundamentally different behavioral patterns, yet no standardized instrument exists to measure these...

1 month ago cs.AI cs.CL PDF

Benchmark MEDIUM

From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

Yiheng Huang, Zhijia Zhao, Bihuan Chen +5 more

The model context protocol (MCP) standardizes how LLMs connect to external tools and data sources, enabling faster integration but introducing new...

1 month ago cs.CR cs.SE PDF

Defense MEDIUM

Assertain: Automated Security Assertion Generation Using Large Language Models

Shams Tarek, Dipayan Saha, Khan Thamid Hasan +3 more

The increasing complexity of modern system-on-chip designs amplifies hardware security risks and makes manual security property specification a major...

1 month ago cs.CR PDF

Benchmark MEDIUM

Cooking Up Risks: Benchmarking and Reducing Food Safety Risks in Large Language Models

Weidi Luo, Xiaofei Wen, Tenghao Huang +5 more

Large language models (LLMs) are increasingly deployed for everyday tasks, including food preparation and health-related guidance. However, food...

1 month ago cs.CR PDF

Defense MEDIUM

ClawSafety: "Safe" LLMs, Unsafe Agents

Bowen Wei, Yunbei Zhang, Jinhao Pan +5 more

Personal AI agents like OpenClaw run with elevated privileges on users' local machines, where a single successful prompt injection can leak...

1 month ago cs.AI PDF

Defense MEDIUM

Safety, Security, and Cognitive Risks in World Models

Manoj Parmar

World models -- learned internal simulators of environment dynamics -- are rapidly becoming foundational to autonomous decision-making in robotics,...

1 month ago cs.CR cs.AI cs.LG PDF

Benchmark MEDIUM

SERSEM: Selective Entropy-Weighted Scoring for Membership Inference in Code Language Models

Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar +2 more

As Large Language Models (LLMs) for code increasingly utilize massive, often non-permissively licensed datasets, evaluating data contamination...

1 month ago cs.SE cs.CR PDF

Defense MEDIUM

Multi-Agent LLM Governance for Safe Two-Timescale Reinforcement Learning in SDN-IoT Defense

Saeid Jamshidi, Negar Shahabi, Foutse Khomh +2 more

Software-Defined Networking (SDN) is increasingly adopted to secure Internet-of-Things (IoT) networks due to its centralized control and programmable...

1 month ago cs.CR PDF

Other MEDIUM

SCPatcher: Automated Smart Contract Code Repair via Retrieval-Augmented Generation and Knowledge Graph

Xiaoqi Li, Shipeng Ye, Wenkai Li +1 more

Smart contract vulnerabilities can cause substantial financial losses due to the immutability of code after deployment. While existing tools detect...

1 month ago cs.SE PDF

Benchmark MEDIUM

EnsembleSHAP: Faithful and Certifiably Robust Attribution for Random Subspace Method

Yanting Wang, Jinyuan Jia

Random subspace method has wide security applications such as providing certified defenses against adversarial and backdoor attacks, and building...

1 month ago cs.CR PDF

Attack MEDIUM

Performative Scenario Optimization

Quanyan Zhu, Zhengye Han

This paper introduces a performative scenario optimization framework for decision-dependent chance-constrained problems. Unlike classical stochastic...

1 month ago cs.GT PDF

Other MEDIUM

BotVerse: Real-Time Event-Driven Simulation of Social Agents

Edoardo Allegrini, Edoardo Di Paolo, Angelo Spognardi +1 more

BotVerse is a scalable, event-driven framework for high-fidelity social simulation using LLM-based agents. It addresses the ethical risks of studying...

1 month ago cs.SI cs.AI cs.MA PDF

Survey MEDIUM

Security in LLM-as-a-Judge: A Comprehensive SoK

Aiman Almasoud, Antony Anju, Marco Arazzi +6 more

LLM-as-a-Judge (LaaJ) is a novel paradigm in which powerful language models are used to assess the quality, safety, or correctness of generated...

1 month ago cs.CR cs.AI PDF

Benchmark MEDIUM

The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning

Yubo Li, Lu Zhang, Tianchong Jiang +2 more

Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We study this through a...

1 month ago cs.CL cs.AI PDF
