Attack HIGH
Yue Li, Xiao Li, Hao Wu +5 more
Large Language Models (LLMs) are increasingly used for automated software development, making their ability to preserve secure coding practices...
Yesterday cs.CR cs.SE
Attack HIGH
Huilin Zhou, Jian Zhao, Yilu Zhong +7 more
Red teaming is critical for uncovering vulnerabilities in Large Language Models (LLMs). While automated methods have improved scalability, existing...
Yesterday cs.LG cs.AI
Benchmark MEDIUM
Xia Hu, Zhenrui Yue, Brian Potetz +4 more
As current Multimodal Large Language Models rapidly saturate canonical visual reasoning benchmarks, a key question emerges: do these strong scores...
Yesterday cs.CV cs.AI
Attack MEDIUM
Ben Kereopa-Yorke, Guillermo Diaz, Holly Wright +3 more
We define Oracle Poisoning, an attack class in which an adversary corrupts a structured knowledge graph that AI agents query at runtime via tool-use...
2 days ago cs.CR cs.AI
Attack MEDIUM
Li Lixing
Modern large language models (LLMs) rely on system prompts to establish behavioral constraints and safety rules. Standard causal self-attention...
Survey HIGH
Monika Jotautaitė, Maria Angelica Martinez, Ollie Matthews +1 more
We introduce a red-teaming methodology that exposes harder-to-catch attacks for coding-agent monitors, suggesting that current practices may...
2 days ago cs.CR cs.AI
Benchmark MEDIUM
Huy Hoang Ha, Benoit Favre, Francois Portet
Large language models (LLMs) have saturated standard medical benchmarks that test factual recall, yet their ability to perform higher-order...
2 days ago cs.CL cs.AI
Benchmark MEDIUM
Jingshen Zhang, Bo Wang, Yanlin Fu +4 more
In this paper, we study emergent self-debiasing mechanisms against stereotypical content in Large Language Models (LLMs). Unlike traditional...
Attack HIGH
Yiyong Liu, Chia-Yi Hsu, Chun-Ying Huang +3 more
LLM-powered coding agents increasingly make software supply chain decisions. They generate imports, recommend packages, and write installation...
Tool MEDIUM
Michael A. Riegler, Inga Strümke
We present swarm-attack, an open-source adversarial testing framework in which multiple lightweight LLM agents coordinate through shared memory,...
2 days ago cs.CR cs.AI cs.LG
Benchmark MEDIUM
Yilin Zhang, Yingkai Hua, Chunyu Wei +2 more
Vision-language model (VLM) based web agents demonstrate impressive autonomous GUI interaction but remain vulnerable to deceptive interface elements....
2 days ago cs.AI cs.CR
Defense HIGH
Wenxin Tang, Xiang Zhang, Junliang Liu +11 more
Automated vulnerability detection is a fundamental task in software security, yet existing learning-based methods still struggle to capture the...
Benchmark HIGH
Shai Feldman, Yaniv Romano
Evaluating and predicting the performance of large language models (LLMs) in multi-turn conversational settings is critical yet computationally...
Attack MEDIUM
Isaac David, Arthur Gervais
Security updates create a short but important window in which defenders and attackers can compare vulnerable and patched software. Yet in many...
5 days ago cs.CR cs.AI
Benchmark HIGH
Mohammad Mamun, Mohamed Gaber, Scott Buffett +1 more
Language Model Agents (LMAs) are emerging as a powerful primitive for augmenting red-team operations. They can support attack planning, adversary...
Other LOW
Sneha Oram, Ojaswita Bhushan, Pushpak Bhattacharyya
In this work, we conduct an analysis to examine the consistency of Large Language Models (LLMs) with respect to their own generated responses in an...
Attack HIGH
Zeyuan Chen, Yihan Ma, Xinyue Shen +2 more
Large language models (LLMs) show strong performance across many applications, but their ability to memorize and potentially reveal training data...
Benchmark MEDIUM
Di Lu, Bo Zhang, Xiyuan Li +5 more
Self-hosted computer-use agents (SHCUAs), such as OpenClaw, combine natural-language interaction with direct access to host-side resources, including...
Defense LOW
Aleksandr Bowkis, Marie Davidsen Buhl, Jacob Pfau +1 more
A leading proposal for aligning artificial superintelligence (ASI) is to use AI agents to automate an increasing fraction of alignment research as...
Tool MEDIUM
Chengjie Wang, Jingzheng Wu, Xiang Ling +2 more
Large language models (LLMs) are now deeply involved in software development workflows, and the code they generate routinely includes third-party...
5 days ago cs.SE cs.AI