A Deterministic Control Plane for LLM Coding Agents
Padmaraj Madatha
LLM coding harnesses grant agents broad file and shell access, yet the configuration layer that steers them -- rules files, agent definitions,...
AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.
Praneet Suresh, Jack Stanley, Sonia Joseph +2 more
Pre-trained transformers have demonstrated remarkable generalization abilities, at times extending beyond the scope of their training data. Yet,...
William Aiken, Paula Branco, Guy-Vincent Jourdan +1 more
Noise-based backdoor attacks on diffusion models typically rely on input-time trigger injection, untargeted activation, and out-of-distribution...
Poojitha Thota, Shirin Nilizadeh
Training-time data poisoning during fine-tuning poses a significant threat to large language models (LLMs) deployed for abstractive text...
Nasrin Malekzadeh Goradel, Niccolo Pancino, Yaser Gholizade Atani +3 more
Several theoretical works have tried to explain the adversarial vulnerability of deep neural networks through properties of high-dimensional...
Anastasiia Kucherenko, François Brouchoud, Dimitri Percia David +1 more
While the validity of LLMs' use in the legal context remains subject to ethical and legal debate, legal professionals are already experimenting with...
Juho Park, Hyunmin Choi, Kevin Nam
AI security agents increasingly rely on Retrieval-Augmented Generation (RAG) to use external security knowledge for vulnerability analysis and...
Yedidel Louck
LLM agents increasingly rely on persistent long-term memory, which creates a critical vulnerability that we study here: memory poisoning. An...
Hyunji Nam, Keertana Chidambaram, Dorottya Demszky +1 more
While in-context learning is generally shown to be effective in Large Language Models (LLMs), bad contexts can cause performance degradation and mode...
Matan Ben-Tov, Mahmood Sharif
Discrete text-trigger optimization -- searching for text sequences that, when ingested by a model, steer it toward a specified objective -- underpins...
Jaehyuk Jang, Minseok Seo, Seungju Cho, Kangwook Ko +1 more
Vision-language models (VLMs) achieve strong zero-shot recognition, but they remain highly vulnerable to adversarial perturbations. Recent test-time...
Prashanti Nilayam, Kiran Kumar Ramanna, Prashil Tumbade +1 more
Heterogeneous LLM debate is motivated by the promise that diverse peers correct one another, but the same exchange that carries correction also...
Zunchen Huang, Songgaojun Deng
Formal tools such as SAT and SMT solvers are increasingly embedded in language model reasoning pipelines when a safety or security critical question...
Yibin Hu, Xiaolin Sun, Zizhan Zheng
Model-based learning agents use learned world models to predict future states, plan actions, and adapt to new environments. However, the process of...
Mufei Li, Shikun Liu, Dongqi Fu +5 more
Post-hoc context erasing over the KV cache is challenging because a local edit has a global consequence: once a span has been processed, its...
Ziniu Liu, Aiping Li
When a person's records appear in k independent data silos, each protected by (epsilon, delta)-differential privacy, standard composition yields a...
Sipeng Xie, Qianhong Wu, Hengrun Lu +4 more
Agents increasingly access large language models (LLMs) through API routers. A router terminates the client's transport-layer security session and...
Hao-Hsuan Chen
Paper A defines a time-consistent actuarial runtime that prices each side-effect-bearing action against a contractually fixed safe default and gates...
Liuyang Yao, Zhouyu Li, Junguang He +1 more
AI systems are increasingly deployed for credit assessment and investment advisory in global financial markets, yet the integrity of their inference...
Achraf Hsain, Sultan Almuhammadi
Shielded reinforcement learning is typically presented as a runtime safety mechanism that compiles temporal-logic specifications into automata...
AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.
AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.
Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.
Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.
Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.