WAREX: Web Agent Reliability Evaluation on Existing Benchmarks
Su Kara, Fazle Faisal, Suman Nath
Recent advances in browser-based LLM agents have shown promise for automating tasks ranging from simple form filling to hotel booking or online...
AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.
Showing 1421–1440 of 1,455 papers
Yihan Wu, Ruibo Chen, Georgios Milis +1 more
As large language models become increasingly capable and widely deployed, verifying the provenance of machine-generated content is critical to...
Gauri Kholkar, Ratinder Ahuja
As autonomous AI agents are used in regulated and safety-critical settings, organizations need effective ways to turn policy into enforceable...
Meet Udeshi, Venkata Sai Charan Putrevu, Prashanth Krishnamurthy +4 more
Security of software supply chains is necessary to ensure that software updates do not contain maliciously injected code or introduce vulnerabilities...
Shuyi Lin, Tian Lu, Zikai Wang +3 more
OpenAI's GPT-OSS family provides open-weight language models with explicit chain-of-thought (CoT) reasoning and a Harmony prompt format. We summarize...
Sihan Hu, Xiansheng Cai, Yuan Huang +5 more
Training large language models with Reinforcement Learning with Verifiable Rewards (RLVR) exhibits a set of distinctive and puzzling behaviors that...
Sherif Saad, Kevin Shi, Mohammed Mamun +1 more
Automated machine learning (AutoML) has emerged as a promising paradigm for automating machine learning (ML) pipeline design, broadening AI adoption....
Yuqiao Meng, Luoxi Tang, Feiyang Yu +4 more
Large language models (LLMs) are increasingly used to help security analysts manage the surge of cyber threats, automating tasks from vulnerability...
Luxuan Zhang, Douglas Jiang, Qinglong Wang +2 more
Large language models (LLMs) have shown strong ability in generating rich representations across domains such as natural language processing and...
Zeyu Shen, Basileal Imana, Tong Wu +3 more
Retrieval-Augmented Generation (RAG) enhances Large Language Models by grounding their outputs in external documents. These systems, however, remain...
Charles E. Gagnon, Steven H. H. Ding, Philippe Charland +1 more
Binary code similarity detection is a core task in reverse engineering. It supports malware analysis and vulnerability discovery by identifying...
Han Yan, Zheyuan Liu, Meng Jiang
With the rapid advancement of large language models, Machine Unlearning has emerged to address growing concerns around user privacy, copyright...
Xiaotian Zou
Multimodal Large Language Models (MLLMs) have transformed text-to-image workflows, allowing designers to create novel visual concepts with...
Jeongyeon Hwang, Sangdon Park, Jungseul Ok
Watermarking offers a promising solution for detecting LLM-generated content, yet its robustness under realistic query-free (black-box) evasion...
Xingyu Li, Juefei Pu, Yifan Wu +13 more
Open-source software projects are foundational to modern software ecosystems, with the Linux kernel standing out as a critical exemplar due to its...
Antreas Ioannou, Andreas Shiamishis, Nora Hollenstein +1 more
In an era dominated by Large Language Models (LLMs), understanding their capabilities and limitations, especially in high-stakes fields like law, is...
Nakyeong Yang, Dong-Kyum Kim, Jea Kwon +3 more
Large language models trained on web-scale data can memorize private or sensitive knowledge, raising significant privacy risks. Although some...
Haochen Gong, Chenxiao Li, Rui Chang +1 more
Large language model (LLM)-based computer-use agents represent a convergence of AI and OS capabilities, enabling natural language to control system-...
Jiayu Ding, Xinpeng Liu, Zhiyi Pan +2 more
Lifting 2D open-vocabulary understanding into 3D Gaussian Splatting (3DGS) scenes is a critical challenge. However, mainstream methods suffer from...
Lukas Twist, Jie M. Zhang, Mark Harman +1 more
Large language models (LLMs) are increasingly used to generate code, yet they continue to hallucinate, often inventing non-existent libraries. Such...
AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.
AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.
Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.
Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.