Dual-Space Smoothness for Robust and Balanced LLM Unlearning
Han Yan, Zheyuan Liu, Meng Jiang
With the rapid advancement of large language models, Machine Unlearning has emerged to address growing concerns around user privacy, copyright...
AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.
Jeongyeon Hwang, Sangdon Park, Jungseul Ok
Watermarking offers a promising solution for detecting LLM-generated content, yet its robustness under realistic query-free (black-box) evasion...
Xingyu Li, Juefei Pu, Yifan Wu +13 more
Open-source software projects are foundational to modern software ecosystems, with the Linux kernel standing out as a critical exemplar due to its...
David Benfield, Stefano Coniglio, Phan Tu Vuong +1 more
Adversarial machine learning concerns situations in which learners face attacks from active adversaries. Such scenarios arise in applications such as...
Miao Yu, Zhenhong Zhou, Moayad Aloqaily +5 more
Fine-tuned Large Language Models (LLMs) are vulnerable to backdoor attacks through data poisoning, yet the internal mechanisms governing these...
Jiahao Huo, Shuliang Liu, Bin Wang +5 more
Semantic-level watermarking (SWM) for large language models (LLMs) enhances watermarking robustness against text modifications and paraphrasing...
Anh Tu Ngo, Anupam Chattopadhyay, Subhamoy Maitra
In this paper we show that cryptographic backdoors in a neural network (NN) can be highly effective in two directions, namely mounting the attacks as...
Xiaofan Li, Xing Gao
In recent years, various software supply chain (SSC) attacks have posed significant risks to the global community. Severe consequences may arise if...
AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.
AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.
Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.
Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.