Backdooring Bias in Large Language Models
model, with an adversary targeting the model builder's LLM. However, in the bias manipulation setting, the model builder themselves could be the adversary, warranting a white-box threat model
Zero-Knowledge Federated Learning with Lattice-Based Hybrid Encryption for Quantum-Resilient Medical AI
models across hospitals without centralizing patient data. However, the exchange of model updates exposes critical vulnerabilities: gradient inversion attacks can reconstruct patient information, Byzantine clients can poison the global model
Collaborative penetration testing suite for emerging generative AI algorithms
Problem Space: AI Vulnerabilities and Quantum Threats Generative AI vulnerabilities: model inversion, data poisoning, adversarial inputs. Quantum threats Shor Algorithm breaking RSA ECC encryption. Challenge Secure generative AI models against
Inference-Time Backdoors via Hidden Instructions in LLM Chat Templates
model processing. We show that an adversary who distributes a model with a maliciously modified template can implant an inference-time backdoor without modifying model weights, poisoning training data
Data Poisoning Vulnerabilities Across Healthcare AI Architectures: A Security Threat Analysis
analyses needed for detection. Supply chain weaknesses allow a single compromised vendor to poison models across 50 to 200 institutions. The Medical Scribe Sybil scenario shows how coordinated fake patient
Who Speaks for the Trigger? Dynamic Expert Routing in Backdoored Mixture-of-Experts Transformers
these targeted experts. Unlike traditional backdoor attacks that rely on superficial data poisoning or model editing, BadSwitch primarily embeds malicious triggers into expert routing paths with strong task affinity, enabling
Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs
oriented assistants. This growing ecosystem also raises supply-chain risks, where adversaries can distribute poisoned models that degrade downstream reliability and user trust. Existing backdoor attacks and defenses are largely
Future Mining: Learning for Safety and Security
emerging cyber physical threats such as backdoor triggers, sensor spoofing, label flipping attacks, and poisoned model updates further jeopardize operational safety as mines adopt autonomous vehicles, humanoid assistance, and federated
Scalable Hierarchical AI-Blockchain Framework for Real-Time Anomaly Detection in Large-Scale Autonomous Vehicle Networks
different attack types, such as sensor spoofing, jamming, and adversarial model poisoning, are conducted to test the scalability and resiliency of HAVEN. Experimental findings show sub-10 ms detection latency
Concept-Guided Backdoor Attack on Vision Language Models
first, Concept-Thresholding Poisoning (CTP), uses explicit concepts in natural images as triggers: only samples containing the target concept are poisoned, causing the model to behave normally in all other
Serverless AI Security: Attack Surface Analysis and Runtime Protection Mechanisms for FaaS-Based Machine Learning
characterize the attack surface across five categories: function-level vulnerabilities (cold start exploitation, dependency poisoning), model-specific threats (API-based extraction, adversarial inputs), infrastructure attacks (cross-function contamination, privilege escalation
Web Technologies Security in the AI Era: A Survey of CDN-Enhanced Defenses
mitigate while reducing data movement and enhancing compliance, yet introduces new risks around model abuse, poisoning, and governance
Secure Retrieval-Augmented Generation against Poisoning Attacks
Large language models (LLMs) have transformed natural language processing (NLP), enabling applications from content generation to decision support. Retrieval-Augmented Generation (RAG) improves LLMs by incorporating external knowledge but also
MemoryGraft: Persistent Compromise of LLM Agents via Poisoned Experience Retrieval
Large Language Model (LLM) agents increasingly rely on long-term memory and Retrieval-Augmented Generation (RAG) to persist experiences and refine future performance. While this experience learning capability enhances agentic
FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning
learning (FL) enables collaborative model training while preserving data privacy. However, it remains vulnerable to malicious clients who compromise model integrity through Byzantine attacks, data poisoning, or adaptive adversarial behaviors
From Theory to Practice: Evaluating Data Poisoning Attacks and Defenses in In-Context Learning on Social Media Health Discourse
This study explored how in-context learning (ICL) in large language models can be disrupted by data poisoning attacks in the setting of public health sentiment analysis. Using tweets
Cisco Integrated AI Security and Safety Framework Report
threats now span content safety failures (e.g., harmful or deceptive outputs), model and data integrity compromise (e.g., poisoning, supply-chain tampering), runtime manipulations (e.g., prompt injection, tool and agent misuse
SCOUT: A Defense Against Data Poisoning Attacks in Fine-Tuned Language Models
Backdoor attacks create significant security threats to language models by
Stealthy Poisoning Attacks Bypass Defenses in Regression Settings
natural and physical sciences, yet their robustness to poisoning has received less attention. When it has, studies often assume unrealistic threat models and are thus less useful in practice
State Backdoor: Towards Stealthy Real-world Poisoning Attack on Vision-Language-Action Model in State Space
Vision-Language-Action (VLA) models are widely deployed in safety