The widespread adoption of Vision Transformers (ViTs) elevates supply-chain risk on third-party model hubs, where an adversary can implant backdoors...
Sidahmed Benabderrahmane, Petko Valtchev, James Cheney +1 more
Detecting rare and diverse anomalies in highly imbalanced datasets, such as Advanced Persistent Threats (APTs) in cybersecurity, remains a fundamental...
Modern artificial intelligence (AI) models are deployed on inference engines to optimize runtime efficiency and resource allocation, particularly for...
Federated self-supervised learning (FSSL) enables collaborative training of self-supervised representation models without sharing raw unlabeled data....
Ali Mahdavi, Santa Aghapour, Azadeh Zamanifar +1 more
Existing Byzantine-robust aggregation mechanisms typically rely on full-dimensional gradient comparisons or pairwise distance computations, resulting...
Vision-Language-Action (VLA) models close the perception-action loop by translating multimodal instructions into executable behaviors, but this very...
Multimodal Large Language Models (MLLMs) have shown remarkable proficiency on general-purpose vision-language benchmarks, reaching or even exceeding...