Benchmark MEDIUM
Runze Cui, Fangxin Shang, Yehui Yang +2 more
Document understanding is a critical capability in financial credit review, onboarding, and remote verification, where both decision accuracy and...
2 weeks ago cs.CV cs.CE cs.MM
Benchmark MEDIUM
Yuanfan Li, Qi Zhou, Chengzhengxu Li +5 more
We present MGTEVAL, an extensible platform for systematic evaluation of Machine-Generated Text (MGT) detectors. Despite rapid progress in MGT...
2 weeks ago cs.CR cs.CL
Defense MEDIUM
Ravikumar Balakrishnan, Sanket Mendapara
Typographic prompt injection exploits vision language models' (VLMs) ability to read text rendered in images, posing a growing threat as VLMs power...
Attack HIGH
Miles Q. Li, Benjamin C. M. Fung, Boyang Li +2 more
Existing white-box jailbreak attacks against aligned LLMs typically append discrete adversarial suffixes to the user prompt, which visibly alters the...
Attack HIGH
Allen Jue
Learned index structures achieve high performance by modeling the cumulative distribution function (CDF) of keys, but this reliance on data...
2 weeks ago cs.CR cs.DB
Survey MEDIUM
Xiaohang Yu, Hejia Geng, William Knottenbelt
Agentic systems increasingly act with user secrets for APIs, messaging platforms, and cloud services. Today's bearer-secret interfaces implement...
2 weeks ago cs.CR cs.AI
Benchmark MEDIUM
Aaron J. Li, Nicolas Sanchez, Hao Huang +8 more
Large language models (LLMs) are increasingly deployed, yet their outputs can be highly sensitive to routine, non-adversarial variation in how users...
2 weeks ago cs.CL cs.AI
Benchmark LOW
German Marin, Jatin Chaudhary
Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift without...
Benchmark MEDIUM
Qi Li, Jiu Li, Pingtao Wei +8 more
This report presents a comparative evaluation of DKnownAI Guard in AI agent security scenarios, benchmarked against three competing products: AWS...
2 weeks ago cs.CR cs.AI
Defense MEDIUM
Nay Myat Min, Long H. Pham, Jun Sun
Large language models deployed at runtime can misbehave in ways that clean-data validation cannot anticipate: training-time backdoors lie dormant...
2 weeks ago cs.CR cs.AI cs.CL
Benchmark MEDIUM
Pablo Mateo-Torrejón, Alfonso Sánchez-Macián
The rapid integration of Large Language Models (LLMs) into Multi-Agent Systems (MAS) has significantly enhanced their collaborative problem-solving...
2 weeks ago cs.CR cs.AI cs.MA
Survey MEDIUM
Zihan Liu, Yizhen Wang, Rui Wang +2 more
Fine-tuning unlocks large language models (LLMs) for specialized applications, but its high computational cost often puts it out of reach for...
2 weeks ago cs.CR cs.CL cs.DC
Attack MEDIUM
Mengnan Zhao, Lihe Zhang, Tianhang Zheng +2 more
Fast Adversarial Training (FAT) has attracted significant attention due to its efficiency in enhancing neural network robustness against adversarial...
2 weeks ago cs.LG cs.AI cs.CR
Tool LOW
Zheng Wu, Yi Hua, Zhaoyuan Huang +8 more
The evolution of Multimodal Large Language Models (MLLMs) has shifted the focus from text generation to active behavioral execution, particularly via...
Benchmark MEDIUM
Zijun Feng, Yuming Feng, Yu Wang +4 more
Cross-chain bridges, the critical infrastructure of the multi-chain ecosystem, have become a primary target for attackers, resulting in over $2.8...
Attack MEDIUM
Mengnan Zhao, Lihe Zhang, Bo Wang +3 more
Fast Adversarial Training (FAT) has proven effective in enhancing model robustness by encouraging networks to learn perturbation-invariant...
2 weeks ago cs.LG cs.CR
Benchmark MEDIUM
Víctor Mayoral-Vilches, María Sanz-Gómez, Francesco Balassone +6 more
As LLM-driven agents advance in cybersecurity, Jeopardy CTF benchmarks are approaching saturation and cyber ranges, the natural next evaluation...
Defense MEDIUM
Kaisheng Fan, Weizhe Zhang, Yishu Gao +2 more
Defending against backdoor attacks in large language models remains a critical practical challenge. Existing defenses mitigate these threats but...
2 weeks ago cs.CR cs.AI
Attack HIGH
Zonghao Ying, Haozheng Wang, Jiangfan Liu +5 more
Large Language Model (LLM) agents are increasingly used to automate complex workflows, but integrating untrusted external data with privileged...
Tool LOW
Jeffrey Wong, Antoine Creux
Create an idea, prototype it, evaluate if users like it, then learn. It is the circle of business. If AI can operate in all parts of the circle, it...
2 weeks ago cs.SE cs.MS stat.AP