Defense MEDIUM relevance

SafeMed-R1: Clinician-Audited Safety and Ethics Alignment for Medical Large Language Models

Chao Ding Mouxiao Bian Tianbin Li Minjia Yuan Yidong Jiang Yankai Jiang Jinru Ding Jiayuan Chen Zhuangzhi Gao Pengcheng Chen Zhao He Rongzhao Zhang Meiling Liu Luyi Jiang Jie Xu

cs.AI

Published

May 27, 2026

Updated

May 27, 2026

Links

PDF arxiv

Abstract

Large language models(LLMs) increasingly match expert performance on licensing examinations, yet routine clinical use remains limited because governance requires auditable reasoning, safety and ethics alignment, and resilience to adversarial misuse. Here we present SafeMed-R1, trained with a traceable Clinical Trust Signals(CTS) pipeline that links each reasoning instance to clinician rubric scores and edit histories, and aligned through safety and ethics supervision and red team stress testing. SafeMed-R1 attains a macro-averaged accuracy of 79.6% across clinical benchmarks. Under adversarial safety testing, it shows the lowest aggregated risk and reduces unsafe outputs by about 3 to 5% relative to its baseline. In a paired expert study of 30 medication safety vignettes, SafeMed-R1 matches PGY1 and PGY2 residents on medical correctness and scores higher for medication safety, guideline consistency, and clinical usefulness. Collectively, these results suggest that clinician-audited supervision provenance, together with domain-tailored safety and ethics alignment, can strengthen governance-relevant evidence without relying on inference-time retrieval or citation grounding.

Pro Analysis

Full threat analysis, ATLAS technique mapping, compliance impact assessment (ISO 42001, EU AI Act), and actionable recommendations are available with a Pro subscription.

Threat Deep-Dive

ATLAS Mapping

Compliance Reports

Actionable Recommendations

Start 14-Day Free Trial

Back to Research