Benchmark MEDIUM relevance

REBAR: Reference Ethical Benchmark for Autonomy Readiness

Jonathan Diller David Barnes Rebekah Bogdanoff Rhett Collier Roddy Collins Keith Fieldhouse Yonatan Gefen Cameron Johnson Anuriha Kodali Brad Kriel Varun Murali James Niehaus Mish Sukharev Joseph VanPelt Anthony Hoogs Vijay Kumar Arslan Basharat

cs.RO cs.CY

Published

May 18, 2026

Updated

May 18, 2026

Links

PDF arxiv

Abstract

As autonomous systems grow more advanced, objective metrics to evaluate their ethical and legal compliance are critical for informing end users of their limitations and ensuring accountability of those who misuse them. Current ethical embodied AI frameworks remain mostly qualitative, focusing on system design (through safety guardrails or targeted red teaming), and the realized guardrails often directly disallow unsafe behavior without providing the user with an override or interpretable reason. Instead, there is a need for computable metrics through rigorous testing that allow a user to determine the applicability of the system to the task. To address this gap, we introduce the Reference Ethical Benchmark for Autonomy Readiness (REBAR), a quantitative test and evaluation framework for autonomous systems. REBAR maps operating metrics into a computable Autonomy Readiness Level (ARL) rubric that can quantify ethical performance. Key innovations of the framework include a neuro-symbolic Large Language Model (LLM) approach to calculate and explain the ethical difficulty of scenarios, LLM-driven at-scale generation of test instances, and a versatile, photorealistic simulation environment. By evaluating white-box autonomy solutions through this rigorous testing pipeline, REBAR delivers an objective and repeatable benchmark score, bridging the gap between abstract principles and verifiable, accountable autonomy.

Metadata

Comment: To be presented at the 2026 Workshop on Robot Ethics - Ethical, Legal and User Perspectives in Robotics and Automation (WOROBET)

Pro Analysis

Full threat analysis, ATLAS technique mapping, compliance impact assessment (ISO 42001, EU AI Act), and actionable recommendations are available with a Pro subscription.

Threat Deep-Dive

ATLAS Mapping

Compliance Reports

Actionable Recommendations

Start 14-Day Free Trial

Back to Research