The Association for the Advancement of Artificial Intelligence (AAAI) 2025 workshop on Datasets and Evaluators for AI Safety, held on 3rd March 2025, provided a forum to discuss the challenges and strategies in AI safety. Drawing on contributions from leading experts across academia, industry, and policy, the discussions revealed that advancing AI safety is as much about creating ethical, adaptive, and collaborative governance mechanisms as it is about embracing rigorous technical benchmarks.
Establishing robust standards for low-risk AI
The workshop highlighted the need for standardisation and reliable benchmarks to reduce AI risks. In his talk, “Building an Ecosystem for Reliable, Low-Risk AI,” Dr Peter Mattson, President of MLCommons, outlined the role of standardised safety benchmarks. Frameworks like the AILuminate Benchmark, which evaluates AI systems across 12 distinct hazard categories spanning physical, non-physical, and contextual risks, have sparked a conversation on making safety evaluations robust and accessible. Such benchmarks provide a quantitative measure of risks (from privacy and intellectual property challenges to non-physical harms) and are a foundation for industry-wide standards.
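To make the idea of per-hazard evaluation concrete, the sketch below tallies hypothetical safe/unsafe judgements by hazard category and reports a safe-response rate for each. The category names, record format, and scoring here are our own illustrative assumptions, not AILuminate’s actual grading scheme.

```python
from collections import defaultdict

# Hypothetical evaluation records in the spirit of a per-hazard safety
# benchmark: each entry pairs a hazard category with whether the system's
# response was judged safe. Category names are illustrative only.
results = [
    ("privacy", True), ("privacy", False),
    ("intellectual_property", True),
    ("hate_speech", True), ("hate_speech", True),
]

def summarise(records):
    """Compute the safe-response rate per hazard category."""
    tallies = defaultdict(lambda: [0, 0])  # hazard -> [safe_count, total]
    for hazard, safe in records:
        tallies[hazard][0] += int(safe)
        tallies[hazard][1] += 1
    return {hazard: safe / total for hazard, (safe, total) in tallies.items()}

for hazard, rate in summarise(results).items():
    print(f"{hazard}: {rate:.0%} safe responses")
```

Reporting per-category rates rather than a single aggregate score is what lets a benchmark surface the specific hazards (privacy, in this toy example) where a system underperforms.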
The workshop also showcased metadata tools such as the Croissant vocabulary, which standardises how ML datasets are described in order to improve reproducibility and transparency. These initiatives address a core barrier to verifying and replicating AI results by ensuring that datasets are FAIR (findable, accessible, interoperable, and reusable). Ultimately, the workshop showed that greater technical rigour in evaluation makes for safer and more reliable AI deployments.
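To give a flavour of what such metadata looks like, here is a minimal Croissant-style record expressed as a Python dictionary. The field names follow the published Croissant examples (schema.org vocabulary plus “cr:” Croissant extensions), but the context is abbreviated, and the dataset, URLs, and file listed are placeholders; treat this as a sketch rather than a complete, validated record.

```python
import json

# A minimal, illustrative Croissant-style metadata record for a fictional
# dataset. The "@context" is abbreviated and the URLs are placeholders,
# so this is a sketch of the format, not a spec-complete file.
croissant_metadata = {
    "@context": {
        "@vocab": "https://schema.org/",
        "cr": "http://mlcommons.org/croissant/",
    },
    "@type": "Dataset",
    "name": "safety_eval_prompts",
    "description": "Prompts and reference labels for AI safety evaluation.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "url": "https://example.org/datasets/safety_eval_prompts",
    "distribution": [
        {
            "@type": "cr:FileObject",
            "@id": "prompts.csv",
            "contentUrl": "https://example.org/data/prompts.csv",
            "encodingFormat": "text/csv",
        }
    ],
}

print(json.dumps(croissant_metadata, indent=2))
```

Because the record is machine-readable JSON-LD, tooling can find, validate, and load the dataset without bespoke glue code, which is what makes the FAIR properties above checkable in practice.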
Balancing technical rigour with ethical considerations
Another significant insight from the workshop was the call to integrate ethical governance into technical evaluations. Participants highlighted that technical benchmarks must be complemented by sociotechnical frameworks that address fairness, inclusivity, and AI’s broader societal impacts today. For instance, a presentation by Prof Virginia Dignum explored the tension between long-term speculative risks and immediate ethical challenges. Prof Dignum argued that an excessive focus on future harms risks detracting from present-day issues such as bias and fairness.
Panel discussions underscored that effective AI safety requires both robust technical metrics and frameworks for social responsibility to tackle complex real-world environments. Without the latter, AI systems “in the wild” are more likely to perpetuate unforeseen biases, or worse, when they encounter safety challenges not anticipated during model training.
Adapting evaluation methodologies in a dynamic landscape
The rapidly evolving nature of AI technologies calls for a dynamic approach to safety evaluations. The workshop repeatedly illustrated that safety frameworks must be adaptive: what is deemed safe today may not hold as AI systems evolve in unexpected ways. Panellists agreed that traditional, static evaluation methods are often ill-equipped to handle the nuances of modern AI applications.
Emergent harms, manifesting as indirect or second-order effects, challenge conventional safety paradigms. Practitioners from multiple disciplinary backgrounds emphasised that AI safety metrics require continuous re-evaluation and adjustment as models scale and operate in complex environments. This interdisciplinary call to action reminds us that AI safety evaluations must be as flexible and dynamic as the technologies they are designed to safeguard.
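One way to make “continuous re-evaluation” concrete is to treat safety scores as something to diff between evaluation rounds, flagging hazard categories that regress after a model update. The function below is a minimal sketch under that assumption; the metric names, tolerance, and workflow are illustrative, not a procedure endorsed by the workshop.

```python
# A minimal sketch of treating safety evaluation as a recurring check rather
# than a one-off gate. Names and the tolerance are illustrative assumptions.

def flag_regressions(previous: dict[str, float],
                     current: dict[str, float],
                     tolerance: float = 0.02) -> list[str]:
    """Return hazard categories whose safe-response rate dropped by more
    than `tolerance` since the previous evaluation round."""
    return [
        hazard
        for hazard, rate in current.items()
        if rate < previous.get(hazard, 1.0) - tolerance
    ]

# Example: re-running the same evaluation suite after a model update.
round_1 = {"privacy": 0.97, "hate_speech": 0.99}
round_2 = {"privacy": 0.91, "hate_speech": 0.99}
print(flag_regressions(round_1, round_2))  # ['privacy']
```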
Creating collaborative and interdisciplinary approaches
Underlying the discussions at AAAI 2025 was the recognition that AI safety is an inherently interdisciplinary challenge. From data scientists and software engineers to policymakers and ethicists, the workshop brought together diverse experts, each with their own priorities for AI safety. During the panel discussions, for instance, some experts advocated for technical measures such as user-customisable safety filters and automated source-referencing, while others stressed governance priorities such as a broad citizen-first approach and frameworks to safeguard vulnerable groups.
Integrating these viewpoints is critical for developing safety frameworks that are both technically sound and socially responsive, and collaborative discussions of this kind help to bridge the gap between theoretical research and practical implementation. Unsurprisingly, there was broad consensus that some form of collective action will drive the evolution of safer, more reliable AI systems. Initiatives such as the Croissant vocabulary and the ongoing work on standardised benchmarks (e.g., AILuminate) capture this spirit of cooperation, given that both rely on cross-sector engagement to succeed.
Conclusion
Ultimately, two critical themes ran through the AAAI 2025 Datasets and Evaluators for AI Safety workshop: the need for adaptive evaluation frameworks that can handle real-world complexity, and ongoing interdisciplinary discussion to ensure that our notion of “risk” is as robust as possible. Many points were discussed throughout the day, but these themes are particularly important for ongoing work such as MLCommons’ AI Risk and Reliability initiative. By the end of the event, we felt a desire for more interdisciplinary workshops like this one. The common theme of adaptability tells us that it will be critical for practitioners to stay abreast of AI safety responses to the fast-moving developments we continue to see across the ecosystem.