The hidden risk of one-size-fits-all AI advice

Why this changes everything

This research fundamentally challenges how we think about AI safety. The authors propose a new framework called “User Welfare Safety,” which asks whether AI-generated advice minimizes harm given an individual’s circumstances. It’s a shift from asking “what can this model do?” to “how does this model’s output affect specific people?”

The implications extend beyond academic interest. The EU’s Digital Services Act and AI Act increasingly require platforms to assess risks to individual well-being. If ChatGPT reaches the user threshold to be designated a Very Large Online Platform (it’s getting close at 41.3 million EU users), these vulnerability-stratified evaluations won’t just be nice to have. They’ll be legally required.

The researchers acknowledge that implementing this at scale presents massive challenges. It requires rich user context (raising privacy concerns) and access to real interaction data. But they’ve provided a methodological starting point, complete with code and datasets for others to build upon.

What happens next

This work reveals an uncomfortable reality: safety is relative, not absolute. A model that appears safe in benchmarks might be actively harmful to vulnerable populations in deployment. The gap between universal safety metrics and individual welfare isn’t just a measurement problem. It’s a fundamental challenge to how we build and deploy AI systems.

As millions turn to AI for personal advice about their money, health, and major life decisions, we need evaluation frameworks that reflect this reality. The current approach of testing for universal risks while ignoring personalized harms amounts to what some critics call “safety-washing”: models look safe on paper while posing real dangers to those who need help most.

The researchers have given us both a warning and a path forward. Now it’s up to AI companies, regulators, and the broader community to decide whether we’ll keep measuring what’s easy or start measuring what matters.