Can We Hold AI Accountable? The Case for Verifying What Machines Are Allowed to Do

Published: September 8, 2025

The Need for AI Accountability

In January 2023, a TIME investigation revealed that Kenyan data workers labeling content for OpenAI’s ChatGPT were paid less than $2 per hour to review graphic and traumatic material. These workers, critical to the development of one of the world’s most influential AI systems, were largely invisible to the public and omitted from most narratives about AI innovation. Their story raises a foundational question: What does accountability mean in the age of artificial intelligence, and how do we ensure it is real, not rhetorical?

As AI systems become increasingly embedded in decisions that affect public welfare, education, healthcare, justice, and commerce, society must confront a central dilemma: machines do not bear responsibility. People do. But who, exactly? This article explores the growing movement to verify and regulate what AI systems are allowed to do — a movement necessary to preserve trust, equity, and safety in an AI-driven world.

The Accountability Crisis in AI

Artificial intelligence is no longer confined to science fiction or academic labs; it now influences nearly every facet of society. Yet as AI systems proliferate, accountability remains elusive. When a machine learning model generates biased outcomes or hallucinates false information, the lines of responsibility blur.

Companies often obscure the decision-making pipelines behind these models, masking not only the training data sources but also how content is curated. Even when flawed outputs emerge, explanations are vague, wrapped in technical jargon that distances developers from consequences.

As the Brookings Institution notes, this lack of transparency extends beyond algorithms to the people behind them. Many AI companies downplay the vast networks of human labor that enable these technologies to function. This invisibility shields both economic and ethical accountability.

Similarly, Bruce Schneier, a cybersecurity expert at Harvard’s Belfer Center, warns that users may emotionally trust AI systems as they would a friend. But these systems are neither neutral nor benevolent. They are optimized to collect personal data, shape behavior, and recommend outcomes in ways that may prioritize corporate interests over public well-being. When users misplace trust in AI, they open the door to manipulation — often without realizing it.

The accountability crisis is rooted not only in technical complexity but also in the lack of enforceable governance over those who develop and deploy AI. As a result, we are heading toward a future where powerful AI systems may operate without clear oversight, while the harm they cause is dismissed as unintended or untraceable.

What Is Formal Verification, and Why Does It Matter Now?

Formal verification, the process of mathematically proving that a system behaves as intended, has long been used in safety-critical fields such as aviation and medical devices. Researchers hoped it could also prove AI systems safe, ensuring, for example, that a DNA synthesis machine cannot be misused or a chatbot cannot spread misinformation.
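
To make the idea concrete, here is a toy sketch using the Z3 SMT solver (the article names no specific tool, so this choice is an assumption, and the approval rule and property below are invented for illustration). Instead of testing a handful of inputs, the solver searches for any input that violates the stated property; if none exists, the guarantee holds for every possible input.

```python
# Toy formal-verification sketch with the Z3 SMT solver (pip install z3-solver).
# The eligibility rule and safety property are illustrative, not from any real system.
from z3 import Ints, Solver, And, Implies, Not, unsat

income, debt = Ints("income debt")

# A made-up approval rule: approve when income is high and debt is low.
approve = And(income >= 50, debt <= 20)

# Property we want guaranteed: an approved applicant's debt never exceeds income.
safety_property = Implies(approve, debt <= income)

solver = Solver()
solver.add(Not(safety_property))   # search for any input that breaks the property

if solver.check() == unsat:
    print("Proven: no approved input can have debt greater than income.")
else:
    print("Counterexample found:", solver.model())
```

This kind of exhaustive guarantee is exactly what makes formal methods attractive, and exactly what becomes hard once the system is a large statistical model rather than a small, explicit rule.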

In practice, however, many critical AI threats are too complex to simulate with mathematical precision. Physical conditions shift, hardware can malfunction, and human use is unpredictable. As a result, formal verification is limited to narrow systems under controlled conditions — not large, open-ended AI applications.

What Happens When We Don’t Verify AI Systems

The cost of neglecting AI verification is already visible. For example, a 2016 ProPublica investigation found that the COMPAS recidivism-risk algorithm, used in U.S. courts, systematically rated Black defendants as higher risk than white defendants with similar profiles. The developers had no formal mechanism to identify or correct this racial bias.

Similarly, facial recognition tools have falsely identified individuals, leading to wrongful arrests. Predictive policing systems have reinforced discriminatory patterns. When AI fails, the damage often falls disproportionately on marginalized communities.

The World Economic Forum argues that public trust in AI is declining because many systems operate as “black boxes,” with little transparency about their inputs or rationale. Nearly half of U.S. respondents in a recent Pew Research Center survey expressed concern about AI’s growing presence.

What Should We Be Verifying in AI?

Effective AI verification addresses both what a system does and whether that behavior serves the people and contexts it affects. A model may achieve high overall accuracy yet still fail specific users, environments, or goals.

The first step is to assess functional correctness: does the system do what it was explicitly designed to do? But stopping there is insufficient. Verification should also evaluate fairness — whether the system produces equitable results across different groups. In practice, this means checking for patterns of discrimination or disparate impact.
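
As a rough illustration of what a disparate impact check can look like, the sketch below computes per-group selection rates on hypothetical audit data. The column names, the data, and the "four-fifths rule" threshold of 0.8 are illustrative assumptions; a real audit would use the deployed system's actual decision records and a fairness definition chosen for the context.

```python
import pandas as pd

# Hypothetical audit data: one row per decision, with the protected group
# and the model's outcome. Values are fabricated for illustration.
decisions = pd.DataFrame({
    "group":    ["A"] * 50 + ["B"] * 50,
    "approved": [1] * 35 + [0] * 15 + [1] * 20 + [0] * 30,
})

# Selection rate: the share of positive outcomes each group receives.
selection_rates = decisions.groupby("group")["approved"].mean()

# Disparate impact ratio: lowest selection rate divided by the highest.
# The "four-fifths rule" commonly flags ratios below 0.8 for closer review.
ratio = selection_rates.min() / selection_rates.max()

print(selection_rates.to_dict())
print(f"disparate impact ratio: {ratio:.2f}")
```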

Robustness is another essential dimension. AI systems must be able to handle variability and unexpected inputs without failure. Similarly, security verification ensures systems can resist manipulation, adversarial attacks, or misuse.
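
One simple way to probe robustness, sketched below, is to perturb an input with small random noise and check whether the prediction flips. The linear "model", the noise budget, and the number of trials are placeholder assumptions; serious robustness testing, such as adversarial evaluation, goes well beyond random perturbation.

```python
import numpy as np

# Robustness probe: does a classifier keep its prediction under small
# perturbations of the input? The "model" is a stand-in linear scorer.
rng = np.random.default_rng(42)
weights = np.array([1.5, -2.0, 0.7])

def predict(x):
    """Stand-in classifier: sign of a linear score."""
    return int(x @ weights > 0)

def is_stable(x, epsilon=0.05, trials=100):
    """Check that small random perturbations never flip the prediction."""
    base = predict(x)
    for _ in range(trials):
        noise = rng.uniform(-epsilon, epsilon, size=x.shape)
        if predict(x + noise) != base:
            return False
    return True

sample = np.array([0.2, -0.1, 0.4])
print("prediction stable under small perturbations:", is_stable(sample))
```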

Lastly, explainability is critical for both trust and accountability. Can we understand how decisions are made? Can users, auditors, or regulators meaningfully interrogate the system’s reasoning?
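
One basic auditing technique in this spirit is permutation importance: shuffle one input feature at a time and measure how much performance drops, which gives auditors a coarse picture of what the system actually relies on. The tiny model and data below are synthetic stand-ins for illustration, not the only or the definitive way to interrogate a model.

```python
import numpy as np

# Permutation-importance sketch. The "model" is a fixed linear classifier and
# the data are synthetic; a real audit would use the deployed system and a
# representative evaluation set.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))           # three input features
weights = np.array([2.0, 0.5, 0.0])     # feature 2 is irrelevant by design
y = (X @ weights + rng.normal(scale=0.1, size=500)) > 0

def model(features):
    """Stand-in for a deployed classifier."""
    return (features @ weights) > 0

baseline = np.mean(model(X) == y)

for j in range(X.shape[1]):
    X_shuffled = X.copy()
    X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])  # break feature j's link to the label
    drop = baseline - np.mean(model(X_shuffled) == y)
    print(f"feature {j}: accuracy drop when shuffled = {drop:.3f}")
```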

These elements reflect a broader shift in the AI field: from evaluating models solely on performance metrics to evaluating them on their alignment with social, ethical, and legal standards. As AI continues to influence high-stakes domains like healthcare, education, and law, these verification dimensions are prerequisites for responsible deployment.

What Accountability Could Look Like

Ensuring real accountability in AI demands reimagining how institutions operate and how regulation is enforced.

The Brookings Institution proposes internal guardrails for organizations, such as requiring human validation of AI outputs, transparent disclosures of AI involvement, and strict compliance with privacy and fairness policies. These measures mirror broader calls for algorithmic impact assessments to evaluate potential harm before a system goes live.
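
As a sketch of what one such guardrail might look like in practice, the snippet below routes low-confidence model outputs to a human reviewer before they take effect. The ModelOutput type, the 0.9 confidence threshold, and the routing logic are illustrative assumptions rather than any established standard.

```python
from dataclasses import dataclass

@dataclass
class ModelOutput:
    decision: str
    confidence: float   # model's own confidence estimate, 0.0 to 1.0

CONFIDENCE_THRESHOLD = 0.9  # assumption: below this, a person must sign off

def requires_human_review(output: ModelOutput) -> bool:
    """Flag outputs that must not be acted on without human validation."""
    return output.confidence < CONFIDENCE_THRESHOLD

for out in [ModelOutput("approve", 0.97), ModelOutput("deny", 0.62)]:
    route = "send to human reviewer" if requires_human_review(out) else "auto-apply (logged)"
    print(f"{out.decision} @ {out.confidence:.2f} -> {route}")
```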

Schneier pushes the argument further, advocating for government intervention to regulate the people and corporations behind AI. His recommendations include laws enforcing transparency, fiduciary roles for AI developers to protect user interests, and publicly developed AI models from non-profits or governments to ensure democratic accountability.

In practice, this means shifting responsibility from the abstract notion of “the algorithm” to human decision-makers. Accountability becomes not just an ethical goal, but a legal and operational standard.

Conclusion: If We Want Responsible AI, We Need Verifiable AI

Trust in AI must be designed, enforced, and earned — and verification is the foundation of that trust. In a landscape where opaque systems can shape elections, influence financial decisions, and affect social mobility, we cannot afford to treat trust as a byproduct.

This means asking not only whether an AI system functions as intended, but also whether it operates fairly, safely, and transparently in real-world contexts.

At UoPeople, we prepare students to meet these challenges. Our Computer Science curriculum goes beyond teaching code; it integrates ethical reasoning, social awareness, and critical analysis of technology’s impact. As students step into roles designing tomorrow’s systems, they carry a core principle: AI without verification is AI without accountability.

And accountability is non-negotiable. Without it, trust breaks down — and with it, the potential for AI to serve society equitably and responsibly.

Dr. Alexander Tuzhilin currently serves as Professor of Information Systems at New York University (NYU) and Chair of the Department of Information, Operations and Management Sciences at the Stern School of Business.