AI agents pose new governance challenges
How do we successfully govern AI systems that can act autonomously online, making decisions with minimal human oversight? SRI Faculty Affiliate Noam Kolt explores this challenge, highlighting the rise of AI agents, their risks, and the urgent need for transparency, safety testing, and regulatory oversight.
Imagine waking up to find your AI assistant has rescheduled your doctor’s appointment, transferred money between your accounts, or sent an email on your behalf—all without your explicit approval. Now, take that a step further: what if an AI agent, designed to help with cybersecurity, inadvertently locks you out of your accounts? Or if a company’s AI tool autonomously signs legally binding contracts without a human ever reviewing them? These are not just hypothetical scenarios. As AI agents become more capable of acting on our behalf in digital environments, the risks of misuse, security vulnerabilities, and unforeseen consequences are growing. How do we ensure these tools remain safe and accountable?
In a recent article for Lawfare, SRI Faculty Affiliate Noam Kolt, an assistant professor at the Hebrew University of Jerusalem’s Faculty of Law and School of Computer Science and Engineering, explores the governance challenges posed by these systems and outlines the urgent need for better oversight and safety mechanisms.
A graduate of the University of Toronto’s Faculty of Law, where he held a Schwartz Reisman Institute fellowship, Kolt served as a research advisor to Google DeepMind and was a member of OpenAI’s GPT-4 red team. He now leads the Governance of AI Lab (GOAL) at the Hebrew University—a cross-disciplinary research group developing institutional and technical infrastructure to support safe and ethical AI, integrating methods from law, computer science, and the social sciences. His work has been published widely, including in the Washington University Law Review, Notre Dame Law Review, and Science, and at conferences such as NeurIPS, ACM FAccT, and AIES.
Risks and governance challenges
While AI agents offer immense potential—automating administrative tasks, assisting in research, and even accelerating scientific discovery—Kolt highlights significant risks. The ability of AI agents to take actions independently introduces new concerns, including heightened cybersecurity threats, fraud, and the potential for users to lose control over their agents’ behavior. As Kolt explains, such risks “differ from the concerns associated with ordinary content-producing language models, and stem from the distinct features of agents: their ability to take actions in pursuit of goals.”
Addressing these challenges is complicated by the rapid pace of AI development and a lack of publicly available data on how these agents are built, tested, and deployed. Recognizing this gap, Kolt and Stephen Casper, a PhD student at MIT, recently co-led a team of researchers in creating The AI Agent Index, the first public database cataloging 67 AI agents and their technical, safety, and policy-relevant features. This work sheds light on an industry where transparency is often lacking.
Findings from The AI Agent Index
One of the study’s most striking findings is that while many AI agent developers release extensive technical documentation, very few provide information on safety testing. Fewer than 20 percent of AI agent developers disclose formal safety policies, and fewer than 10 percent report conducting external safety evaluations. Kolt and his collaborators argue that this lack of oversight is a red flag, emphasizing the need for systematic testing and robust governance mechanisms to ensure AI agents act safely and ethically.
The research also identifies key trends in AI agent development:
75 percent of AI agents specialize in using computers for general tasks or software engineering.
67 percent of AI agent developers are based in the United States.
73 percent of AI agent developers come from industry rather than academia.
Multi-agent risks and the growing complexity of AI systems
Kolt’s insights into the governance of single AI agents intersect with growing concerns about multi-agent systems—AI systems composed of multiple interacting agents. A new report from the Cooperative AI Foundation, Multi-Agent Risks from Advanced AI, outlines how multiple AI agents deployed in high-stakes environments—ranging from finance to military decision-making—raise distinct risks that go beyond those posed by a single agent operating in isolation. These risks include miscoordination (where agents fail to cooperate even when they share goals), conflict (where agents actively work against each other), and collusion (where agents engage in undesirable cooperation, such as price-fixing in markets). The report’s co-authors include SRI Research Lead Nisarg Shah of U of T’s Department of Computer Science and Faculty Affiliate Gillian Hadfield, inaugural Schwartz Reisman Chair in Technology and Society from 2019 to 2024.
The report identifies several key risk factors, including emergent agency (where AI agents develop unexpected behaviors), selection pressures (which may lead to increasingly aggressive or deceptive AI strategies), and new security vulnerabilities that arise specifically in multi-agent contexts. The report also outlines how these multi-agent risks impact existing concerns related to AI safety, ethics, and governance. As AI agents become more widespread and increasingly tasked with high-stakes decision-making, these multi-agent risks will require urgent attention from researchers and policymakers.
Toward a safer AI future
Kolt outlines several potential governance strategies for mitigating risks associated with AI agents, including requiring human approval for certain actions, monitoring agent behavior in real time, and creating unique identifiers to track agents’ activities. Borrowing principles from internet governance, computer scientists have suggested that oversight mechanisms should focus on three core functions: attributing actions to specific AI agents, shaping how AI agents interact with each other, and detecting and mitigating harmful actions.
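To make two of these functions concrete, the minimal sketch below illustrates what attributing actions to a unique agent identifier and gating high-impact actions behind human approval might look like in practice. It is an illustrative assumption only, not drawn from Kolt’s article or from any existing system: the action names, approval rule, and audit log are hypothetical.

```python
# Hypothetical illustration: attributing agent actions to a unique identifier
# and requiring human approval for designated high-impact actions.
# Names, categories, and thresholds are assumptions, not an established API.
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Actions assumed to need explicit human sign-off before execution.
HIGH_IMPACT_ACTIONS = {"transfer_funds", "sign_contract", "delete_account"}

@dataclass
class AgentAction:
    agent_id: str      # unique identifier, enabling attribution of actions
    action_type: str   # e.g. "send_email", "transfer_funds"
    payload: dict
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def requires_approval(action: AgentAction) -> bool:
    """Flag actions that should not run without explicit human approval."""
    return action.action_type in HIGH_IMPACT_ACTIONS

def execute(action: AgentAction, human_approved: bool, audit_log: list) -> bool:
    """Run an action only if it is low-impact or a human has approved it.

    Every attempt is logged against the agent's identifier so behavior can
    be monitored and later attributed to a specific agent."""
    allowed = human_approved or not requires_approval(action)
    audit_log.append({"action": action, "allowed": allowed})
    return allowed

# Usage: an agent with a persistent ID attempts a high-impact action.
log: list = []
agent_id = str(uuid.uuid4())
action = AgentAction(agent_id, "transfer_funds", {"amount": 500})
print(execute(action, human_approved=False, audit_log=log))  # False: blocked
print(execute(action, human_approved=True, audit_log=log))   # True: allowed
```

In this sketch, every attempted action is recorded against the agent’s identifier, supporting the attribution and monitoring functions described above, while designated high-impact actions are blocked until a human approves them.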
However, technical solutions alone are not enough. Kolt emphasizes the legal complexities surrounding AI agents, raising critical questions about liability, contract law, and regulatory oversight. Will the actions of AI agents legally bind their users? Who is responsible when an AI agent causes harm—the developer, the user, or an intermediary? How will laws like the EU AI Act shape the governance landscape?
The governance of AI agents is an interdisciplinary endeavor, requiring collaboration between computer scientists, legal scholars, and policymakers. As Kolt notes, “Progress in technical governance requires legal knowledge, while effective legal frameworks require technical expertise.” With AI agents rapidly integrating into everyday digital interactions, now is the time to ensure they operate safely, ethically, and legally.
"As AI agents become more capable and autonomous, we are at a critical juncture—decisions made now about governance and oversight could shape how these systems integrate into society for years to come," observes Kolt.
Kolt’s research is an important contribution to the ongoing conversation about AI governance and exemplifies the Schwartz Reisman Institute’s commitment to interdisciplinary engagement on the societal impacts of AI. Read his full article on Lawfare.