SRI Seminar Series: Roger Grosse, “On the origin of rogue AI”

Wednesday, October 16, 2024
12:30 PM to 2:00 PM

Rotman School of Management, Room LL1065, 95 St. George St., Toronto, Canada


Our weekly SRI Seminar Series welcomes Roger Grosse for a special in-person talk that will also be broadcast online. Grosse is an associate professor of computer science at the University of Toronto, a Schwartz Reisman Chair in Technology and Society, and a founding member of the Vector Institute. Grosse’s research focuses on better understanding neural net training dynamics, with his current work exploring how understandings of deep learning can be applied to generate safe and aligned AI systems.

In this special in-person lecture, Grosse will articulate the underlying model of how LLMs or agents built on top of them could spontaneously “go rogue.” This session will be moderated by Sheila McIlraith.

Talk title:

“On the origin of rogue AI”

Abstract:

One of the most concerning scenarios for future AI systems is that the AI autonomously carries out a malign plan not intended by any human. But how could this happen? Classical arguments for catastrophic AI risk were made in terms of idealized long-horizon planning agents which seemingly bear little relationship to current-day large language models (LLMs). In this talk, I’ll try to articulate the underlying model of how LLMs or agents built on top of them could spontaneously “go rogue.” I’ll argue that LLM pre-training, by making complex behaviours more compressible, creates smoother fitness landscapes for evolutionary searches. Such evolutionary searches could lead to tendencies such as reward hacking, consequentialism, and punishment. If this hypothesis is correct, then continued scaling of LLMs will enable a variety of catastrophic risk pathways which, up to now, have been limited to philosophical thought experiments.

Venue:

Rotman School of Management, University of Toronto, Room LL1065.

Entrance: 95 St. George Street, Toronto, ON M5S 3E6

Seminar will be broadcast live via Zoom (register for link).

About Roger Grosse

Roger Grosse is an associate professor of computer science at the University of Toronto, a Schwartz Reisman Chair in Technology and Society, and a founding member of the Vector Institute, where he is a Canada CIFAR AI Chair. Grosse is also a member of technical staff on the Alignment Science Team at Anthropic, and has been awarded a Sloan Fellowship and a Canada Research Chair. Grosse’s research focuses on better understanding neural net training dynamics and on using this understanding to improve training speed, generalization, uncertainty estimation, and automatic hyperparameter tuning. His current research seeks to apply understandings of deep learning to AI alignment.

Grosse received a BS in symbolic systems from Stanford in 2008, an MS in computer science from Stanford in 2009, and a PhD in computer science from MIT in 2014, studying under Bill Freeman and Josh Tenenbaum. From 2014 to 2016, Grosse was a postdoctoral researcher at the University of Toronto, working with Ruslan Salakhutdinov. Along with Colorado Reed, he created Metacademy, a website that uses a dependency graph of concepts to create personalized learning plans for machine learning and related fields.

About the SRI Seminar Series

The SRI Seminar Series brings together the Schwartz Reisman community and beyond for a robust exchange of ideas that advance scholarship at the intersection of technology and society. Each seminar is led by an established or emerging scholar and features extensive discussion.

Each week, a featured speaker will present for 45 minutes, followed by an open discussion. Registered attendees will be emailed a Zoom link before the event begins. The event will be recorded and posted online.

