Scaling Up AI for Success

Decentralized approach to multi-agent reinforcement learning yields promising results, new research shows

Donna McKinnon - 24 February 2022

Matthew E. Taylor, Fellow and Canada CIFAR AI Chair at Amii and associate professor in the Department of Computing Science in the University of Alberta’s Faculty of Science.

Reinforcement learning, a subset of artificial intelligence research in which an agent makes observations and takes actions based on a system of reward and punishment, is seeing growing success in the real world. From game theory to robotics, reinforcement learning is teaching machines to optimize their actions and tackle complex problems that conventional methods cannot solve.
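As a rough illustration of that observe-act-reward loop, here is a minimal sketch in Python. The toy Environment and its reward scheme are invented for this example; real work would use an actual simulator, for instance one exposing a Gym-style reset/step interface.

```python
import random

# Minimal sketch of the observe-act-reward loop. The Environment below
# is a toy stand-in for a real simulator.

class Environment:
    def reset(self):
        return 0  # initial observation

    def step(self, action):
        # "Reward" action 1, "punish" action 0; end episodes at random.
        reward = 1.0 if action == 1 else -1.0
        observation = 0
        done = random.random() < 0.1
        return observation, reward, done

values = {0: 0.0, 1: 0.0}  # tabular value estimate per action
env = Environment()
obs = env.reset()
for _ in range(1000):
    # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
    if random.random() < 0.1:
        action = random.choice([0, 1])
    else:
        action = max(values, key=values.get)
    obs, reward, done = env.step(action)
    values[action] += 0.1 * (reward - values[action])  # learn from reward
    if done:
        obs = env.reset()

print(values)  # the estimate for action 1 converges near +1, action 0 near -1
```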

The University of Alberta is a leader in reinforcement learning, says Matthew E. Taylor, Fellow and Canada CIFAR AI Chair at Amii and associate professor in the Department of Computing Science in the University of Alberta’s Faculty of Science, but when multiple agents are involved, new approaches must be developed. Taylor, along with lead researcher Sriram Ganapathi Subramanian (University of Waterloo) and two other UWaterloo researchers, Mark Crowley and Pascal Poupart, is studying how multi-agent reinforcement learning can be scaled to solve much larger problems using a decentralized mean field approach.

“A lot of this work is in video games, which is fun,” says Taylor. “But we want to make algorithms that are practical and useful.”

The problem, Taylor explains, is that multi-agent reinforcement learning systems scale poorly as the number of agents grows. In their paper, Decentralized Mean Field Games, Taylor and Subramanian use real-world datasets from large ride-pooling services (such as Uber Pool and Lyft Line) to study strategic decision-making among interacting agents, identifying and then relaxing a number of strong assumptions made in prior work.
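To see why a mean field approach helps with scale, consider a minimal sketch of the core idea, under the assumption that nearby agents can be summarized by their average action. The variable names and neighbourhood structure below are invented for illustration; this is not the paper's algorithm.

```python
import numpy as np

# Rough sketch of the mean field idea (not the paper's algorithm):
# instead of evaluating an agent's choice against the joint actions of
# all N-1 other agents, which blows up combinatorially, evaluate it
# against the average action of its neighbours, which stays fixed-size
# no matter how many agents the system has.

rng = np.random.default_rng(0)
n_agents, n_actions = 1000, 4

# One-hot action for every agent, e.g. sampled from current policies.
actions = np.eye(n_actions)[rng.integers(n_actions, size=n_agents)]

def neighbourhood_mean_action(neighbours):
    """Summarize a whole neighbourhood as one fixed-size mean action."""
    return actions[neighbours].mean(axis=0)  # shape: (n_actions,)

# Agent 0 interacting with 10 nearby agents: the interaction term is a
# 4-dimensional vector whether the system has 1,000 agents or 1,000,000.
neighbours = rng.integers(n_agents, size=10)
print(neighbourhood_mean_action(neighbours))
```

Because the interaction term stays four-dimensional regardless of the number of agents, the learning problem no longer grows with the size of the population.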

One of these assumptions, explains Subramanian, is that individual agents are indistinguishable, with the same characteristics and goals. Yet ride-sharing pools that rely on centralized services, often at some distance from the user, cannot account for local dynamics such as weather, traffic patterns or the individual preferences of users.

“Think about having a thousand taxis working together to achieve the best customer service,” says Taylor. “The driver or the customer in their own decentralized locality are in a much better position to make decisions for themselves.”

The second strong assumption, adds Subramanian, is that agents, who may number in the millions, can access global information.

“It's like saying that every agent or every driver in this environment who is miles away from you, who has no communication with you, can actually pass some information to everyone else in this environment,” he says. “In our paper we say no, you don't have to assume people can actually share all the information.”

By relaxing these two assumptions in several simulated settings, as well as one real-world setting (ride-sharing), the researchers showed that decentralized agents can learn different policies based on their individual observations, yielding a system that is more robust, consistent and easier to scale.
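To make "decentralized" concrete, here is an illustrative sketch of agents that learn only from local information. Everything here, the DecentralizedAgent class, the observation labels and the reward, is a hypothetical stand-in rather than the paper's implementation: each agent keeps its own value table keyed on its local observation and its neighbours' mean action, with no global state and no all-to-all communication.

```python
import numpy as np
from collections import defaultdict

# Illustrative sketch of decentralized learning (not the paper's
# implementation): each agent owns a value table and updates it using
# only its local observation, its local reward, and the mean action of
# nearby agents. No global state, no all-to-all communication.

rng = np.random.default_rng(1)
n_actions = 4

class DecentralizedAgent:
    def __init__(self, lr=0.1, epsilon=0.1):
        self.q = defaultdict(lambda: np.zeros(n_actions))
        self.lr, self.epsilon = lr, epsilon

    def _key(self, local_obs, neighbour_mean):
        # Discretize the neighbours' mean action so it can index a table.
        return (local_obs, tuple(np.round(neighbour_mean, 1)))

    def act(self, local_obs, neighbour_mean):
        if rng.random() < self.epsilon:
            return int(rng.integers(n_actions))  # explore
        return int(np.argmax(self.q[self._key(local_obs, neighbour_mean)]))

    def update(self, local_obs, neighbour_mean, action, reward):
        q = self.q[self._key(local_obs, neighbour_mean)]
        q[action] += self.lr * (reward - q[action])

# Agents in different localities can learn different policies, e.g. a
# driver in a rainy downtown versus one in a dry suburb.
agent = DecentralizedAgent()
mean_action = np.array([0.25, 0.25, 0.25, 0.25])
a = agent.act("rainy_downtown", mean_action)
agent.update("rainy_downtown", mean_action, a, reward=1.0)
```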

The potential for this research is significant, says Taylor, with applications from smart grid problems to firefighting. 

“A decentralized approach to mean field games, we believe, is one way of taking a really exciting form of machine learning now being applied in the real world and getting it to work on much larger scale problems. There are a lot of interesting open directions.”

The paper, Decentralized Mean Field Games, will be presented at the 36th AAAI Conference on Artificial Intelligence, February 22 to March 1, 2022.