Joint Seminar on Artificial Intelligence and Data Science - Vietnam Institute for Advanced Study in Mathematics


Talk 5 Information

Speaker: Professor Jeff Edmonds, York University, Canada

Talk title: Reinforcement Learning Game Tree (in-person talk)

Time: 14:00 - 15:30, Thursday, October 27, 2022.

Seminar: Hybrid seminar (onsite at VIASM and online) [Registration here]

Abstract: The goal of Reinforcement Learning is to get an agent to learn how to solve some complex multi-step task, e.g. make a piña colada or win at Go. At the risk of being non-standard, Jeff will tell you the way he thinks about this topic. Both Game Trees and Markov Chains represent the graph of states through which your agent will traverse a path while completing the task. Suppose we could learn, for each such state, a value measuring how good this state is for the agent. Then completing the task in an optimal way would be easy. If our current state is one in which our agent gets to choose the next action, then she will choose the action that maximizes the value of our next state. On the other hand, if our adversary gets to choose, he will choose the action that minimizes this value. Finally, if our current state is one in which the universe flips a coin, then each edge leaving this state will be labelled with the probability of taking it. Knowing that this is how the game is played, we can compute how good each state is. Each state in which the task is complete is worth whatever reward the agent receives in that state. These values then trickle backwards until we learn the value of the start state. The computational challenge is that there are far more states than we can ever look at.
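
The backward computation described in the abstract can be illustrated with a minimal sketch. It is not taken from the talk: the `Node` class, its field names, and the tiny example tree are all hypothetical, chosen only to show how values could trickle back from completed states to the start state when agent, adversary, and coin-flip states are mixed (an approach often called expectiminimax).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical game-tree state (illustrative names, not from the talk)."""
    kind: str                      # "agent", "adversary", "chance", or "terminal"
    reward: float = 0.0            # used only by terminal states
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)  # used only by chance states


def value(node: Node) -> float:
    """Return how good a state is for the agent, computed from the leaves up."""
    if node.kind == "terminal":
        # A completed task is worth the reward received in that state.
        return node.reward
    child_values = [value(child) for child in node.children]
    if node.kind == "agent":
        # The agent picks the action leading to the best next state.
        return max(child_values)
    if node.kind == "adversary":
        # The adversary picks the action leading to the worst next state.
        return min(child_values)
    if node.kind == "chance":
        # The universe flips a coin: weight each edge by its probability.
        return sum(p * v for p, v in zip(node.probs, child_values))
    raise ValueError(f"unknown node kind: {node.kind!r}")


# Toy tree: the agent chooses between an adversary state and a coin flip.
adversary_state = Node("adversary", children=[Node("terminal", 3.0),
                                              Node("terminal", 5.0)])
coin_flip_state = Node("chance",
                       children=[Node("terminal", 10.0), Node("terminal", 0.0)],
                       probs=[0.5, 0.5])
root = Node("agent", children=[adversary_state, coin_flip_state])

print(value(root))  # 5.0: the coin flip (expected value 5) beats the adversary state (3)
```

This sketch visits every state, which is exactly what becomes infeasible in games such as Go. That is the computational challenge the abstract closes on.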

Bio: Professor Jeff Edmonds received his PhD in 1992 at the University of Toronto. His thesis proved lower bounds on time-space tradeoffs. He did his postdoctoral work at the ICSI in Berkeley on secure data transmission over networks for multimedia applications. He joined York University in 1995.
For more information about Prof. Jeff Edmonds: https://lassonde.yorku.ca/users/jeff