Ted Sumers and Mark Ho
Date:
Q&A form: https://forms.gle/FZaYFKLqFNTMZYjDA
Speakers
Theodore Sumers is a third-year PhD student advised by Tom Griffiths at Princeton and supported by an NDSEG fellowship. His research uses reinforcement learning and decision theory to study human communication. Theoretically, he is interested in explaining how societies accumulate information over generations. Practically, he hopes to develop artificial systems capable of interacting with and learning from humans. Prior to beginning the PhD, he was a data scientist and engineering manager at Automatic Labs (2013-2014) and Uber (2014-2019).
Mark Ho is currently a postdoctoral researcher in the Computer Science and Psychology departments at Princeton University. His research combines ideas and methods from psychology, neuroscience, and computer science to identify design principles for interactive machine learning and to develop better models of human decision-making. He is particularly interested in interactions between human planning and social cognition, and how understanding the computational principles underlying these processes can inform the design of artificial agents. He received his Ph.D. in Cognitive Science and M.S. in Computer Science from Brown University as well as his B.A. in Philosophy from Princeton.
Speaker links: Theodore Sumers: Google Scholar - Website; Mark Ho: Google Scholar - Website
Abstract
We’ll talk about methods for learning humans’ reward functions from natural language input. While robotics has historically focused on learning from natural language instructions, we find that humans prefer to teach with rich evaluative and descriptive feedback. To learn from this naturalistic language, we develop a new sentiment-analysis-based approach: we decompose feedback into sentiment about the features of a Markov decision process. We then perform an analogue of Bayesian inverse reinforcement learning, regressing the sentiment on the features to infer the teacher’s latent reward function. Behavioral experiments validate that this agent successfully recovers humans’ reward functions from natural language input. Finally, we’ll cover more recent theoretical work that seeks to explain how a rational speaker should use these different forms of language.
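As a rough illustration of the inference step sketched above (not the authors’ implementation), the snippet below infers linear reward weights over MDP features from sentiment-scored feedback. Each utterance is assumed to have already been mapped to a scalar sentiment and the feature vector it refers to, and a grid posterior over candidate weight vectors is updated in a Bayesian fashion; the feature vectors, example sentiments, and sigmoid likelihood are all illustrative assumptions.

```python
# Minimal sketch: infer linear reward weights w over MDP features from
# sentiment-scored linguistic feedback. Feature vectors, sentiment scores,
# and the likelihood model are illustrative assumptions, not the paper's code.
import numpy as np

# Each item: (feature vector the utterance refers to, sentiment in [-1, 1]),
# e.g. as produced by a sentiment classifier plus a feature grounding step.
feedback = [
    (np.array([1.0, 0.0, 0.0]), +0.9),   # "great job collecting the gem!"
    (np.array([0.0, 1.0, 0.0]), -0.8),   # "don't step in the lava"
    (np.array([1.0, 0.0, 1.0]), +0.4),   # mixed descriptive feedback
]

# Candidate reward weights on a coarse grid; we keep a posterior over
# candidates rather than a single point estimate.
grid = np.linspace(-1, 1, 21)
candidates = np.array(np.meshgrid(grid, grid, grid)).reshape(3, -1).T
log_post = np.zeros(len(candidates))          # uniform prior

def log_likelihood(w, phi, sentiment, beta=2.0):
    """Positive sentiment is more likely when the featurized reward w . phi is high."""
    p_positive = 1.0 / (1.0 + np.exp(-beta * (w @ phi)))
    target = (sentiment + 1.0) / 2.0          # map sentiment to a soft praise probability
    return target * np.log(p_positive) + (1 - target) * np.log(1 - p_positive)

# Bayesian-style update: accumulate log-likelihoods over all feedback.
for phi, sentiment in feedback:
    log_post += np.array([log_likelihood(w, phi, sentiment) for w in candidates])

post = np.exp(log_post - log_post.max())
post /= post.sum()
w_hat = post @ candidates                     # posterior mean reward weights
print("inferred reward weights:", np.round(w_hat, 2))
```

Under these assumptions the posterior mean assigns positive weight to the praised feature and negative weight to the criticized one, which is the qualitative behavior the regression-on-features step in the abstract is meant to capture.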
Papers covered during the talk
Sumers, T. R., Ho, M. K., Hawkins, R. D., Narasimhan, K., & Griffiths, T. L. (2021). Learning rewards from linguistic feedback. AAAI-21 link
Sumers, T. R., et al. (2022). Linguistic communication as (inverse) reward design. arXiv preprint arXiv:2204.05091. ACL ‘22 workshop on Learning from Natural Language Supervision link