Undergrad Research Project - Visual Dialog

Fall 2017

Umang Bhatt
José Moura
Project description

The popularity of reinforcement learning (RL), a subfield of machine learning, has grown considerably in recent years, and its applications are many. I hope to understand the fundamentals of RL and how it applies to natural, conversational language about visual content.

Visual Dialog is an AI task grounded in computer vision. It requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a question about the image, the agent has to ground the question in the image, infer context from the history, and answer the question accurately. Visual Dialog is disentangled enough from any specific downstream task to serve as a general test of machine intelligence, while being grounded enough in vision to allow objective evaluation of individual responses and benchmark progress. My partner and I will work under Professor Moura and his PhD student Satwik Kottur to extend the work on Visual Dialog and to understand the fundamentals of RL, applying them to the Visual Dialog task.

Note: the direction of this research may change at the discretion of Satwik and Professor Moura, according to their interests.
