In this work, we propose a way to integrate spatial commonsense knowledge to aid visual reasoning. `external_link`.
In this work, we propose an explicit reasoning layer over deep neural architectures to solve visual question answering (VQA). `external_link`.
In this work, we propose image puzzles that require background knowledge to solve. This project also advocates logically interpretable systems. `external_link`.
In this project, we aim to achieve a deeper understanding of images and videos with the help of background or commonsense knowledge. Our motivation is the way a human interprets images and videos: breaking them down into a few important constituent parts, reasoning about those parts, and detecting complex activities. `external_link`.