Computer Science: COMS W 4995 011
COMS W 4995 011 (3 pts)
Peter Belhumeur pb2019 C002442097
Recent advances in deep learning have propelled computer vision forward. Applications such as image recognition and search, object detection, semantic segmentation, unconstrained face recognition, synthetic image generation, image generation from text, and image and video captioning, all of which only recently seemed decades off, are now being realized and deployed at scale. This course will look at the advances in computer vision and machine learning that have made this possible. In particular, we will look at Convolutional Neural Nets (CNNs), Transformers, and Vision Transformers and their application to computer vision and natural language processing. We will also look at the datasets needed to feed these data-hungry approaches: both how to create them and how to leverage them to address a wider range of applications. The course will have homework assignments and a final project; there will be no exams. Enrollment is capped at 120.