How do we recognise objects and people? How can we catch a ball? How do we navigate our way from our desk to the coffee machine without bumping into each other? These seemingly simple tasks have challenged AI scientists for decades. Recent developments in computer vision have brought significant improvements in important applications such as face detection in cameras, body tracking, and autonomous cars.
This module will provide you with the fundamentals of computer vision, covering the essential challenges and key algorithms for solving a variety of vision problems. It will provide both a grounding in the relevant theory and a blend of classical and state-of-the-art approaches to computer vision problems. It will focus on practical applications of computer vision and cover a broad range of problems, from low-level image processing to object recognition, tracking and 3D vision.
INTENDED LEARNING OUTCOMES (ILOs) (see assessment section below for how ILOs will be assessed)
On successful completion of this module you should be able to:
Module Specific Skills and Knowledge
1. Explain key computer vision problems and their mathematical formulation.
2. Design and implement vision algorithms in a high-level language.
Discipline Specific Skills and Knowledge
3. Analyse and propose solutions for computer vision problems.
4. Select appropriate statistical representations, features and algorithms to suit problem specificities.
Personal and Key Transferable / Employment Skills and Knowledge
5. Understand and appreciate the limitations of the state of the art.
6. Critically read and report on research papers.
SYLLABUS PLAN - summary of the structure and academic content of the module
The course will cover the following topics:
Image formation: geometry, light, and cameras
Image processing: convolution, linear filters, Fourier transforms, image gradients, geometric transformations
Feature extraction & matching: corners, edges, blobs, and lines; feature descriptors (SIFT), feature matching and tracking
Object detection and recognition: K-NN, bag-of-words, scanning windows & Viola-Jones
Image segmentation: active contours, Markov random fields, graph cuts
Dense image correspondences: dense motion estimation, optical flow, stereo
Shape reconstruction: 2D and 3D shape modelling and fitting, active appearance models, 3D morphable models
3D vision: 3D pose estimation, calibration, structure from motion, SLAM, shape from shading, motion capture
Deep learning for vision: neural networks, convolutional neural networks, object detection, semantic and instance segmentation, recurrent neural networks