Spatiotemporal Representation Learning For Human Action Recognition And Localization

Title: Spatiotemporal Representation Learning For Human Action Recognition And Localization
Author: Ali, Alaaeldin
Department: School of Engineering
Program: Engineering
Advisor: Taylor, Graham
Abstract: Human action understanding from videos is one of the foremost challenges in computer vision. It is the cornerstone of many applications, such as human-computer interaction and automatic surveillance. The current state-of-the-art methods for action recognition and localization mostly rely on deep learning. Despite their strong performance, deep learning approaches require large amounts of labeled training data. Furthermore, standard action recognition pipelines rely on independent optical flow estimators, which increase their computational cost. We propose two approaches to address these limitations. First, we develop a novel method for efficient, real-time action localization in videos that achieves performance on par with or better than other, more computationally expensive methods. Second, we present a self-supervised approach to spatiotemporal feature learning that does not require any annotations. We demonstrate that features learned by our method provide a very strong prior for the downstream task of action recognition.
Date: 2019-09
Terms of Use: All items in the Atrium are protected by copyright with all rights reserved unless otherwise indicated.

Files in this item

File: ali_alaaeldin_201909_MSc.pdf
Size: 10.66Mb
Format: PDF