
Dataset Augmentation for Aspect Level Sentiment Analysis


Title: Dataset Augmentation for Aspect Level Sentiment Analysis
Author: Sapru, Nikhil
Department: School of Engineering
Program: Engineering
Advisor: Taylor, Graham
Abstract: Deep learning models have been instrumental in achieving state-of-the-art performance on benchmark datasets across various tasks in computer vision and natural language processing. The success of these models is a result of the availability of large-scale labeled datasets, along with the compute infrastructure to process that data. Deep neural networks perform poorly when there is a lack of labeled data to train them. Data augmentation is one of the methods used to boost model performance by creating new training data from previously labeled data. Improved data augmentation techniques would be an effective tool to expand smaller datasets to match the amount of labeled data required to maximize the performance of deep learning models. This would eventually help to reduce the time and cost of developing these models. However, most data augmentation techniques developed in the natural language processing domain are specific to a particular task or dataset. Therefore, there is a need to develop task-agnostic data augmentation techniques for natural language processing. In this thesis, we study the application of two recently introduced data augmentation techniques to a new task in natural language processing. Manifold mixup is a data augmentation technique for images developed by Verma et al. to act as a regularizer for neural networks, and SwitchOut is a technique developed by Wang et al. to boost the performance of neural networks used for machine translation. Through carefully designed experimentation, the thesis demonstrates the viability and usability of manifold mixup and SwitchOut as efficient task-agnostic data augmentation techniques in natural language processing.
URI: http://hdl.handle.net/10214/17862
Date: 2020-04
Rights: Attribution 4.0 International
Terms of Use: All items in the Atrium are protected by copyright with all rights reserved unless otherwise indicated.


Files in this item

File: Sapru_Nikhil_202004_MASc.pdf
Size: 2.011Mb
Format: PDF
Description: MASc Thesis - Dataset Augmentation for Aspect Level Sentiment Analysis



Except where otherwise noted, this item's license is described as Attribution 4.0 International.