An effective positive-unlabeled learning method for detecting a large scale of malware variants

Title: An effective positive-unlabeled learning method for detecting a large scale of malware variants
Author: Khan, Mohammad Faham
Department: School of Computer Science
Program: Computer Science
Advisor: Lin, Xiaodong
Abstract: Malicious software (malware) can quickly evolve into many different variants and evade existing detection techniques. Machine-learning-based techniques perform well in detecting malware variants, but in industry the volume of variants grows quickly and labelling data is labour-intensive. Companies therefore tend to label only a small portion of the malware samples and treat the remaining unlabeled samples as benign, which limits accuracy. To address this problem, this thesis proposes a cost-sensitive boosting method that trains a detection model on malicious and unlabeled executables to improve accuracy. Extensive experiments demonstrate that the proposed method, when incorporated into machine learning algorithms trained on positive and unlabeled datasets, improves the final results. It also improves the reliability of the models and aspects of training such as speed and convergence.
URI: https://hdl.handle.net/10214/23715
Date: 2021-01
Terms of Use: All items in the Atrium are protected by copyright with all rights reserved unless otherwise indicated.
Related Publications: J. Zhang, M. F. Khan, X. Lin and Z. Qin, "An Optimized Positive-Unlabeled Learning Method for Detecting a Large Scale of Malware Variants," 2019 IEEE Conference on Dependable and Secure Computing (DSC), Hangzhou, China, 2019, pp. 1-8, doi: 10.1109/DSC47296.2019.8937650.


Files in this item

File: Khan_Mohammad_202011_Msc.pdf
Size: 1.682 MB
Format: PDF
Description: Mohammad Faham Khan Thesis
