Bayesian Clustering Approaches for Discrete Data

Title: Bayesian Clustering Approaches for Discrete Data
Author: Silva, H. Anjali
Department: Department of Mathematics and Statistics
Program: Bioinformatics
Advisor: Rothstein, Steven J.
Abstract: Unsupervised classification, or clustering, uses no a priori knowledge of the labels of the observations in the process of categorizing data. The research contained in this thesis focuses on the machine learning of discrete-valued gene expression datasets using clustering, with the aim of identifying gene co-expression networks. Specifically, a number of topics surrounding the use of mixture models and Markov chain Monte Carlo (MCMC) methods in clustering of discrete data from high-throughput transcriptome sequencing technologies are presented. After outlining current challenges and gaps in research with respect to clustering approaches, three mixture model-based clustering methods are presented: mixtures of multivariate Poisson-log normal distributions, mixtures of multivariate Poisson-log normal factor analyzers, and mixtures of matrix-variate Poisson-log normal distributions. The significance, innovation, and limitations of this research, along with a number of future directions stemming from it, are discussed.
Date: 2018-03
Terms of Use: All items in the Atrium are protected by copyright with all rights reserved unless otherwise indicated.
Related Publications: Silva, A., Rothstein, S. J., McNicholas, P. D., and Subedi, S. (2017). A Multivariate Poisson-Log Normal Mixture Model for Clustering Transcriptome Sequencing Data. arXiv preprint arXiv:1711.11190.

Files in this item

File: Silva_Anjali_201803_PhD.pdf
Size: 9.974Mb
Format: PDF
Description: Thesis
