Mixtures of Skew-t Factor Analyzers

Murray, Paula

Files

Murray_Paula_201212_Msc.pdf (927.74 KB)

Date

2012-11-01

Authors

Murray, Paula

Publisher

University of Guelph

Abstract

Model-based clustering allows for the identification of subgroups in a data set through the use of finite mixture models. When applied to high-dimensional microarray data, it can reveal groups of genes characterized by their expression profiles. In this thesis, a mixture of skew-t factor analyzers is introduced for the clustering of high-dimensional data. Notably, we make use of a version of the skew-t distribution which has not previously appeared in the mixture-modelling literature. Allowing a constraint on the factor loading matrix leads to two mixture of skew-t factor analyzers models. These models are fitted using the alternating expectation-conditional maximization algorithm for parameter estimation, with a stopping criterion based on Aitken's acceleration used to determine convergence. The Bayesian information criterion is used for model selection, and the performance of each model is assessed using the adjusted Rand index. The models are applied to both real and simulated data, obtaining clustering results which are equivalent or superior to those of established clustering methods.

Keywords

Cluster Analysis, Model-based Clustering, Skew-t Distribution, Factor Analysis

URI

http://hdl.handle.net/10214/5274

Collections

Theses & Dissertations (2011 - present)
Theses & Dissertations - Harvested by LAC
