Using Anchor Clustering to Identify Associations between Codon Bias and Gene Attributes within the Human Genome

Title: Using Anchor Clustering to Identify Associations between Codon Bias and Gene Attributes within the Human Genome
Author: Stoodley, Matthew Alexander
Department: Department of Integrative Biology
Program: Bioinformatics
Advisor: Ashlock, Daniel; Graether, Steffen
Abstract: Codon bias describes the tendency to use certain synonymous codons to encode amino acids. It is well established that codon bias varies between organisms and plays a role in gene expression and co-translational folding. Understanding codon bias is important because a better understanding of gene expression and translation mechanics may allow for more efficient recombinant protein production and could ultimately improve the ability to create synthetic genes. Human genes were investigated to elucidate the connection between their codon bias and the subsequent impact on structure, function, and tissue-specific expression levels. Analysis was performed by representing human genes according to their codon bias, then clustering together genes that have a similar codon bias. Gene clusters were studied to see whether genes that use similar codons are statistically more likely to share other properties. Clustering was performed using a novel data-driven approach to a simple clustering algorithm called anchor clustering. Anchor clustering was used because it is fast and deterministic, two qualities that other approaches can struggle with when clustering data in high-dimensional spaces. To study the connection between gene product structure and codon bias, clusters were analysed according to their likelihood of containing intrinsically disordered proteins. Because structure and function are so closely related, clusters were also analysed for GO term overrepresentation. Last, clusters were examined through the lens of tissue-specific gene expression by incorporating expression information at the mRNA and protein levels. The analyses revealed an association between codon usage and the propensity of a gene product to be intrinsically disordered, while the functional analyses revealed that codon bias is associated with cell cycle regulation and cell type differentiation.
Expression analysis revealed that in humans there may be a codon bias associated with highly expressed genes irrespective of tissue, as well as tissue-specific codon biases in the cortex, testis, and liver. Some of the tissue-specific findings have been reported by other groups, but this investigation distinguishes between an organism-wide codon bias associated with high expression and particular codon biases associated with high expression in individual tissues. In addition, this work builds on the current knowledge of codon bias by determining whether these findings, previously evaluated only at the mRNA level, also appear at the protein concentration level. The results suggest that codon harmonization can be improved further by seeking to replicate the codon bias of the tissue in which a gene is to be highly expressed.
Date: 2022-11
Terms of Use: All items in the Atrium are protected by copyright with all rights reserved unless otherwise indicated.
Related Publications: Stoodley, M., Ashlock, D., & Graether, S. (2018). Data driven point packing for fast clustering. In 2018 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) (pp. 1-8). IEEE. DOI: 10.1109/CIBCB.2018.8404974
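
Illustrative note: the abstract describes representing each gene by its codon bias and then grouping genes with similar bias. The Python sketch below shows one plausible way to do this and is not taken from the thesis. It assumes that a gene is represented by a 64-element vector of relative synonymous codon frequencies (each codon count divided by the total count for its amino acid, using the standard genetic code) and that a gene is assigned to the closest of a set of pre-chosen anchor vectors by Euclidean distance. The function names `codon_usage_vector` and `nearest_anchor` are hypothetical, and how the thesis selects its anchors is not stated in this record.

```python
from collections import Counter
from itertools import product
import math

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TO_AA = dict(zip(CODONS, AMINO_ACIDS))


def codon_usage_vector(cds):
    """Return 64 relative synonymous codon frequencies for a coding sequence.

    Assumption: each codon count is normalised by the total count of all
    codons that encode the same amino acid (or stop) in this sequence.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Total observed codons per amino acid; unrecognised triplets are skipped.
    aa_totals = Counter()
    for codon, n in counts.items():
        if codon in CODON_TO_AA:
            aa_totals[CODON_TO_AA[codon]] += n

    vector = []
    for codon in CODONS:
        total = aa_totals[CODON_TO_AA[codon]]
        vector.append(counts[codon] / total if total else 0.0)
    return vector


def nearest_anchor(vector, anchors):
    """Return the index of the closest anchor by Euclidean distance.

    Ties resolve to the lowest index, so the assignment is deterministic.
    """
    return min(range(len(anchors)), key=lambda k: math.dist(vector, anchors[k]))


if __name__ == "__main__":
    # Toy sequences, not real genes: two anchors with opposite alanine codon preferences.
    anchors = [codon_usage_vector("GCTGCT"), codon_usage_vector("GCCGCC")]
    gene = codon_usage_vector("ATGGCTGCTGCC")
    print(nearest_anchor(gene, anchors))
```

Normalising within each amino acid keeps the vector from being dominated by amino acid composition, so two genes encoding different proteins can still be compared on codon choice alone.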

Files in this item

Files Size Format View
Stoodley_Matthew_202211_Phd.pdf 5.983Mb PDF View/Open
