From array genotypes to whole-genome sequences: harnessing genetic variation for dairy cattle selection
Genetic markers are used in animal breeding to measure and exploit genetic variation. After the unsuccessful implementation of marker-assisted selection, which included microsatellites in genetic evaluation models, the adoption of genomic selection in breeding programs drastically accelerated genetic gain in the dairy industry. Although incorporating array-based single nucleotide polymorphism (SNP) genotypes into genetic evaluation models has made great genetic improvement possible, more variants are needed to explain a larger proportion of the genetic variance observed in economically important traits. This would allow for more accurate estimated breeding values (EBV) and greater genetic progress. Increasing the number and type of variants in genetic evaluation models through whole-genome sequencing (WGS), for instance by adding copy number variants (CNV), could contribute to higher EBV accuracy. Sequencing a large number of animals is still prohibitively expensive, but the large number of genotyped samples already available allows for 1) the accurate imputation of array genotypes to WGS variants, 2) the identification of CNV from the signal intensity values produced at the time of array genotyping, and 3) the use of CNV identified in silico with high confidence to gain knowledge about the genetic architecture of economically important traits, for example hoof health traits. In this thesis, haplotype-based methods were first developed to improve the selection of reference population animals for sequencing, allowing for more accurate imputation of common or rare variants. Second, CNV were identified with high confidence using both array genotypes and WGS information. Finally, CNV regions associated with hoof health traits recorded for the Canadian Holstein genetic evaluation were identified.
Starting from SNP phased into haplotypes and extending to CNV, a class of structural variants, this thesis bridges current and possible future genetic markers to exploit as much as possible of the genomic information present in the dairy population. Altogether, the advances made in this thesis will permit an increase in the rate of genetic improvement for dairy cattle once breeding value estimation models that efficiently combine CNV and SNP information have been developed.