
Assessing Errors in DNA Barcode Sequence Records


dc.contributor.advisor McNicholas, Paul
dc.contributor.advisor Hanner, Robert
dc.contributor.author Athey, Taryn
dc.date.accessioned 2013-10-17T13:04:28Z
dc.date.available 2013-10-17T13:04:28Z
dc.date.copyright 2013-10
dc.date.created 2013-09-23
dc.date.issued 2013-10-17
dc.identifier.uri http://hdl.handle.net/10214/7588
dc.description.abstract DNA barcoding uses a standardized gene region to identify species. In animals, the barcode is a 648 bp region of the cytochrome c oxidase I gene. These sequences are uploaded to the Barcode of Life Data System (BOLD), a reference library that depends on the accuracy of its data. This thesis uses a cross-taxa study to assess a frequency matrix approach for identifying very low frequency variants (VLFs), which represent potential errors within BOLD. In each group analyzed, most VLFs occurred in the first and last 50 bp of the sequences, consistent with known error properties of Sanger sequencing. To correct for this, the success rates of different classification methods on full-length and reduced barcodes were assessed. Neither the reduction of the barcode nor the number of VLFs affected the success rate of classification. This indicates that trimming barcodes by 50 bp could reduce the overall error in BOLD without affecting species identification. en_US
dc.language.iso en en_US
dc.subject DNA Barcoding en_US
dc.subject Frequency Matrix en_US
dc.subject Very Low Frequency Variant en_US
dc.subject VLF en_US
dc.subject Bioinformatics en_US
dc.subject Classification en_US
dc.subject Barcoding with Logic en_US
dc.title Assessing Errors in DNA Barcode Sequence Records en_US
dc.type Thesis en_US
dc.degree.programme Bioinformatics en_US
dc.degree.name Master of Science en_US
dc.degree.department Department of Mathematics and Statistics en_US
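
The frequency matrix idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes the sequences are already aligned to equal length, and the function names (frequency_matrix, find_vlfs, trim_ends), the 0.001 frequency cutoff, and the choice to trim from both ends are all illustrative assumptions, not details taken from the record.

```python
from collections import Counter


def frequency_matrix(aligned_seqs):
    """Return, for each alignment column, a dict of base -> relative frequency."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    n = len(aligned_seqs)
    matrix = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        matrix.append({base: count / n for base, count in counts.items()})
    return matrix


def find_vlfs(aligned_seqs, threshold=0.001):
    """List (sequence index, column, base) for bases rarer than threshold.

    The default threshold is an assumed cutoff, chosen only for illustration.
    """
    matrix = frequency_matrix(aligned_seqs)
    vlfs = []
    for idx, seq in enumerate(aligned_seqs):
        for col, base in enumerate(seq):
            if matrix[col][base] < threshold:
                vlfs.append((idx, col, base))
    return vlfs


def trim_ends(seq, n=50):
    """Drop n positions from each end (assumed reading of a 'reduced' barcode)."""
    return seq[n:len(seq) - n]


if __name__ == "__main__":
    # Toy alignment: one sequence carries a rare base in the last column.
    seqs = ["ACGT"] * 9 + ["ACGA"]
    print(find_vlfs(seqs, threshold=0.2))
```

On real data, gap and ambiguity characters (for example "-" and "N") would need separate handling before counting, which this sketch leaves out.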


Files in this item

Files Size Format
Athey_Taryn_201310_MSc.pdf 4.416 MB PDF
