Woven String Kernels

McEachern, Andrew
University of Guelph

Woven string kernels are a form of evolvable directed acyclic graph specialized to perform DNA classification. They are introduced in this thesis, given a rigorous theoretical treatment as mathematical objects, and shown to have a number of interesting properties. Two forms of woven string kernels, uniform and non-uniform, are discussed. The non-uniform woven string kernels are repurposed for use as updating rules for cellular automata. The details of their representation and implementation are presented. A chapter of this thesis is devoted to a visualization technique called non-linear projection, an evolvable form of multidimensional scaling that is used in the analysis of experimental results. The woven string kernels are tested on simple and complex synthetic data as well as biological data, using an evolutionary algorithm to find woven string kernels that are acceptable solutions for classification. They perform marginally on the simplest synthetic data, which is based on GC content and for which they are not entirely appropriate. They exhibit perfect classification on the more complex synthetic data and on the biological data. Woven string kernels have a number of parameters, including their height, the number of initial strings from which they are built, and the amount of weaving used to generate the final structure. A parameter study shows that these parameters must be set based on the type of data under analysis. Experiments with woven string kernels as updating rules for cellular automata show that a larger population and more available colour states are both correlated with increased performance as apoptotic one-dimensional cellular automata. This thesis concludes with directions for future work, both theoretical and experimental, on uniform and non-uniform woven string kernels.

Biomathematics, Evolutionary Computation