In search of more stable hierarchical trees: devising new algorithms to improve upon the stability of neighbor joining
The standard method of constructing hierarchical trees – neighbour joining – while commonly used, has some major flaws. For most data sets not derived from a common descent process, it produces trees that are highly unstable. Adding or deleting a point can cause dramatic shifts in the tree topology. The degree of this shift can be measured by calculating the distance between trees. To provide an alternative to neighbour joining, bubble clustering was devised. Instead of using the simple Euclidean distance between points, the connections between points are established by repeated sampling. A multi-dimensional sphere is placed into the data space and the association between all points within the sphere is increased. Once this is complete the resultant matrix is converted into a tree by sequentially joining the most associated points. The new algorithm proved to generate more stable trees compared to neighbour joining when applied to random or real data.