A comprehensive cluster validity framework for clustering algorithms
Cluster validity is a critical component of cluster analysis because clustering is unsupervised, many clustering algorithms are available, and many cluster validity measures exist to choose from. This thesis introduces and evaluates a new comprehensive framework for addressing the cluster validity problem, the Cluster Validity Framework (CVF). The framework's methodology revolves around three key areas: domain knowledge, data quality, and the characteristics of clustering algorithms. The utility of CVF is demonstrated through several experiments on both artificial data and real data sets from several domains, which examine the cluster validity scenarios defined under the framework. Two popular clustering algorithms, K-means and Fuzzy c-means, are used in these experiments to generate clustering schemes.