Type of Document: Master's Thesis
Author: Cañas, Daniel Alberto
Author's Email Address: dacanas@unity.ncsu.edu
URN: etd-11052004-022839
Title: Generalizations and unification of centroid-based clustering methods
Degree: Master of Science
Graduate Program: Computer Science
Advisory Committee:

| Advisor Name | Title |
| --- | --- |
| Dr. Robert Funderlic | Committee Chair |
| Dr. Jon Doyle | Committee Member |
| Dr. Steffen Heber | Committee Member |

Keywords:
- k-means
- data mining
- cluster analysis
Date of Defense: 2004-11-02
Availability: unrestricted
Abstract:
There are many clustering methods that are referred to as k-means-like. We give the minimal necessary and sufficient components for the mechanism of the k-means (iterative and partitional) clustering method for a finite set of objects, X. Thus k-means is generalized and the methods that mimic k-means are unified. We name these k-center clustering methods. The fundamental mechanism of k-center methods exposes the usual misconceptions about k-means, such as (a) "distance" satisfies some of the properties of a mathematical metric, (b) there is a need to measure "distance" between objects in X, and (c) the centers of clusters have the same nature as the objects of X. Moreover, k-center methods have a common formula to choose or calculate centers of clusters. We characterize the convergent common objective function by expressing it in terms of (a) a distance measure for closeness between center objects and the objects in X and (b) the coherence of clusters. We give a three-object example to demonstrate the components of the formal mechanism of a k-center method. We then give examples of various known methods that belong to the class of k-center methods. We exhibit an extensive and thorough comparison of the qualitative k-modes and the numerical spherical k-means. Included are paradigm applications, a matrix environment, an understanding of the duality of a dissimilarity and similarity measure, and an understanding of normalized X and the normalized centers of subsets of X.
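
The iterative, partitional mechanism the abstract refers to can be illustrated with a short sketch. This is not the thesis's formalism: the names `k_center`, `distance`, `make_center`, `mismatch`, `modes`, `sq_euclid`, and `mean` are hypothetical, and the loop is only the familiar assign-then-update scheme. It assumes a closeness measure that is evaluated only between a center and an object (never between two objects) and a rule that builds a center from the members of a cluster. Swapping those two functions gives a k-modes-style method for categorical data or a k-means-style method for numeric data.

```python
from collections import Counter


def k_center(objects, centers, distance, make_center, max_iter=100):
    """Generic assign/update loop (illustrative sketch, hypothetical names).

    distance(center, obj) -> number; only center-to-object closeness is used.
    make_center(members)  -> a new center for a non-empty list of objects.
    """
    centers = list(centers)  # do not mutate the caller's list
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each object joins its closest center.
        new_assignment = [
            min(range(len(centers)), key=lambda j: distance(centers[j], x))
            for x in objects
        ]
        if new_assignment == assignment:
            break  # the partition is stable
        assignment = new_assignment
        # Update step: recompute each center from its current members.
        for j in range(len(centers)):
            members = [x for x, a in zip(objects, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = make_center(members)
    return assignment, centers


# Categorical instance (k-modes style): count mismatched attributes,
# and take the most frequent value of each attribute as the center.
def mismatch(center, obj):
    return sum(c != o for c, o in zip(center, obj))


def modes(members):
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))


# Numeric instance (k-means style): squared Euclidean distance and the mean.
def sq_euclid(center, obj):
    return sum((c - o) ** 2 for c, o in zip(center, obj))


def mean(members):
    return tuple(sum(col) / len(col) for col in zip(*members))


cats = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "square"),
]
labels, cat_centers = k_center(cats, [cats[0], cats[3]], mismatch, modes)
print(labels)       # [0, 0, 0, 1, 1]
print(cat_centers)  # [('red', 'small', 'round'), ('blue', 'large', 'square')]

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels2, num_centers = k_center(points, [(0.0, 0.0), (10.0, 0.0)], sq_euclid, mean)
print(labels2)      # [0, 0, 1, 1]
print(num_centers)  # [(0.0, 1.0), (10.0, 1.0)]
```

In the numeric run the computed centers `(0.0, 1.0)` and `(10.0, 1.0)` are not members of the input set, which illustrates how a center can differ in nature from the objects being clustered. Neither `mismatch` nor `sq_euclid` is ever called on two objects of the input set.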
Files

Approximate Download Time (Hours:Minutes:Seconds)

| Filename | Size | 28.8 Modem | 56K Modem | ISDN (64 Kb) | ISDN (128 Kb) | Higher-speed Access |
| --- | --- | --- | --- | --- | --- | --- |
| etd.pdf | 255.24 Kb | 00:01:10 | 00:00:36 | 00:00:31 | 00:00:15 | 00:00:01 |