by R. Grothmann
In this notebook, we use a simple algorithm proposed by Lloyd (later used by Steinhaus, MacQueen) to cluster points.
>load clustering.e
Functions for clustering data.
See: ../../reference/clustering
To generate some nice data, we take three points in the plane.
>m=3*normal(3,2);
Now we spread 100 points randomly around these points.
>x=m[intrandom(1,100,3)]+normal(100,2);
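The command above picks, for each of the 100 points, one of the three rows of m at random and adds standard normal noise to it. For readers who do not use Euler, the same idea can be sketched in plain Python. This is only an illustration: the name make_clustered_points and the fixed seed are our own choices and not part of clustering.e.

```python
import random

def make_clustered_points(centers, n, rng):
    """Return (points, labels): each point is a randomly chosen center plus N(0,1) noise."""
    points, labels = [], []
    for _ in range(n):
        c = rng.randrange(len(centers))  # index of a randomly chosen center
        cx, cy = centers[c]
        points.append((cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)))
        labels.append(c)
    return points, labels

rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility
# Three centers, normally distributed with standard deviation 3.
centers = [(3 * rng.gauss(0, 1), 3 * rng.gauss(0, 1)) for _ in range(3)]
points, labels = make_clustered_points(centers, 100, rng)
```

The list labels records which center each point came from, so it can later be compared with the clusters an algorithm finds.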
We want three clusters.
>k=3;
The function kmeanscluster contains the algorithm. It returns, for each point, the index of the cluster the point belongs to.
>j=kmeanscluster(x,k);
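To show what such an algorithm does, here is a minimal Python sketch of Lloyd's iteration. It is not the code in clustering.e. The function name kmeans, the choice of k data points as starting centers, the iteration limit and the seed are all assumptions made for this illustration.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's iteration: return the cluster index (0..k-1) of each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k of the data points
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers would not move either
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for c in range(k):
            members = [pt for pt, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster ran empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

# Two well separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = kmeans(points, 2)
```

Note that the sketch numbers the clusters from 0, while the indices used in the Euler commands start at 1. Which group receives which index depends on the random start.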
We plot each point with a color representing its cluster.
>P=x'; plot2d(P[1],P[2],color=10+j,points=1); ...
We add the original cluster centers to the plot.
>loop 1 to k; plot2d(m[#,1],m[#,2],points=1,style="o#",add=1); end; ... insimg;
The same in 3D.
>k=3; m=3*normal(k,3); ... x=m[intrandom(1,1000,k)]+normal(1000,3); ... j=kmeanscluster(x,k); ... P=x'; plot3d(P[1],P[2],P[3],color=10+j,>points,style=".."); ... insimg();
The file clustering.e contains another clustering algorithm. This algorithm clusters the entries of the eigenvector belonging to the largest eigenvalue of a similarity matrix. Here, the similarity matrix is derived from the distance matrix of the points.
>k=2; m=normal(2,2); ... x=m[intrandom(1,100,k)]+normal(100,2); ... j=eigencluster(x,k); ... P=x'; plot2d(P[1],P[2],color=10+j,points=3):
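The idea can be sketched in Python for two clusters. This is not the code of eigencluster, and several details are assumptions made only for illustration: the similarity is taken to be a Gaussian kernel of the distance, the eigenvector is computed by power iteration, and the entries are split by a simple cut at the midpoint of their range. The name eigencluster2 and the parameter sigma are hypothetical.

```python
import math

def eigencluster2(points, sigma=1.0, iterations=200):
    """Split points into two groups using the dominant eigenvector of a similarity matrix."""
    n = len(points)
    # Similarity from distance: a Gaussian kernel (an assumption of this sketch).
    s = [[math.exp(-math.dist(p, q) ** 2 / (2 * sigma ** 2)) for q in points]
         for p in points]
    # Power iteration approximates the eigenvector of the largest eigenvalue.
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(s[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Cluster the entries in one dimension: a cut at the midpoint of their range.
    cut = (min(v) + max(v)) / 2
    return [0 if x > cut else 1 for x in v]

# Two tight groups of different size, far apart.
points = [(0, 0), (0.2, 0), (0, 0.2), (0.2, 0.2), (5, 5), (5.2, 5), (5, 5.2)]
labels = eigencluster2(points)
```

In this toy data the eigenvector is concentrated on the larger group, so its entries are clearly positive there and close to zero on the smaller group, and the cut separates the two. The function in clustering.e may build the similarity matrix and cluster the entries differently.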
For comparison, the k-means clustering.
>j=kmeanscluster(x,k); ... P=x'; plot2d(P[1],P[2],color=10+j,points=1):