PANDORE Version 6 | GREYC-IMAGE
pkmeans
Performs K-means clustering on a set of objects.
Synopsis
pkmeans attr_in attr_out k maxiter [col_in|-] [col_out|-]
Description
pkmeans classifies a given set of objects into k clusters
according to their features. The object features are stored in col_in
as a set of vectors attr_in.1, attr_in.2, ..., attr_in.n, one vector per feature.
K-means is a partitioning method that divides a set of objects
into k clusters using the following steps:
1. Place k points in the space represented by the objects being clustered.
These points are the initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recompute the positions of the k centroids.
4. Repeat steps 2 and 3 until the centroids no longer move or maxiter iterations
have been performed. This produces a partition of the objects into groups from
which the distance to be minimized can be calculated.
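The steps above can be sketched in plain C++ as follows. This is only an illustration and not the Pandore implementation: the name kmeansSketch, the choice of the first k objects as initial centroids, and the handling of empty clusters are all assumptions, since this page does not say how pkmeans initializes its centroids.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squaredDistance(const Point &a, const Point &b) {
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper, not part of Pandore.
// Returns, for each object, the index (0 .. k-1) of its cluster.
std::vector<std::size_t> kmeansSketch(const std::vector<Point> &objects,
                                      std::size_t k, int maxiter) {
    assert(k >= 1 && k <= objects.size());
    const std::size_t dim = objects[0].size();

    // Step 1: initial centroids. Assumption: the first k objects are used.
    std::vector<Point> centroids(objects.begin(),
                                 objects.begin() + static_cast<std::ptrdiff_t>(k));
    std::vector<std::size_t> labels(objects.size(), 0);

    for (int iter = 0; iter < maxiter; ++iter) {
        // Step 2: assign each object to the group with the closest centroid.
        for (std::size_t i = 0; i < objects.size(); ++i) {
            double best = std::numeric_limits<double>::max();
            for (std::size_t j = 0; j < k; ++j) {
                const double dist = squaredDistance(objects[i], centroids[j]);
                if (dist < best) {
                    best = dist;
                    labels[i] = j;
                }
            }
        }

        // Step 3: recompute each centroid as the mean of its members.
        std::vector<Point> sums(k, Point(dim, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < objects.size(); ++i) {
            ++counts[labels[i]];
            for (std::size_t d = 0; d < dim; ++d) {
                sums[labels[i]][d] += objects[i][d];
            }
        }
        bool moved = false;
        for (std::size_t j = 0; j < k; ++j) {
            if (counts[j] == 0) continue;  // Assumption: an empty cluster keeps its centroid.
            for (std::size_t d = 0; d < dim; ++d) {
                const double mean = sums[j][d] / static_cast<double>(counts[j]);
                if (mean != centroids[j][d]) {
                    moved = true;
                    centroids[j][d] = mean;
                }
            }
        }

        // Step 4: stop as soon as no centroid has moved.
        if (!moved) break;
    }
    return labels;
}
```

The loop bound plays the role of maxiter: it stops the iteration even when the centroids have not stabilized.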
The distance between an object i and the cluster center Cj is the
Euclidean distance:
D_{ij} = \left[ \sum_{d=1}^{n} (x_{id} - C_{jd})^2 \right]^{1/2}
where n is the number of features, x_{id} is feature d of object i and
C_{jd} is feature d of the centroid Cj.
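As a small illustration of this formula, the function below computes the distance between one object and one centroid. The name euclideanDistance is hypothetical and the code is a sketch in standard C++, not Pandore code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Pandore.
// x holds the n features of one object, c the n features of one centroid.
double euclideanDistance(const std::vector<double> &x,
                         const std::vector<double> &c) {
    assert(x.size() == c.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < x.size(); ++d) {
        const double diff = x[d] - c[d];
        sum += diff * diff;  // (x_id - C_jd)^2
    }
    return std::sqrt(sum);   // [ ... ]^(1/2)
}
```

For an object with features (1, 2) and a centroid at (4, 6), the differences are -3 and -4, the sum of squares is 25, and the distance is 5. Because the square root is monotonic, comparing squared distances selects the same closest centroid.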
Parameters
- attr_in is the base name of the feature vector. The vectors
are named attr_in.1, attr_in.2, ..., attr_in.n in the input collection.
The item j of the array attr_in.i contains the i-th
feature of the (j+1)-th object. They are Double arrays.
- attr_out is the name of the output array. It is an array of
unsigned longs where attr_out[i] is the number of the cluster to which
the object i is assigned.
- k is the desired number of clusters.
- maxiter is the maximum number of iterations (a safeguard in case the algorithm does not converge).
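To make the input layout concrete, the sketch below regroups feature arrays of the kind described for attr_in (one array per feature, one item per object) into one feature vector per object. It uses only standard containers: the name toObjectVectors is hypothetical, and no Pandore Collection call is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Pandore.
// features[i][j] is the value of array attr_in.(i+1) for the object stored at item j.
// The result holds one vector per object: objects[j][i] == features[i][j].
std::vector<std::vector<double>> toObjectVectors(
        const std::vector<std::vector<double>> &features) {
    if (features.empty()) return {};
    const std::size_t nobjects = features[0].size();
    std::vector<std::vector<double>> objects(
        nobjects, std::vector<double>(features.size()));
    for (std::size_t i = 0; i < features.size(); ++i) {
        // Assumption: every feature array describes the same number of objects.
        assert(features[i].size() == nobjects);
        for (std::size_t j = 0; j < nobjects; ++j) {
            objects[j][i] = features[i][j];
        }
    }
    return objects;
}
```

With two feature arrays of three items each, the result is three objects with two features each.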
Inputs
- col_in: a collection which contains the object features.
Outputs
- col_out: a collection which contains the assignment vector (object -> cluster).
Result
Returns SUCCESS or FAILURE.
Examples
Segments the tangram.pan image using
K-means clustering of the pixels based
on mean and variance features:
pmeanfiltering 1 tangram.pan moy.pan
pvariancefiltering 0 255 tangram.pan var.pan
pim2array data.1 moy.pan data1.colc
pim2array data.2 var.pan data2.colc
parray2array data.1 Float data1.colc data1.cold
parray2array data.2 Float data2.colc data2.cold
pcolcatenateitem data1.cold data2.cold data3.cold
parraysnorm data data3.cold data3.cold
pkmeans data attrib 5 100 data3.cold cluster.cold
pproperty 0 tangram.pan
w=`pstatus`
pproperty 1 tangram.pan
h=`pstatus`
parray2im $h $w 0 attrib cluster.cold kmeans.pan
pim2rg kmeans.pan classif1_out.pan
See also
Classification
C++ prototype
Errc PKmeans( const std::string &a_in, const Collection &c_in,
const std::string &a_out, Collection &c_out,
int k, int max );
Author: Alexandre Duret-Lutz