PANDORE Version 6
GREYC-IMAGE
pgaussclassification
Performs Gaussian classification of a set of objects.
Synopsis
pgaussclassification attr_base attr_in attr_out [col_base|-] [col_in|-] [col_out|-]
Description
pgaussclassification is a partitioning method that assigns each of n objects
to one of k clusters.
It assumes that each class follows a Gaussian distribution:
for each object x to be classified, the principle is to find the class that is
most likely to contain x.
In practice, pgaussclassification finds the class i that
minimizes:
f(x,i) = ln(det A(i)) + (x - m(i))^T . A(i)^-1 . (x - m(i)) - ln(P(i)^2)
- where x is the feature vector of the object to be classified;
- A(i) is the covariance matrix of class i;
- m(i) is the mean vector of class i;
- P(i) is the a priori probability of class i.
The values derived from A(i) and m(i) can be computed with the operator parraycovarmat.
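The link between "maximum probability" and the function f can be made explicit. The following is a sketch of the usual derivation, assuming the standard multivariate normal density with p features and Bayes' rule; the symbols m_i, A_i and P_i stand for m(i), A(i) and P(i) above.

```latex
% Posterior of class i, up to a factor that does not depend on i:
P(i \mid x) \;\propto\; P_i \,
  \frac{1}{(2\pi)^{p/2} \sqrt{\det A_i}}
  \exp\!\Big(-\tfrac{1}{2}\,(x - m_i)^{T} A_i^{-1} (x - m_i)\Big)

% Taking -2 ln of the right-hand side:
-2 \ln\!\big(P_i \, p(x \mid i)\big)
  = \ln(\det A_i) + (x - m_i)^{T} A_i^{-1} (x - m_i) - \ln(P_i^{2}) + p \ln(2\pi)
  = f(x, i) + p \ln(2\pi)
```

Since p ln(2 pi) is the same for every class, maximizing the posterior probability over i is equivalent to minimizing f(x,i).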
Parameters
- attr_base is the base name of the Gaussian parameters.
For n clusters and p features:
- attr_base.mean is an array of n*p values which contains
at index [i*p+j] the mean of the (j+1)th feature of
the (i+1)th cluster.
- attr_base.det is an array of n reals which contains at index
[i-1] the determinant det(A(i)).
- attr_base.inv is an array of n*p*p values which
contains at index [k*p*p + i*p + j] the cell [i,j] of the inverse
covariance matrix A^-1 of the (k+1)th cluster.
(These 3 attributes can be computed with the operator
parraycovarmat.)
- attr_base.pap is an array of n reals which contains
the a priori probability of each cluster. (This array can be omitted;
in this case the clusters are assumed equiprobable.)
- attr_in is the base name of the feature vectors of the objects
to be classified. The vectors
are named attr_in.1, attr_in.2, ..., attr_in.p in the input collection.
Item j of the array attr_in.i contains the ith
feature of the (j+1)th object. They are Double arrays.
- attr_out is the name of the output array. It is an array of
unsigned longs where attr_out[i] is the index of the cluster to
which object i is assigned.
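To illustrate how the parameter arrays fit together, here is a minimal standalone C++ sketch of the decision rule. It is not the Pandore implementation: the GaussModel struct and the functions discriminant and classify are hypothetical names, the arrays are assumed to be cluster-major (mean[k*p + j], inv[k*p*p + i*p + j]), and the returned cluster index is assumed to be 0-based, which may differ from the numbering the operator writes to attr_out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical container mirroring the attr_base arrays (not a Pandore type).
struct GaussModel {
    std::size_t n;             // number of clusters
    std::size_t p;             // number of features
    std::vector<double> mean;  // n*p values, assumed layout mean[k*p + j]
    std::vector<double> det;   // n values, det[k] = det(A(k))
    std::vector<double> inv;   // n*p*p values, assumed layout inv[k*p*p + i*p + j]
    std::vector<double> pap;   // n priors, or empty for equiprobable clusters
};

// f(x,k) = ln(det A(k)) + (x - m(k))^T A(k)^-1 (x - m(k)) - ln(P(k)^2)
double discriminant(const GaussModel& g, const std::vector<double>& x,
                    std::size_t k) {
    assert(x.size() == g.p && k < g.n);
    // A missing pap array means every cluster has prior 1/n.
    const double prior =
        g.pap.empty() ? 1.0 / static_cast<double>(g.n) : g.pap[k];
    double quad = 0.0;  // Mahalanobis term (x - m)^T A^-1 (x - m)
    for (std::size_t i = 0; i < g.p; ++i) {
        const double di = x[i] - g.mean[k * g.p + i];
        for (std::size_t j = 0; j < g.p; ++j) {
            const double dj = x[j] - g.mean[k * g.p + j];
            quad += di * g.inv[k * g.p * g.p + i * g.p + j] * dj;
        }
    }
    // ln(P^2) is written as 2 ln(P).
    return std::log(g.det[k]) + quad - 2.0 * std::log(prior);
}

// Returns the 0-based index of the cluster that minimizes f(x,k).
std::size_t classify(const GaussModel& g, const std::vector<double>& x) {
    std::size_t best = 0;
    double best_f = std::numeric_limits<double>::infinity();
    for (std::size_t k = 0; k < g.n; ++k) {
        const double f = discriminant(g, x, k);
        if (f < best_f) {
            best_f = f;
            best = k;
        }
    }
    return best;
}
```

With two clusters of identity covariance centered at (0,0) and (10,10), a point near the origin is assigned to the first cluster, and supplying unequal priors in pap shifts the decision for points that lie between the two means.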
Inputs
- col_base: a collection which contains the learned parameters.
- col_in: a collection which contains the objects to be classified.
Outputs
- col_out: a collection which contains classified objects.
Result
Returns SUCCESS or FAILURE.
Examples
Classifies the beans in the jellybeans.pan image from a sample
of each bean stored in the directory 'base' (Unix version).
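# Learning: build one set of Gaussian parameters per sample image
classes=1
for i in base/*.pan
do
   # Convert the sample image into feature arrays ind.1, ind.2, ind.3
   pim2array ind "$i" /tmp/tmp1
   parray2array ind.1 Float /tmp/tmp1 | parray2array ind.2 Float | parray2array ind.3 Float - a.pan
   # Compute the mean, determinant and inverse covariance matrix of the sample
   parraycovarmat ind ind a.pan i-01.pan
   # Append the parameters of this class to base.pan
   if [ -f base.pan ]
   then pcolcatenateitem i-01.pan base.pan base.pan
   else cp i-01.pan base.pan
   fi
   classes=`expr $classes + 1`
done
rm /tmp/tmp1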
# Classification
# ($ncol and $nrow must be set beforehand to the dimensions of jellybeans.pan)
pim2array ind jellybeans.pan a.pan
parray2array ind.1 Float a.pan | parray2array ind.2 Float | parray2array ind.3 Float - b.pan
pgaussclassification ind ind ind base.pan b.pan | parray2im $ncol $nrow 0 ind | pim2rg - out.pan
See also
Classification
C++ prototype
Errc PGaussClassification(const std::string &a_base, const Collection &c_base,
const std::string &a_in, const Collection &c_in,
const std::string &a_out, Collection &c_out);
Author: Alexandre Duret-Lutz