PANDORE Version 6 | GREYC-IMAGE
pknn
Performs k-nearest neighbors classification on a set of objects.
Synopsis
pknn attr_base attr_in attr_out k [col_base|-] [col_in|-] [col_out|-]
Description
pknn classifies a set of objects by assigning each of them
to one of the clusters of an already classified base set.
For each object to classify (the query instance), the classifier computes
the distance to every training sample and keeps the k nearest ones.
The query instance is then assigned the cluster held by the simple majority
of these k nearest neighbors.
The distance between two objects $x_i$ and $x_j$
is the Euclidean distance:
$$D_{ij} = \left[ \sum_{d=1}^{n} (x_{id} - x_{jd})^2 \right]^{1/2}$$
where $x_{id}$ is feature d of object i, $x_{jd}$ is feature d of
object j, and n is the number of features.
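The procedure described above (Euclidean distance to every training sample, then a simple majority vote among the k nearest) can be sketched in a few lines of C++. This is only an illustration, not the Pandore implementation: the names Sample, euclideanDistance and knnClassify are hypothetical, and the tie-breaking rule (smallest cluster number wins) is an assumption, since the page does not say how ties are resolved.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical container for one labeled training object (not a Pandore type).
struct Sample {
    std::vector<double> features;
    unsigned long label;
};

// Euclidean distance between two feature vectors of equal length.
double euclideanDistance(const std::vector<double>& a,
                         const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Returns the majority label among the k training samples nearest to query.
// Assumption: ties are broken in favor of the smallest label.
unsigned long knnClassify(const std::vector<Sample>& base,
                          const std::vector<double>& query,
                          std::size_t k) {
    assert(!base.empty() && k > 0);
    k = std::min(k, base.size());  // cannot use more neighbors than samples

    // Pair every training sample's distance to the query with its label.
    std::vector<std::pair<double, unsigned long>> dist;
    dist.reserve(base.size());
    for (const Sample& s : base) {
        dist.emplace_back(euclideanDistance(s.features, query), s.label);
    }

    // Only the k smallest distances need to be ordered.
    std::partial_sort(dist.begin(),
                      dist.begin() + static_cast<std::ptrdiff_t>(k),
                      dist.end());

    // Count the votes of the k nearest neighbors.
    std::map<unsigned long, std::size_t> votes;
    for (std::size_t i = 0; i < k; ++i) {
        ++votes[dist[i].second];
    }

    // Labels are visited in increasing order, so a strict comparison
    // keeps the smallest label when several labels share the top count.
    unsigned long best = votes.begin()->first;
    for (const auto& v : votes) {
        if (v.second > votes[best]) best = v.first;
    }
    return best;
}
```

Sorting all distances would work as well; std::partial_sort is used here only because the vote needs nothing beyond the k smallest ones.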
Parameters
- attr_base is the base name of the feature vectors
of the already classified objects.
The vectors
are named attr_base.1, attr_base.2, ..., attr_base.n in the collection col_base.
The item j of the array attr_base.i contains the (i)th
feature of the (j+1)th object. They are Double arrays.
If the array attr_base.C is present, it contains
the cluster number of each object. Otherwise the ith object
falls into the cluster i.
- attr_in is the base name of the feature vector of the objects
to be classified. The vectors
are named attr_in.1, attr_in.2, ..., attr_in.n in the input collection.
The item j of the array attr_in.i contains the (i)th
feature of the (j+1)th object. They are Double arrays.
- attr_out is the name of the output array. It is an array of
unsigned longs where attr_out[i] specifies the number of the cluster
to which the object i is assigned.
- k is the number of nearest neighbors used for the majority vote.
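The feature arrays are stored one feature per array: the array with suffix .i holds the (i)th feature of every object, and item j of that array belongs to the (j+1)th object. The sketch below shows how such per-feature arrays could be regrouped into one feature vector per object before computing distances. It is a minimal illustration with plain std::vector; the function name columnsToObjects is hypothetical and no Pandore type is used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Regroups per-feature arrays into per-object feature vectors.
// columns[i][j] is feature i+1 of object j+1, which mirrors the layout
// of the arrays attr.1 ... attr.n described in the parameter list.
std::vector<std::vector<double>> columnsToObjects(
        const std::vector<std::vector<double>>& columns) {
    if (columns.empty()) return {};

    const std::size_t count = columns.front().size();  // number of objects
    std::vector<std::vector<double>> objects(
        count, std::vector<double>(columns.size()));

    for (std::size_t i = 0; i < columns.size(); ++i) {
        // Every feature array must hold one value per object.
        assert(columns[i].size() == count);
        for (std::size_t j = 0; j < count; ++j) {
            objects[j][i] = columns[i][j];
        }
    }
    return objects;
}
```

With two feature arrays of three items each, the result is three objects of two features each.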
Inputs
- col_base: a collection which contains the feature vector of the classified objects.
- col_in: a collection which contains the feature vector of the objects to be classified.
Outputs
- col_out: a collection which contains the output array attr_out.
Result
Returns SUCCESS or FAILURE.
Examples
Classifies the beans in the jellybeans.pan image from a sample image
of each bean stored in the directory 'base' (Unix version).
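# Learning: build base.pan from the sample images, one class per file
classes=1
for i in base/*.pan
do
    # Convert the sample image into a collection of arrays named ind.*
    pim2array ind "$i" /tmp/tmp1
    # Get the size of the arrays through pstatus
    parraysize ind.1 /tmp/tmp1
    size=$(pstatus)
    # Create the class array ind.C filled with the current class number
    pcreatearray ind.C Ushort $size $classes | pcolcatenateitem /tmp/tmp1 - i-01.pan
    # Append to base.pan, or create it on the first pass
    if [ -f base.pan ]
    then pcolcatenateitem i-01.pan base.pan base.pan
    else cp i-01.pan base.pan
    fi
    classes=$((classes + 1))
done

# Classification: classify jellybeans.pan against base.pan with k = 10
pproperty 0 jellybeans.pan
ncol=$(pstatus)
pproperty 1 jellybeans.pan
nrow=$(pstatus)
pim2array ind jellybeans.pan | pknn ind ind ind 10 base.pan - | parray2im $ncol $nrow 0 ind | pim2rg - out.pan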
See also
Classification
C++ prototype
Errc PKnn(const std::string &a_base, const Collection &c_base,
const std::string &a_in, const Collection &c_in,
const std::string &a_out, Collection &c_out,
int K);
French version
Classification using the k nearest neighbors.
Author: Alexandre Duret-Lutz