Tutorial: OpenCV 1.1 K-Means Clustering in High-Dimensional Spaces



Question:

I am trying to write a bag-of-features image recognition system. One step in the algorithm is to take a large number of small image patches (say 7x7 or 11x11 pixels) and cluster them into groups that look similar. I get my patches from an image, convert them to grayscale floating-point patches, and then try to get cvKMeans2 to cluster them. I think I am having problems formatting the input data so that cvKMeans2 returns coherent results. I have used k-means for 2D and 3D clustering before, but 49D clustering seems to be a different beast.

I keep getting garbage values for the returned clusters vector, so obviously this is a garbage in / garbage out type problem. Additionally the algorithm runs way faster than I think it should for such a huge data set.

In the code below, the straight memcpy is only my latest attempt at getting the input data into the correct format. I spent a while using the built-in OpenCV functions, but this is difficult when your base type is CV_32FC(49).

Can OpenCV 1.1's KMeans algorithm support this sort of high dimensional analysis?

Does someone know the correct method of copying from images to the K-Means input matrix?

Can someone point me to a free, Non-GPL KMeans algorithm I can use instead?

This isn't the best code as I am just trying to get things to work right now:

    std::vector<int> DoKMeans(std::vector<IplImage *>& chunks){
        // the size of one image patch, CELL_SIZE = 7
        int chunk_size = CELL_SIZE*CELL_SIZE*sizeof(float);
        // create the input data, CV_32FC(49) is 7x7 float object (I think)
        CvMat* data = cvCreateMat(chunks.size(),1,CV_32FC(49) );

        // Create a temporary vector to hold our data
        // we'll copy into the matrix for KMeans
        int rdsize = chunks.size()*CELL_SIZE*CELL_SIZE;
        float * rawdata = new float[rdsize];

        // Go through each image chunk and copy the
        // pixel values into the raw data array.
        vector<IplImage*>::iterator iter;
        int k = 0;
        for( iter = chunks.begin(); iter != chunks.end(); ++iter )
        {
            for( int i =0; i < CELL_SIZE; i++)
            {
                for( int j=0; j < CELL_SIZE; j++)
                {
                    CvScalar val;
                    val = cvGet2D(*iter,i,j);
                    rawdata[k] = (float)val.val[0];
                    k++;
                }
            }
        }

        // Copy the data into the CvMat for KMeans
        // I have tried various methods, but this is just the latest.
        memcpy( data->data.ptr,rawdata,rdsize*sizeof(float));

        // Create the output array
        CvMat* results = cvCreateMat(chunks.size(),1,CV_32SC1);

        // Do KMeans
        int r = cvKMeans2(data, 128,results, cvTermCriteria(CV_TERMCRIT_EPS+CV_TERMCRIT_ITER, 1000, 0.1));

        // Copy the grouping information to our output vector
        vector<int> retVal;
        for( int y = 0; y < chunks.size(); y++ )
        {
            CvScalar cvs = cvGet1D(results, y);
            int g =  (int)cvs.val[0];
            retVal.push_back(g);
        }

        return retVal;
    }

Thanks in advance!


Solution:1

Though I'm not familiar with "bag of features", have you considered using feature points like corner detectors and SIFT?


Solution:2

You might like to check out http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/ for another open source clustering package.
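To make clear what any such package would be doing with 49-dimensional patches, here is a minimal sketch of plain k-means (Lloyd's algorithm) on row-major float data. It is an illustration only, not the cvKMeans2 implementation and not a replacement for a tested library. The name SimpleKMeans, its parameters, and the choice to seed the centroids with the first k samples are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Illustrative k-means over row-major data: sample i occupies
// data[i*dims] .. data[i*dims + dims - 1]. Returns one label per sample.
// Assumption: the first k samples seed the centroids; real code would
// normally seed at random or with k-means++.
std::vector<int> SimpleKMeans(const std::vector<float>& data,
                              std::size_t dims, std::size_t k,
                              int max_iters) {
    assert(dims > 0 && data.size() % dims == 0);
    const std::size_t n = data.size() / dims;
    assert(k > 0 && k <= n);

    std::vector<float> centroids(data.begin(), data.begin() + k * dims);
    std::vector<int> labels(n, -1);

    for (int iter = 0; iter < max_iters; ++iter) {
        bool changed = false;

        // Assignment step: attach each sample to its nearest centroid
        // (squared Euclidean distance, ties go to the lowest index).
        for (std::size_t i = 0; i < n; ++i) {
            float best = std::numeric_limits<float>::max();
            int best_c = 0;
            for (std::size_t c = 0; c < k; ++c) {
                float dist = 0.0f;
                for (std::size_t d = 0; d < dims; ++d) {
                    const float diff = data[i * dims + d] - centroids[c * dims + d];
                    dist += diff * diff;
                }
                if (dist < best) {
                    best = dist;
                    best_c = static_cast<int>(c);
                }
            }
            if (labels[i] != best_c) {
                labels[i] = best_c;
                changed = true;
            }
        }
        if (!changed) break;  // converged: no sample switched cluster

        // Update step: move each centroid to the mean of its samples.
        std::vector<float> sums(k * dims, 0.0f);
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t c = static_cast<std::size_t>(labels[i]);
            ++counts[c];
            for (std::size_t d = 0; d < dims; ++d) {
                sums[c * dims + d] += data[i * dims + d];
            }
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // empty cluster keeps its old centroid
            for (std::size_t d = 0; d < dims; ++d) {
                centroids[c * dims + d] = sums[c * dims + d] / static_cast<float>(counts[c]);
            }
        }
    }
    return labels;
}
```

For the 7x7 patches in the question, dims would be 49 and k would be 128. Nothing in the algorithm depends on the number of dimensions, only on each sample occupying one contiguous row.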

Using memcpy like this seems suspect because of how the buffer size is computed:

    int rdsize = chunks.size()*CELL_SIZE*CELL_SIZE;

If chunks.size() and CELL_SIZE are large, the product can exceed the largest value an int can hold. In that case rdsize ends up with the wrong value, and both the allocation and the memcpy use the wrong size.
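One way to guard against this is to compute the size in std::size_t and check each multiplication before doing it. The sketch below shows the idea. PatchBufferLength is a hypothetical helper name, not part of OpenCV, and the decision to throw std::overflow_error is just one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: number of floats needed to hold `count` square
// patches of side `cell_size`, with an explicit overflow check on each
// multiplication.
std::size_t PatchBufferLength(std::size_t count, std::size_t cell_size) {
    assert(cell_size > 0);  // a patch must have a nonzero side
    const std::size_t max = std::numeric_limits<std::size_t>::max();

    if (cell_size > max / cell_size) {
        throw std::overflow_error("cell_size * cell_size overflows");
    }
    const std::size_t per_patch = cell_size * cell_size;

    if (count > max / per_patch) {
        throw std::overflow_error("count * per_patch overflows");
    }
    return count * per_patch;
}
```

With the values from the question, PatchBufferLength(chunks.size(), CELL_SIZE) would replace the int computation of rdsize.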

Do you intend to change "chunks" in this function? I'm guessing that you don't, as this is a K-means problem.

If so, pass it by reference to const (generally speaking, this is what you want for read-only parameters). Note that the loop over chunks would then need a const_iterator instead of an iterator.

So instead of:

    std::vector<int> DoKMeans(std::vector<IplImage *>& chunks)

it would be:

    std::vector<int> DoKMeans(const std::vector<IplImage *>& chunks)

Also, in this case it is better to use static_cast than the old C-style casts (for example, static_cast<float>(variable) as opposed to (float)variable).

Also, you may want to delete "rawdata". The array allocated with:

    float * rawdata = new float[rdsize];

can be freed with:

    delete[] rawdata;

Otherwise the function leaks this memory on every call.
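An alternative that avoids the manual delete[] altogether is to let a std::vector<float> own the buffer. The sketch below is only an illustration: Patch is a hypothetical stand-in for a grayscale image patch (a row-major vector of doubles) so that the example compiles without OpenCV, and FlattenPatches is an assumed name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one cell_size x cell_size grayscale patch,
// stored row-major. In the question this role is played by IplImage*.
typedef std::vector<double> Patch;

// Copies every patch into one contiguous buffer, one patch per row:
// patch p occupies result[p*dims] .. result[p*dims + dims - 1].
std::vector<float> FlattenPatches(const std::vector<Patch>& patches,
                                  std::size_t cell_size) {
    const std::size_t dims = cell_size * cell_size;
    std::vector<float> rawdata;
    rawdata.reserve(patches.size() * dims);

    for (std::size_t p = 0; p < patches.size(); ++p) {
        assert(patches[p].size() == dims);  // every patch must have the same size
        for (std::size_t i = 0; i < dims; ++i) {
            rawdata.push_back(static_cast<float>(patches[p][i]));
        }
    }
    return rawdata;  // the vector frees its storage automatically
}
```

If the buffer is not empty, a pointer to its contents is available as &rawdata[0] for a memcpy into the matrix, and nothing has to be freed by hand on any return path.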

