
Question:
I am trying to write a bag-of-features image recognition system. One step in the algorithm is to take a large number of small image patches (say 7x7 or 11x11 pixels) and try to cluster them into groups that look similar. I extract my patches from an image, convert them into gray-scale floating-point image patches, and then try to get cvKMeans2 to cluster them for me. I think I am having problems formatting the input data such that KMeans2 returns coherent results. I have used KMeans for 2D and 3D clustering before, but 49D clustering seems to be a different beast.
I keep getting garbage values in the returned clusters vector, so this is obviously a garbage in / garbage out type of problem. Additionally, the algorithm runs far faster than I would expect for such a huge data set.
In the code below, the straight memcpy is only my latest attempt at getting the input data into the correct format; I spent a while using the built-in OpenCV functions, but that is difficult when your base type is CV_32FC(49).
Can OpenCV 1.1's KMeans algorithm support this sort of high dimensional analysis?
Does someone know the correct method of copying from images to the K-Means input matrix?
Can someone point me to a free, non-GPL KMeans algorithm I can use instead?
This isn't the best code as I am just trying to get things to work right now:
std::vector<int> DoKMeans(std::vector<IplImage *>& chunks)
{
    // the size of one image patch, CELL_SIZE = 7
    int chunk_size = CELL_SIZE*CELL_SIZE*sizeof(float);

    // create the input data, CV_32FC(49) is 7x7 float object (I think)
    CvMat* data = cvCreateMat(chunks.size(), 1, CV_32FC(49));

    // Create a temporary vector to hold our data
    // we'll copy into the matrix for KMeans
    int rdsize = chunks.size()*CELL_SIZE*CELL_SIZE;
    float * rawdata = new float[rdsize];

    // Go through each image chunk and copy the
    // pixel values into the raw data array.
    vector<IplImage*>::iterator iter;
    int k = 0;
    for( iter = chunks.begin(); iter != chunks.end(); ++iter )
    {
        for( int i = 0; i < CELL_SIZE; i++)
        {
            for( int j = 0; j < CELL_SIZE; j++)
            {
                CvScalar val;
                val = cvGet2D(*iter, i, j);
                rawdata[k] = (float)val.val[0];
                k++;
            }
        }
    }

    // Copy the data into the CvMat for KMeans
    // I have tried various methods, but this is just the latest.
    memcpy( data->data.ptr, rawdata, rdsize*sizeof(float));

    // Create the output array
    CvMat* results = cvCreateMat(chunks.size(), 1, CV_32SC1);

    // Do KMeans
    int r = cvKMeans2(data, 128, results,
                      cvTermCriteria(CV_TERMCRIT_EPS+CV_TERMCRIT_ITER, 1000, 0.1));

    // Copy the grouping information to our output vector
    vector<int> retVal;
    for( int y = 0; y < chunks.size(); y++ )
    {
        CvScalar cvs = cvGet1D(results, y);
        int g = (int)cvs.val[0];
        retVal.push_back(g);
    }

    return retVal;
}
Thanks in advance!
Solution 1:
Though I'm not familiar with "bag of features", have you considered using feature points like corner detectors and SIFT?
Solution 2:
You might like to check out http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/ for another open source clustering package.
Using memcpy like this seems suspect. Consider this line:
int rdsize = chunks.size()*CELL_SIZE*CELL_SIZE;
If CELL_SIZE and chunks.size() are both very large, the product can exceed the largest value an int can hold, and rdsize will silently overflow.
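If the raw buffer and the memcpy are the suspect part, one alternative worth trying is to build the samples matrix as an N x 49 single-channel float matrix (one row per patch) and fill it element by element, which sidesteps the intermediate buffer entirely. This is only a sketch based on the code in the question (I have not run it against OpenCV 1.1); CELL_SIZE and the patch images come from the question, and the helper name BuildKMeansInput is mine:

// Sketch only: assumes <cv.h> and <vector> are included and CELL_SIZE = 7.
// Each patch becomes one row of a chunks.size() x 49 CV_32FC1 matrix.
CvMat* BuildKMeansInput(const std::vector<IplImage*>& chunks)
{
    const int dims = CELL_SIZE * CELL_SIZE; // 49 for 7x7 patches
    CvMat* data = cvCreateMat((int)chunks.size(), dims, CV_32FC1);
    for (size_t row = 0; row < chunks.size(); ++row)
    {
        for (int i = 0; i < CELL_SIZE; ++i)
        {
            for (int j = 0; j < CELL_SIZE; ++j)
            {
                // copy one gray-scale pixel into column i*CELL_SIZE + j of this row
                CvScalar val = cvGet2D(chunks[row], i, j);
                cvmSet(data, (int)row, i * CELL_SIZE + j, val.val[0]);
            }
        }
    }
    return data;
}

cvKMeans2 can then be called on that matrix just as in your code, with a chunks.size() x 1 CV_32SC1 labels matrix, and the rest of DoKMeans stays the same.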
Do you want to modify "chunks" in this function? I'm guessing you don't, since this is a k-means problem.
Try passing it by reference to const here. (Generally speaking, this is what you will want to do anyway.)
So instead of:
std::vector<int> DoKMeans(std::vector<IplImage *>& chunks)
it would be:
std::vector<int> DoKMeans(const std::vector<IplImage *>& chunks)
Also, in this case it is better to use static_cast than the old C-style casts (for example, static_cast<float>(variable) as opposed to (float)variable).
Also, remember to delete "rawdata" once you are done with it:
float * rawdata = new float[rdsize];
can be freed with:
delete[] rawdata;
Otherwise you are leaking that memory on every call.
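If you prefer to keep the raw buffer approach, a std::vector<float> will manage the allocation and clean up after itself, so there is nothing left to delete by hand. A small sketch of how the relevant lines could look (the surrounding code from the question stays the same):

// Sketch: let std::vector own the buffer instead of new[]/delete[].
// Its storage is freed automatically when DoKMeans returns.
std::vector<float> rawdata(rdsize);
// ... fill rawdata[k] exactly as in the original nested loops ...
memcpy(data->data.ptr, &rawdata[0], rawdata.size() * sizeof(float));
// no delete[] needed here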