I am trying to create an invert index structure in C++. An invert index, in this case, refers to image retrieval system, where each invert index refers to a numbers of "Patch ID". (simply called "ID" in the code) These IDs are basically for describing many kind of pixel gradient patches from all images in the database.
The invert index is a map container and will be very sparse and each index of the invert index will contain a sub-map. A sub-map will have an "Image ID" and frequency counting integer. For example, if we have 2 images with "Image ID" 10 and 20, and both of these images have "Patch ID" 999, appear once in "Image ID" 10 and appear twice in "Image ID" 20, we will have the following:
invertID[999] == sub-mapwith <10,1> and <20,2>
I've written the code below for this, but somehow I am not quite satisfied.
One instance, I have to put back the sub-map into the invert ID's index as you can see below.
Another thing is I am not quite sure whether I am using the correct type of container for the correct job or not. I have about 5000 images (so that's 5000 "Image ID") and I see that "Patch ID" could vary from 1 to numbers as high as 40 billion.
In this case, performance-wise and design-wise, should I be using map or should I go for some other container? Suggestions are welcome.
Note: this will be run in a server that doesn't have a C++11 compiler.
unsigned long long int ID, index = 0;
map<unsigned long long int, map<int, int>> invertID;
...
...
//for the sake of convenient, two for loops below are shown as pseudocode.
for each image{ //loop with imageID, incrementing variable imageID in each iteration
//for each gradient patches in the image
for each gradient patch of this image{ //loop with index, incrementing variable index in each iteration.
//processID encapsulates the procedure to generate an ID for each interesting point in the image. We basically read values in the respective file and do some calculation.
ID = processID(index);
//if this ID exists in InvertID,
map<unsigned long long int, map<int,int>>::iterator mainMapIt = invertID.find(ID);
if (mainMapIt != invertID.end()){
//if this ImageID key exists in InvertID sub-map,
map<int,int> M = mainMapIt->second;
map<int,int>::iterator subMapIt = M.find(imageID);
if (subMapIt != M.end()){
//the index already exists, so increment the number (frequency) of this ImageID key
++M[imageID];
}
else{
//the index does not exist, so add ImageID key with value 1 into the InvertID
M[imageID] = 1;
}
//get this sub-Map back to the invert ID's index.
invertID[ID] = M;
}
else{
//the ID does not exist in the InvertId, yet, so
//create the first empty map with the key as image ID with value 1 and put it into the invertID.
map<int,int> M;
M[imageID] = 1;
invertID[ID] = M;
}
}
}