There are four new algorithms presented in this article. The first one is a speed optimization that uses integers as array indexes (I'll explain that later). The second one is about concurrent programming. (Now that multi-core CPUs are becoming mainstream, you can cut the time needed to find a large number of combinations by half or more.) The third and fourth ones deal with finding combinations with repetitions.
A total of five combinations are generated:
0
1
2
3
4
Find combinations of 2 from a sequence of 5
The first example is pretty easy, isn't it? Now, find combinations of 2 from a sequence of 5 numbers {0,1,2,3,4}. Please note that the bracketed numbers are the combination elements.
[0] [1]  2   3   4
[0]  1  [2]  3   4
[0]  1   2  [3]  4
[0]  1   2   3  [4]
Oops, you can't shift the last element any more. Now, shift the first element and bring the last element back to its right side. This is shown below.
 0  [1] [2]  3   4
For the next two combinations, you continue to shift the last element.
 0  [1]  2  [3]  4
 0  [1]  2   3  [4]
Again, you cannot shift the last element any more, so shift the first element and bring the last element to its side.
 0   1  [2] [3]  4
Shift the last element as usual.
 0   1  [2]  3  [4]
Shift the first element and bring the last to its side.
 0   1   2  [3] [4]
Oops, you can shift neither the first nor last element any more. This is the end of all the combinations generated.
A total of 10 combinations are generated:
01
02
03
04
12
13
14
23
24
34
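The count of 10 agrees with the binomial coefficient, n! / (r! (n - r)!) with n = 5 and r = 2. Here is a minimal sketch for checking such counts. The function name binomial is my own helper for illustration and is not part of the article's source code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the article's source): number of ways to
// choose r elements out of n without repetition, n! / (r! * (n - r)!).
std::uint64_t binomial(unsigned n, unsigned r)
{
    assert(r <= n);
    if (r > n - r)
        r = n - r;                      // C(n, r) == C(n, n - r)
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= r; ++i)
    {
        // Before this step result holds C(n - r + i - 1, i - 1); the
        // product below is always divisible by i, so the division is exact.
        result = result * (n - r + i) / i;
    }
    return result;
}
```

For example, binomial(5, 1) returns 5 and binomial(5, 2) returns 10, matching the two lists above.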
Find combinations of 3 from a sequence of 5
Now, go on to find combinations of 3 from a sequence of 5 numbers {0,1,2,3,4}. Please note that the bracketed numbers are the combination elements. This time I will not explain each step. Observe the pattern.
[0] [1] [2]  3   4
[0] [1]  2  [3]  4
[0] [1]  2   3  [4]
[0]  1  [2] [3]  4
[0]  1  [2]  3  [4]
[0]  1   2  [3] [4]
 0  [1] [2] [3]  4
 0  [1] [2]  3  [4]
 0  [1]  2  [3] [4]
 0   1  [2] [3] [4]
A total of 10 combinations are generated:
012
013
014
023
024
034
123
124
134
234
Steps to Find Combinations
Of course, I could go on to demonstrate finding combinations of 4 elements out of 5 and so on, but I won't. I trust that you can extrapolate the technique for finding different combinations from different sequences.
Let me define the steps for the shifting pattern you used to find all combinations.
Initially, all the element(s) must be at the leftmost positions of the array.
Shift the last element to the right until it cannot shift any more; then shift the next rightmost element (if there is one) and bring the last element back to its right side. This is the new combination.
Continue to shift the last element to the right.
It comes to a point where the last two elements cannot shift any longer. Then, shift the third one (from the right) and bring the last two to its side consecutively.
The previous point rephrased for all arbitrary cases ('i' here stands for a number): It comes to a point where the last i elements cannot shift any longer; then, shift the (i+1)th element (from the right) and bring the last i elements to its side consecutively.
Continue until none of the elements can shift any longer. All the combinations have then been generated.
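The steps above can be sketched on a vector of index positions. This is only a minimal sketch under my own assumptions and not the article's next_combination(); the names shift_to_next and all_combinations are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the article's next_combination() or CIdxComb.
// idx holds r strictly increasing positions into a sequence of n elements.
// Advances idx to the next combination; returns false when none is left.
bool shift_to_next(std::vector<std::size_t>& idx, std::size_t n)
{
    const std::size_t r = idx.size();
    assert(r <= n);
    // Scan from the right for the first element that can still shift.
    // The element in slot i can move no further right than n - r + i.
    std::size_t i = r;
    while (i > 0 && idx[i - 1] == n - r + (i - 1))
        --i;
    if (i == 0)
        return false;                // nothing can shift: all done
    --i;
    ++idx[i];                        // shift this element one step right
    for (std::size_t j = i + 1; j < r; ++j)
        idx[j] = idx[j - 1] + 1;     // bring the later elements to its side
    return true;
}

// Collects every combination of r positions out of n, in generation order.
std::vector<std::vector<std::size_t>> all_combinations(std::size_t n,
                                                       std::size_t r)
{
    assert(r <= n);
    std::vector<std::size_t> idx(r);
    for (std::size_t k = 0; k < r; ++k)
        idx[k] = k;                  // start with all elements leftmost
    std::vector<std::vector<std::size_t>> out;
    do {
        out.push_back(idx);
    } while (shift_to_next(idx, n));
    return out;
}
```

Calling all_combinations(5, 2) yields the ten pairs in the same order as the list shown earlier, from 01 to 34.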
Optimised Version: Index Combination
Is there any way to optimise the technique used in the next_combination() function? The answer is yes, but you have to give up the genericness of next_combination().
next_combination() compares the element that it is going to shift with the other elements in the n sequence to find its current position in the n sequence. See if you can get rid of this finding operation.
Take a look at this combination (Take 2 out of 5):
01234
x x
If the two elements in the r sequence are integers that store their index positions in the n sequence, the finding is not needed. With this method, the combination returned is a combination of indexes into the n sequence and is no longer a combination of objects. The time needed to locate an element's position is the same for a sequence of any custom objects, O(1), because an integer index is used.
The == operator does not need to be defined for this method to work; next_combination() needs it to be defined unless you use the predicate version. Other requirements remain the same. Anyway, finding combinations of objects is really the same as finding combinations of integers, and integer variables should be faster than class objects because of the object overhead. So, you should use this!
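To illustrate why no == operator is needed, here is a minimal sketch that turns a combination of indexes into the objects themselves. The Fruit type and the pick helper are hypothetical names of my own and are not the CIdxComb interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative element type with no operator== defined.
struct Fruit
{
    std::string name;
};

// Hypothetical helper: maps a combination of indexes to the objects.
template <typename T>
std::vector<T> pick(const std::vector<T>& items,
                    const std::vector<std::size_t>& idx)
{
    std::vector<T> chosen;
    chosen.reserve(idx.size());
    for (std::size_t i : idx)
    {
        assert(i < items.size());
        chosen.push_back(items[i]);  // subscript lookup, nothing is compared
    }
    return chosen;
}
```

Given five Fruit objects and the index combination {1, 3}, pick returns copies of the second and fourth objects without ever comparing two Fruit values.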
This technique is implemented in the CIdxComb class. Here is an example of how to use the CIdxComb class.
Below is a benchmark result of next_combination() and CIdxComb for finding all the combinations of an r sequence of 6 from an n sequence of 45, 10 times, using QueryPerformanceCounter(), on a PC with an Intel P4 2.66 GHz CPU and 1 GB of DDR2 RAM.
There is a very strange occurrence: CIdxComb runs as slow as or slower than next_combination() in some benchmarks in a debug build. I don't know why, but I think this difference is due to how iterators (which next_combination() uses) and array subscript indexes (which CIdxComb uses) are handled in std::vector. The above result is from a release build without any optimization.
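QueryPerformanceCounter() is specific to Windows. If you want to repeat a measurement like this on another platform, one option is std::chrono::steady_clock from the standard library. The following is only a sketch of a timing harness, assuming a C++11 compiler; time_ms is a hypothetical helper and was not used to produce the figures in this article.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs work() `repeats` times and returns the total
// elapsed wall-clock time in milliseconds.
template <typename Work>
double time_ms(Work work, int repeats)
{
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int k = 0; k < repeats; ++k)
        work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Pass the combination loop you want to measure as the work argument and 10 as repeats to mirror the setup described above.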
About the Author
I am currently working as a software developer in a company specialized in 3D building visualization. I am extremely interested in optimization techniques such as CPU SIMD instructions like Intel SSE2, multi-threading techniques on multi-core/SMP processors, and GPGPU languages like Brook+/CAL for ATI GPUs and nVidia's CUDA.
Like many Singaporeans, my hobbies include reading, singing karaoke, watching anime and movies, playing computer games and jogging.
I wish I had more time to write articles for CodeGuru, since I have a few (long overdue) ideas to write about. I always explain the workings behind the code in my articles. I hope you like my articles on CodeGuru!