> Compiler is g++ 4.2.
Yikes. That compiler is ancient. I’ve met competitive programmers younger than that compiler. If you’re planning to do any serious C++ programming, you really should look into upgrading your toolchain, if possible.
#define BIG unsigned long long int
This is absolutely the wrong way to create a new type.
First, there is a trend in C++ to avoid the preprocessor completely. So, no `#define`s, no `#if`s, and so on. It is ALMOST completely unnecessary in the most recent version (C++20), but even in C++11 it is almost completely avoidable.

Rather than a `#define`, you should use a type alias:
using BIG = unsigned long long int;
But there’s one more problem. By convention, `ALL_CAPS` names are reserved for preprocessor macros… which basically means you should never use them for anything else. That means you should do:
using big = unsigned long long int;
Or, even better, choose a more descriptive name, like:
using big_uint = unsigned long long int;
But even better still would be to create your own namespace, and put everything—this type alias and all the Fibonacci functions—in there:
namespace your_ns {

using big_t = unsigned long long int;

int fib(int n) {
    // ... etc. ...
}

} // namespace your_ns
Now I know your focus is on the memoized version of the Fibonacci algorithm, but the non-memoized one is really problematic:
int fib(int n) {
if (n < 2) {
return n;
}
return fib(n-1) + fib(n-2);
}
Imagine what happens if you call `fib(5)`. Internally, that results in two calls, to `fib(4)` and `fib(3)`:
fib(5)
↓
fib(4) + fib(3)
Except… `fib(4)` also degrades to `fib(3)` and `fib(2)`, while `fib(3)` degrades to `fib(2)` and `fib(1)`:
fib(5)
↓
fib(4) + fib(3)*
↓
fib(3)* + fib(2) + fib(2) + fib(1)
As you can see, `fib(3)` is called twice. And it gets even worse, because `fib(3)` degrades to `fib(2)` and `fib(1)`:
fib(5)
↓
fib(4) + fib(3)
↓
fib(3) + fib(2)* + fib(2)* + fib(1)
↓
fib(2)* + fib(1) + fib(1) + fib(0) + fib(1) + fib(0) + 1
Now you can see `fib(2)` is called three times. Each one of those times becomes `fib(1)` and `fib(0)`.
fib(5)
↓
fib(4) + fib(3)
↓
fib(3) + fib(2) + fib(2) + fib(1)
↓
fib(2) + fib(1) + fib(1) + fib(0) + fib(1) + fib(0) + 1
↓
fib(1) + fib(0) + 1 + 1 + 1 + 1 + 1 + 1
So the net result of a single call to `fib(5)` is:

- 1 call to `fib(4)`
- 2 calls to `fib(3)`
- 3 calls to `fib(2)`
- 5 calls to `fib(1)`
- 3 calls to `fib(0)`

And that’s just `fib(5)`. Imagine how many repeated calls you’ll get with `fib(50)`. (In fact, the total number of calls grows exponentially: roughly 1.6× more calls for each increase of `n` by one.)
(This is obviously not such a big problem with the memoized version… but it’s still a problem! Every uncached call still recurses, and the recursion still goes `n` levels deep, which risks a stack overflow for large `n`.)
What you really want to look into is the tail-recursive Fibonacci algorithm. I’m not going to work it out for you, but I’ll give you the interface:
auto fib_impl(int n, big_t a, big_t b)
{
// ...
}
auto fib(int n)
{
// are you sure you don't want to check for negative numbers?
if (n < 2)
return big_t(n); // note: this line won't compile if you used BIG; this is why the preprocessor sucks
return fib_impl(n, 0, 1);
}
Now you don’t say what version of C++ you’re working with, but if it’s one of the more recent versions, then you can make all of this `constexpr`. That means that, when possible, the function may be computed at compile time, meaning it will have (effectively) zero run-time cost.
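To give you the flavor, here’s a minimal sketch (I’ve named it `fib_naive` just so it doesn’t collide with the functions above; the single-return form keeps it valid all the way back to C++11):

constexpr big_t fib_naive(int n)
{
    // the compiler can evaluate this whole call tree at compile time
    return n < 2 ? big_t(n) : fib_naive(n - 1) + fib_naive(n - 2);
}

// zero run-time cost: the 55 is computed during compilation
static_assert(fib_naive(10) == 55, "fib(10) should be 55");

Now, on to your memoized version: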
BIG fib_with_memo(int n) {
if (umap.find(n) == umap.end()) { // if F_n not in cache, calculate, cache, and return it
BIG val = fib_with_memo(n-1) + fib_with_memo(n-2);
umap[n] = val;
return val;
}
return umap[n]; // otherwise return the cached value
}
Now, since `umap` has no purpose outside of this function, it makes more sense to hide it within the function as a static variable. If you don’t know what function static variables are: they’re basically global variables, but only visible within the function.
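A quick, made-up illustration (`next_id` isn’t part of your code; it’s just to show the mechanics):

int next_id()
{
    static int counter = 0; // created and initialized once, on the first call
    return ++counter;       // ...and it keeps its value between calls
}

// next_id() returns 1, then 2, then 3, ...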
While you’re at it, you might as well replace:
umap[0] = 0;
umap[1] = 1;
with:
umap = std::unordered_map<int, big_t>{{0, 0}, {1, 1}};
Why? Because this will be MUCH more efficient. The first way means that each line first searches the map for the key, then creates the new pair. The second way doesn’t bother with the search, but rather just creates the two pairs directly.
So now your `fib_with_memo()` could look like this:
auto fib_with_memo(int n) {
// umap is constructed only once, the first time the function is called,
// and initialized with {0, 0} and {1, 1}.
static auto umap = std::unordered_map<int, big_t>{{0, 0}, {1, 1}};
// note everything else is exactly the same (except I replaced the BIG
// with auto)
if (umap.find(n) == umap.end()) { // if F_n not in cache, calculate, cache, and return it
auto val = fib_with_memo(n-1) + fib_with_memo(n-2);
umap[n] = val;
return val;
}
return umap[n]; // otherwise return the cached value
}
Another small fix is that once you’re in that `if` block, you know for a fact that `n` isn’t in the map. So rather than doing `umap[n]`, you can emplace the new pair directly:
if (umap.find(n) == umap.end()) {
auto val = fib_with_memo(n-1) + fib_with_memo(n-2);
umap.emplace(n, val);
return val;
}
Depending on your implementation, that might be more efficient.
Another possibly major optimization comes from the fact that you use `find()`… and then just discard the result. Why not use it? (The following uses C++17’s if-with-initializer syntax.)
if (auto p = umap.find(n); p == umap.end())
{
// we didn't find n
auto val = fib_with_memo(n-1) + fib_with_memo(n-2);
umap.emplace(n, val);
return val;
}
else
{
// we found n
return p->second;
}
So this is about where we’re at right now:
auto fib_with_memo(int n) {
    static auto umap = std::unordered_map<int, big_t>{{0, 0}, {1, 1}};

    if (auto p = umap.find(n); p == umap.end())
    {
        // we didn't find n
        auto val = fib_with_memo(n-1) + fib_with_memo(n-2);
        umap.emplace(n, val);
        return val;
    }
    else
    {
        // we found n
        return p->second;
    }
}
Now we get to the real problem: the same issue as with the non-memoized version, where you’re calling the same function over and over and over. The fix is the same, too: refactor:
auto fib_with_memo_impl(int n, big_t a, big_t b) {
    static auto umap = std::unordered_map<int, big_t>{{0, 0}, {1, 1}};

    if (auto p = umap.find(n); p == umap.end())
    {
        // we didn't find n
        auto val = big_t{}; // placeholder: the actual algorithm to calculate fib goes here
        umap.emplace(n, val);
        return val;
    }
    else
    {
        // we found n
        return p->second;
    }
}
auto fib_with_memo(int n) {
// are you sure you don't want to check for n < 0?
// trivial solutions, don't even bother diving into the calculation
if (n < 2)
return big_t(n);
return fib_with_memo_impl(n, 0, 1);
}
It’s up to you to implement the actual Fibonacci calculation there (you have `a` and `b` to work with). But otherwise, that should be a drop-in replacement for your current design, except orders of magnitude faster.
An alternative design
As you said:

> I used a `std::unordered_map` (hash table) because its lookup/insertion times are \$O(1)\$ (compared to the `std::map`, which has \$O(\log n)\$ for both).

Unfortunately, this thinking is misguided.
It turns out that, due to the way modern processors work—particularly with caching—big-O notation can be WILDLY misleading. I recall that the inventor of C++ himself was floored when someone showed him that a `std::list`, which has \$O(1)\$ insertion in the middle, is slower than a `std::vector`, which has \$O(n)\$ insertion in the middle, even for huge numbers of elements.
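If you’re curious, this is easy to demonstrate for yourself. Here’s a rough sketch of that kind of experiment (unscientific, and the numbers depend entirely on your machine; the key detail is that just finding the middle of a list is a cache-hostile pointer chase):

#include <chrono>
#include <iostream>
#include <iterator>
#include <list>
#include <vector>

// inserts `count` elements, one at a time, into the middle of the container
template <typename Container>
auto time_middle_inserts(int count)
{
    auto c = Container{};

    auto const start = std::chrono::steady_clock::now();
    for (auto i = 0; i < count; ++i)
    {
        auto mid = c.begin();
        std::advance(mid, c.size() / 2); // instant for a vector, a pointer-chase for a list
        c.insert(mid, i);
    }
    auto const stop = std::chrono::steady_clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

int main()
{
    // bump the count up and the gap only grows
    std::cout << "vector: " << time_middle_inserts<std::vector<int>>(20'000) << " ms\n";
    std::cout << "list:   " << time_middle_inserts<std::list<int>>(20'000) << " ms\n";
}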
Forget all that theory crap, and instead take a step back and consider your actual problem. Imagine that you’ve called `fib_with_memo(10)`. Okay, now what would you expect to see in the memo-map?
It would be this, right?:
0 → 0
1 → 1
2 → 1
3 → 2
4 → 3
5 → 5
6 → 8
7 → 13
8 → 21
9 → 34
10 → 55
Now look at the keys. It’s just 0 to 10, right? In order, with no holes. Just like an index.
So… why do you need a map? You’re just “mapping” an index to a value. That’s just what a basic array, or a vector, does.
Let’s imagine that we’re using a vector for the memoizing, with the “key” just being the index. That means after memoizing `fib_with_memo(10)`, the vector would contain:
[index 0 ] → 0
[index 1 ] → 1
[index 2 ] → 1
[index 3 ] → 2
[index 4 ] → 3
[index 5 ] → 5
[index 6 ] → 8
[index 7 ] → 13
[index 8 ] → 21
[index 9 ] → 34
[index 10] → 55
So, after `fib_with_memo(10)`, the vector contains 11 elements, and the element at index `n`—or, `vec[n]`—is the nth Fibonacci number.
Simple, right?
So how do you tell whether a number is already memoized? Again, simple: just check the vector’s size. If we call `fib_with_memo(13)`, the vector must have at least 14 elements in it for `vec[13]` to be valid. So:
if (vec.size() > n)
return vec[n];
else
// calculate the nth fibonacci number
In fact, the implementation becomes trivial, too. That’s because if the function is called with an unmemoized `n`, all you need to do is resize the vector to `n + 1`, and fill in all the numbers between the old size and the new size. Something like:
auto fib_with_memo(int n) {
    // are you sure you don't want to check for n < 0?

    // trivial solutions, don't even bother diving into the calculation
    if (n < 2)
        return big_t(n);

    static auto vec = std::vector<big_t>{0, 1}; // start with fib(0) and fib(1)
    // note that if you really want, you could start with more than just
    // 0 and 1 - after testing your real-world use case, you could determine
    // that you will need, say, up to fib(8), so you could precompute that
    // much and put it in the vector initializer list above
    //
    // if you make the non-memoized fib function constexpr, then you could
    // even be *really* clever, and do something like:
    //     static auto vec = std::vector{fib(0), fib(1), fib(2), fib(3), ...
    // for as many precomputed values as you like.
    //
    // and then you could be *REALLY* clever (if you're using a more modern
    // version of c++), and do something like:
    //     static auto vec = [](auto initial_size) {
    //         auto v = std::vector<big_t>(std::max(2, initial_size + 1));
    //         v[0] = 0;
    //         v[1] = 1;
    //
    //         for (auto p = std::next(v.begin(), 2); p != v.end(); ++p)
    //             *p = *(p - 1) + *(p - 2);
    //
    //         return v;
    //     }(4); // 4 to calculate fib(0) through fib(4), but you can use any number
    // but that's getting *really* fancy.

    auto const index = std::size_t(n); // n is non-negative here, so this is safe

    // memoized, so use cached value
    if (vec.size() > index)
        return vec[index];

    // not memoized, so calculate all the values between what's been cached,
    // and what's desired
    auto const old_size = vec.size();
    vec.resize(index + 1);

    // fill in everything from vec[old_size] to vec[n] inclusive,
    // where vec[i] = vec[i - 1] + vec[i - 2]
    for (auto i = old_size; i < vec.size(); ++i)
        vec[i] = vec[i - 1] + vec[i - 2];

    return vec[index];
}
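As a quick sanity check (assuming you’ve put everything in `your_ns`, as suggested at the top):

#include <iostream>

int main()
{
    std::cout << your_ns::fib_with_memo(10) << '\n'; // prints 55
    std::cout << your_ns::fib_with_memo(50) << '\n'; // prints 12586269025
}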
Obviously I haven’t profiled this design, but I’d bet it is hundreds, if not thousands of times faster than a similar design using a hash map.
Questions
Difference between a macro and a type alias
When you do:
#define type long double
you’re talking to the preprocessor here… not the compiler.

The preprocessor is stupid (as in, unintelligent): it doesn’t understand C++ (often the same preprocessor is used for C, C++, and sometimes even other languages, like Objective-C or Fortran). It just deals with meaningless tokens. In this case, it just does a blind textual substitution. Everywhere `type` appears in the code, it gets literally replaced with `long double`—literally as if you had manually done a copy-paste—with no regard as to whether that makes sense.
To see where that causes problems, consider the following code:
struct thing
{
std::string name;
std::string type;
};
type value =
type(0);
Everywhere `type` appears gets replaced with `long double`, so…:
struct thing
{
std::string name;
std::string long double; // compile error
};
long double value =
long double(0); // compile error
Note that it didn’t just replace `type` when it was actually naming a type—which can still cause syntax errors, as it does for the second error above—it also replaced `type` when it was the name of a variable in the class. That’s why the preprocessor is so dangerous. It’s dumb. It’s indiscriminate. It blindly replaces everything, even when it makes no sense.
Now, when you do:
using type = long double;
you’re talking to the compiler here, not the preprocessor, and that makes all the difference.

This creates a type alias that the compiler understands. The compiler understands C++. The compiler doesn’t just blindly replace text: it knows when `type` is actually naming a type (and not, for example, a variable name), and it even knows which type it’s naming (because `namespace_1::type` may not be the same as `namespace_2::type`), and it does the replacement intelligently.
So when the compiler sees the original code:
struct thing
{
std::string name;
std::string type;
};
type value =
type(0);
it will not replace the `type` in class `thing`, because that’s not a type, it’s a variable name. And it will replace the `type` in the last two lines… but not simply by literally replacing the text (which would break the cast on the last line). Instead, it will do the replacement intelligently. `value` will be a `long double`, and the last line is casting a `0` (an `int`) to a `long double`… everything just works.
In summary, the preprocessor is dumb. It just does blind text replacement, with no regard for what that text actually means… which often results in breaking code. By contrast, the compiler actually understands C++, so it knows when something is actually a type and when it’s not—and even which type it is—so it can do type substitutions intelligently, and not break code.
Refactored code
The standard on CodeReview is not to re-review code in the same question, but rather to post a new review request with the new code. So I can’t re-review your refactored code here. BUT! What I can do is give you suggestions to help you in your refactoring, and in general.
The number one most important thing you can do when coding, not just in this case, but always, is WRITE TESTS. This is the absolute most important skill you could ever develop as a programmer. If I were hiring, and I had two candidates, and one wrote tests and the other didn’t… I wouldn’t even look at the second. I wouldn’t even give them a shot, regardless of how “good” a coder they might be. A coder that doesn’t write tests would be worse than useless to me… yes, worse than useless, because they’d actually be a net maintenance burden. I’d literally rather hire nobody than hire someone who doesn’t write tests.
Even if you’re just doing hobby programming, you should get in the habit of writing tests. I swear it will change your life. It makes everything so much easier. And you no longer have to wonder “does my code work?” You’ll just know. And if someone points out a bug you missed… no problem, just add a new test case that catches the bug, then fix it, and now you’re even more sure your code works.
Testing is so important, that I even recommend writing the tests before you write the first line of “real” code.
The first thing you should do is look into a proper testing system. I use Boost.Test but… it’s extremely heavyweight. (I mean, it’s awesome—by far the most powerful testing system—but, yeah, it’s not trivial to set up.) You might like Catch2. GoogleTest is another option but… meh.
The easiest way to set up Catch2 (other than using your system package manager, if you’re on Linux) is to download the two files mentioned at the start of the tutorial—`catch_amalgamated.hpp` and `catch_amalgamated.cpp`—and just put them in the same directory as your project. In that directory you should have (at least) `fibonacci.hpp`, `fibonacci.cpp`, `fibonacci.test.cpp`, `catch_amalgamated.hpp`, and `catch_amalgamated.cpp`:
- `fibonacci.hpp`: your header
- `fibonacci.cpp`: your implementation
- `fibonacci.test.cpp`: your tests
- `catch_amalgamated.hpp`: the Catch2 header
- `catch_amalgamated.cpp`: the Catch2 code
and when you compile, you need to make sure to also compile the Catch2 code:
# only need to do this one time, which is good, because it will be slow
$ g++ --std=c++17 --pedantic -Wall -Wextra -c catch_amalgamated.cpp
# should do these every time
$ g++ --std=c++17 --pedantic -Wall -Wextra -c fibonacci.test.cpp
$ g++ --std=c++17 --pedantic -Wall -Wextra -c fibonacci.cpp
$ g++ --std=c++17 --pedantic -Wall -Wextra -o fibonacci.test fibonacci.test.o fibonacci.o catch_amalgamated.o
$ ./fibonacci.test
Or, better yet, put all those commands in a shell script or makefile. Also, you could use `--std=c++11` or `--std=c++14` or even `--std=c++20`, depending on what version of C++ you want to use.
Here’s a simple build script you could use:
#!/bin/sh
# Stop immediately if something fails.
set -e
# Only recompile Catch2 if we have to.
if [ ! -f catch_amalgamated.o ]
then
printf '%s\n' 'g++ --std=c++17 --pedantic -Wall -Wextra -c catch_amalgamated.cpp'
g++ --std=c++17 --pedantic -Wall -Wextra -c catch_amalgamated.cpp
fi
# Compile everything else.
printf '%s\n' 'g++ --std=c++17 --pedantic -Wall -Wextra -c fibonacci.test.cpp'
g++ --std=c++17 --pedantic -Wall -Wextra -c fibonacci.test.cpp
printf '%s\n' 'g++ --std=c++17 --pedantic -Wall -Wextra -c fibonacci.cpp'
g++ --std=c++17 --pedantic -Wall -Wextra -c fibonacci.cpp
# Link.
printf '%s\n' 'g++ --std=c++17 --pedantic -Wall -Wextra -o fibonacci.test fibonacci.test.o fibonacci.o catch_amalgamated.o'
g++ --std=c++17 --pedantic -Wall -Wextra -o fibonacci.test fibonacci.test.o fibonacci.o catch_amalgamated.o
Put that in `build.sh`, make it executable, and you can just do:
$ ./build.sh
$ ./fibonacci.test
That will speed up your workflow, and anything that speeds up your workflow is a good thing. (Even better than a build script would be a proper makefile, but that’s too complicated for here.)
The test file doesn’t need to be complex:
// fibonacci.test.cpp
#define CATCH_CONFIG_MAIN
#include "catch_amalgamated.hpp"
#include <random>
#include "fibonacci.hpp"
// You should always test important edge cases:
TEST_CASE("Fibonacci(0) memoized is 0")
{
CHECK(your_ns::fib_with_memo(0) == 0);
}
TEST_CASE("Fibonacci(1) memoized is 1")
{
CHECK(your_ns::fib_with_memo(1) == 1);
}
// Doesn't hurt to test a few random cases:
TEST_CASE("Fibonacci(10) memoized is 55")
{
CHECK(your_ns::fib_with_memo(10) == 55);
}
TEST_CASE("Fibonacci(20) memoized is 6765")
{
CHECK(your_ns::fib_with_memo(20) == 6765);
}
// Test the *property* of Fibonacci numbers:
TEST_CASE("Fibonacci(n) memoized is Fibonacci(n - 1) + Fibonacci(n - 2)")
{
constexpr auto lower_limit = 5;
constexpr auto upper_limit = 50; // or maybe as high as 90-ish, if you like
constexpr auto number_of_trials = 3;
auto rng = std::mt19937{std::random_device{}()};
auto dist = std::uniform_int_distribution<int>{lower_limit, upper_limit};
for (auto trial = 1; trial <= number_of_trials; ++trial)
{
// random number between lower and upper limit, inclusive
auto const n = dist(rng);
DYNAMIC_SECTION("n = " << n)
{
auto const fib_n = your_ns::fib_with_memo(n);
auto const fib_n_minus_1 = your_ns::fib_with_memo(n - 1);
auto const fib_n_minus_2 = your_ns::fib_with_memo(n - 2);
INFO("fib(n) = " << fib_n);
INFO("fib(n - 1) = " << fib_n_minus_1);
INFO("fib(n - 2) = " << fib_n_minus_2);
CHECK(fib_n == (fib_n_minus_1 + fib_n_minus_2));
}
}
}
Test cases should be as simple as possible. Indeed, the last case is pretty complex for a test case, in my opinion, but most of it is just generating random values for `n` and recording the intermediate results so you can see exactly where things went wrong.
At first this shouldn’t compile, because `fib_with_memo()` shouldn’t exist yet (assuming you’re writing the tests first, which you normally should). So you just write some placeholders:
// fibonacci.hpp
#ifndef INCLUDE_GUARD_fibonacci
#define INCLUDE_GUARD_fibonacci
namespace your_ns {
using big_t = unsigned long long int;
big_t fib_with_memo(int);
} // namespace your_ns
#endif // include guard
// fibonacci.cpp
#include "fibonacci.hpp"
namespace your_ns {
big_t fib_with_memo(int)
{
return {};
}
} // namespace your_ns
Now it should compile and run… and generate a bunch of test case failures. But that’s good! A failing test case means that at least something is being checked, and it probably means that a bug was found.
I got 2 passes and 3 failures. The 2 passes are because `fib_with_memo(0)` returns `0`… which by fluke happens to be correct, and because `fib_with_memo(anything)` returns `0`, which means `fib_with_memo(anything) == fib_with_memo(anything - 1) + fib_with_memo(anything - 2)` is always true. This happens sometimes as you’re developing: broken code happens to return the “right” answer… or at least an answer that “works”. Other tests should catch that problem, though.
So now you start to develop your Fibonacci algorithm. Let’s say as a first step, you do:
big_t fib_with_memo(int n)
{
if (n < 2)
return n;
return {};
}
Compile and run… now 3 tests pass, and 2 fail. Progress!
This is how you do it. Take tiny steps, and check as you go. It may sound tedious, but once you get in the habit, it becomes automatic.
So the next thing I did was just copy-paste your refactored code into `fib_with_memo()` (I also needed to `#include <vector>`, and I needed to rename `big` to `big_t`, because I hadn’t noticed I’d got the name wrong), and…
ALL TESTS PASSED!
Congratulations! 👍 Check it in!
So your refactored code works… but… can it be improved? Well, that’s a question for another code review. Catch2 also supports benchmarking, so you can even do performance checks in your testing code, and as you make changes, you can see if things speed up or not.
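For example, a benchmark case can be as small as this (a sketch; with the Catch2 v3 amalgamated files, the `BENCHMARK` macro is available out of the box):

TEST_CASE("Fibonacci benchmarks")
{
    BENCHMARK("fib_with_memo(30)")
    {
        // returning the value keeps the compiler from optimizing the call away
        return your_ns::fib_with_memo(30);
    };
}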
But the point is, if you adopt this style of development—testing, testing, testing, testing—you will write better code, quicker, and when it comes time to refactor, you can do so with confidence because if you “break” anything, the tests will catch it immediately. It’s such a nice way of coding that once you get in the habit, you can’t quit; you feel naked and unsteady when you’re trying to code without tests.
Some other tips:
- I suggested putting the memo map (or vector) in the function as a `static` variable, and that’s definitely what you should do for the final code. However, while developing, you could pull it out and make it a global, so you can inspect what’s in the map/vector directly, and make sure it’s working properly. It’s okay to break stuff while you’re working on it, as long as you remember to put things back in order when developing/testing is done.
- If you do pull the memo map/vector out and make it a global (for development only), then you can observe its capacity to make sure it’s not wasting space. You can also use a tracing allocator to track allocations, and see if you’re losing efficiency there.
A simple tracing allocator is:
#include <cstddef>     // std::size_t
#include <iostream>
#include <memory>      // std::allocator
#include <type_traits> // std::true_type

template <typename T>
class tracing_allocator
{
public:
    using value_type = T;

    // not needed in C++17
    using propagate_on_container_move_assignment = std::true_type;

    auto allocate(std::size_t n)
    {
        // note: n is the number of objects requested, not the number of bytes
        std::cout << "allocating space for " << n << " objects\n";
        return std::allocator<T>{}.allocate(n);
    }

    auto deallocate(T* p, std::size_t n)
    {
        std::cout << "deallocating address " << static_cast<void*>(p) << ", which held " << n << " objects\n";
        return std::allocator<T>{}.deallocate(p, n);
    }
};

template <typename T1, typename T2>
constexpr auto operator==(tracing_allocator<T1> const&, tracing_allocator<T2> const&) noexcept
{
    return true;
}

// not needed in C++20
template <typename T1, typename T2>
constexpr auto operator!=(tracing_allocator<T1> const&, tracing_allocator<T2> const&) noexcept
{
    return false;
}
And you’d use it by declaring your vector as:
static std::vector<big_t, tracing_allocator<big_t>> mem = {0, 1};
And now you will get trace statements printed to standard output for every allocation (and deallocation). This will help you observe how your memo map/vector is growing, and whether there are unnecessary/overlarge allocations happening.
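For example, if the trace shows the vector reallocating several times on its way to a large `n`, you could grab all the space in a single allocation up front (the `128` here is a made-up number; size it from your measured real-world usage):

static auto vec = [] {
    auto v = std::vector<big_t, tracing_allocator<big_t>>{};
    v.reserve(128); // one allocation up front, instead of several as it grows
    v.push_back(0); // fib(0)
    v.push_back(1); // fib(1)
    return v;
}();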