Recursion is the simplest mechanism from a code-cleanliness standpoint. If speed is absolutely of the essence, then perhaps you can use a 2D array of problem parameters. Spawning a daughter process is then simply appending another item to the array. I once did an assembler tridiagonal solver; that was a big deal back in the day. The context needed per instance was 8 words per level, and each subproblem was a third the size of the previous. This "library" was only popular because it beat the heck out of all the other implementations out there. But that's a pretty rare situation in programming, so you needn't succumb to "premature optimization" just because this solution is available. Obviously for a few things recursion is terrible overkill, like the recursion 101 example "compute the factorial". But for most apps, it is a really elegant way to eliminate source code complexity.
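To make the "2D array of problem parameters" idea concrete, here is a minimal sketch in C. It is not the tridiagonal solver, just a toy that sums an array by splitting each range into thirds. The names `sum_by_thirds`, `MAX_TASKS` and `LEAF_SIZE` are made up for illustration, and the fixed capacity is an assumption that holds only because the subproblems shrink by a third each level.

```c
#include <stddef.h>

#define MAX_TASKS 128  /* assumed capacity; ample when ranges shrink by thirds */
#define LEAF_SIZE 4    /* ranges this small are summed directly */

/* Sum a[0..n-1] without recursion. Each row of task[] holds the parameters
   {lo, hi} of one pending subproblem (a half-open range).
   Returns 0 on success, -1 if the task array would overflow. */
int sum_by_thirds(const int *a, size_t n, long *out)
{
    size_t task[MAX_TASKS][2];
    size_t top = 0;
    long total = 0;

    /* Seed the array with the whole problem. */
    task[top][0] = 0;
    task[top][1] = n;
    top++;

    while (top > 0) {
        /* Take the most recently added subproblem. */
        top--;
        size_t lo = task[top][0];
        size_t hi = task[top][1];
        size_t len = hi - lo;

        if (len <= LEAF_SIZE) {
            for (size_t i = lo; i < hi; i++)
                total += a[i];
            continue;
        }

        if (top + 3 > MAX_TASKS)
            return -1;

        /* "Spawning" a subproblem is just appending a row. */
        size_t third = len / 3;
        size_t m1 = lo + third;
        size_t m2 = lo + 2 * third;
        task[top][0] = lo; task[top][1] = m1; top++;
        task[top][0] = m1; task[top][1] = m2; top++;
        task[top][0] = m2; task[top][1] = hi; top++;
    }

    *out = total;
    return 0;
}
```

The recursive version of this would be about half the length, which is rather the point: the explicit array only pays off when the call overhead really matters.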
I have a simple spell-checker I use for an app (where I want to give hints about correcting minor misspellings), where I compute the "distance" between two strings, with deletions and additions allowed. This leads to a potentially large tree structure, and the branches are trimmed as we only care about close matches. With recursion, it's maybe twenty lines of code (I have both Fortran and C versions). I think it would be messy otherwise. Heck, it was much easier to program, debug, and verify than it was to think about that tree!
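For anyone curious what that looks like, here is a rough sketch in C of the general technique. It is not my actual code, and the name `edit_distance` and the `budget` parameter are just illustrative. It counts single-character deletions and additions only, and gives up on a branch as soon as the remaining budget hits zero.

```c
/* Number of single-character deletions/additions needed to turn s into t.
   If more than `budget` edits are needed, returns budget + 1 ("too far").
   Hypothetical sketch; the running time grows as 2^budget, so keep it small. */
int edit_distance(const char *s, const char *t, int budget)
{
    /* Matching leading characters cost nothing, so skip them. */
    while (*s != '\0' && *s == *t) {
        s++;
        t++;
    }

    if (*s == '\0' && *t == '\0')
        return 0;           /* strings agree from here on */
    if (budget == 0)
        return 1;           /* trim this branch: no edits left to spend */

    int best = budget + 1;  /* "too far" until a branch proves otherwise */

    if (*s != '\0') {       /* branch 1: delete the next character of s */
        int d = 1 + edit_distance(s + 1, t, budget - 1);
        if (d < best)
            best = d;
    }
    if (*t != '\0') {       /* branch 2: add the next character of t */
        int d = 1 + edit_distance(s, t + 1, budget - 1);
        if (d < best)
            best = d;
    }
    return best;
}
```

Calling it with a budget of 2, "recurson" against "recursion" comes back as 1 (one addition), while "hello" against "world" comes back as 3, meaning "not a close match". The tree of branches never has to be built or even named; the call stack is the tree.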