Get Rid of Unused Arguments
Some of the arguments passed to this function are never used. It appears that the foreground array mF doesn't get read or modified, so remove it from the list of arguments; as it stands it's just confusing.
Naming
The names of your function's arguments are extremely cryptic. mO is the output buffer? If so, name it outputBuffer or outputImage or whatever's appropriate.
Whenever you see something like this in your code:
// mB - Background, mF - Foreground
you know you've failed to name your variables properly. They should just be background and foreground. (Also background and foreground of what? Was this originally a compositing operation or something?)
Also don't do this:
scalingFactor = _mm_set1_ps(scalingFctr);
How is anyone reading the code supposed to tell the difference between scalingFactor and scalingFctr? Nothing about either name says what it is used for or how it differs from the other. (And remember, you may have to read this code after months of not looking at it!)
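To make the naming advice concrete, here is a minimal scalar sketch of what a self-describing signature might look like. I'm assuming the function converts a float image to 8 bits per channel; the function name, the parameter names and the clamp-and-round behaviour are all my guesses for illustration, not your actual code. Note that the unused foreground argument is gone.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical signature: the names and the float-to-8-bit behaviour are
// assumptions made for illustration, not the original function.
void convertFloatImageToUint8(const float* inputImage,
                              std::uint8_t* outputImage,
                              std::size_t numPixels,
                              float scalingFactor)
{
    for (std::size_t pixelIndex = 0; pixelIndex < numPixels; ++pixelIndex) {
        // Scale the input value, then clamp it to the range an 8-bit channel can hold.
        const float scaledValue = inputImage[pixelIndex] * scalingFactor;
        const float clampedValue = std::max(0.0f, std::min(scaledValue, 255.0f));
        // Round to the nearest integer before narrowing to 8 bits.
        outputImage[pixelIndex] = static_cast<std::uint8_t>(std::lround(clampedValue));
    }
}
```

With names like these the `// mB - Background, mF - Foreground` style of comment isn't needed at all. In a SIMD version, the broadcast copy of the scalar could be called something like scalingFactorVec (again, a made-up name), so the reader can tell the scalar and the vector apart at a glance.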
Performance
I would try doing this operation on the GPU. It was built to do this exact sort of thing. In OpenGL you could simply upload the float image as a texture, then draw to a texture-backed FBO (Frame Buffer Object) where the backing texture is an 8-bit-per-channel image. You'd simply draw a textured quad with the desired texture applied.
The reason I recommend this approach is that while using SIMD instructions will improve performance by 2-16 times, and using multiple threads will further improve performance, you're unlikely to be able to beat a GPU running hundreds to thousands of cores.