This got bumped back up to the top of the page, so I’ll do some necromancy. Always, always, always check for buffer overflows! One slipped through this code review.
One problem I don’t see anyone else mentioning, and which remains in your revised version, is in the loop condition,
while (*dst && size > 0 && size--)
and in the edited version:
while (*dst && size > 0)
In both versions, you dereference *dst before checking whether you have reached the end of the array. When dst holds no terminator within its first size bytes, the program reads the element one past the end of the buffer, which is Undefined Behavior. As an example of what might go wrong: if the buffer ends at a page boundary, the read one past the end could touch the first byte of an unmapped page of memory and cause a segmentation fault.
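A minimal sketch of the fix is to test the bound first and rely on short-circuit evaluation, so that *dst is only read while there is still room. The helper name bounded_len is hypothetical and not from your code; I am assuming size is the total size of the dst buffer.

```c
#include <stddef.h>

/* Hypothetical helper: length of the string stored in dst, looking at
   no more than size bytes. Because n < size is tested first, dst[n]
   is never read once n has reached the end of the buffer. */
static size_t bounded_len(const char *dst, size_t size)
{
    size_t n = 0;
    while (n < size && dst[n] != '\0')
        n++;
    return n;
}
```

If this returns size, the buffer contained no terminator, and the caller knows that without ever having read past the end.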
As for the algorithm: on a modern CPU, copying a known number of bytes is generally faster than checking each character of a string and branching. I suggest you find the lengths of both src and dst (use strnlen_s() if you are allowed to, to avoid overrunning a possibly-unterminated input string, and otherwise reimplement it), calculate how many bytes of src to copy and where to copy them with a bit of arithmetic, and use memcpy() (or memmove() if you really, truly need to allow the strings to overlap). That is both faster and safer.
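Here is a rough sketch of that approach, assuming the function has strlcat()-like semantics (size is the total size of dst, and the return value is the length of the string it tried to build). The names my_strnlen and my_strlcat are hypothetical, and my_strnlen merely stands in for strnlen_s(). I also assume src is properly terminated, since no bound for it is passed in.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strnlen_s(): length of s, examining at
   most max bytes. */
static size_t my_strnlen(const char *s, size_t max)
{
    const char *end = memchr(s, '\0', max);
    return end ? (size_t)(end - s) : max;
}

/* Sketch of a strlcat()-style append built on memcpy().
   Assumes src is terminated and does not overlap dst. */
static size_t my_strlcat(char *dst, const char *src, size_t size)
{
    size_t dst_len = my_strnlen(dst, size);
    size_t src_len = strlen(src);

    /* No terminator inside dst: there is no room to append anything. */
    if (dst_len == size)
        return size + src_len;

    /* Space left after the existing string, minus one byte for '\0'. */
    size_t room = size - dst_len - 1;
    size_t n = src_len < room ? src_len : room;

    memcpy(dst + dst_len, src, n);
    dst[dst_len + n] = '\0';
    return dst_len + src_len;
}
```

A return value greater than or equal to size tells the caller that the result was truncated. Swap memcpy() for memmove() only if overlapping arguments really have to be supported.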