The reader should be able to tell from context, even if you don't add any extra sentences.
That said, you could add a comma after "Yeti":
"We started interviewing a Yeti with a microphone." = ambiguous
"We started interviewing a Yeti, with a microphone." = excludes the picture on the left
(You could also rearrange it: "We, with a microphone, started interviewing a Yeti.")
Or you could change "a microphone" to "our microphone" (which clarifies who owns the microphone):
"We started interviewing a Yeti with a microphone." = ambiguous
"We started interviewing a Yeti with our microphone." = still ambiguous, but more likely the picture on the right
If you really need to be clear about who's holding the microphone, maybe try something like:
"We handed the microphone to the Yeti and started interviewing him." = the picture on the left
"We held the microphone up to a Yeti and started interviewing him." = the picture on the right
If you wanted to convey the sense of the third picture, maybe:
"We started interviewing a Yeti, each party using separate microphones."