Even a very good 8K video display provides only 33.18 megapixels, while the "native resolution" of the human eye is commonly cited as 576 Mp. Not all of this information is processed by the human brain, but both the brain and the eye itself are very good at selecting what is important and what is not. The very cells that capture light in the retina are also neural cells, and they start processing the signal immediately. The eye also turns instantly to place the object of interest at the center of the retina, where the resolution is much higher.
And the data transfer rate of such a monitor is on the order of 80 gigabits per second. The stream can be compressed to much less, but compression and transmission add latency, and that latency will soon be enough to make a difference.
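As a rough check on the figures above, here is a minimal back-of-the-envelope sketch. The 60 Hz refresh rate and 24 bits per pixel are my assumptions for illustration and are not stated above; a higher refresh rate or bit depth scales the result proportionally.

```python
# Back-of-the-envelope figures for an 8K display versus the eye.
# ASSUMPTIONS (not from the text): 60 Hz refresh, 24 bits per pixel, no blanking overhead.
WIDTH, HEIGHT = 7680, 4320   # 8K UHD resolution
EYE_MEGAPIXELS = 576         # commonly cited figure quoted above
REFRESH_HZ = 60              # assumed
BITS_PER_PIXEL = 24          # assumed (8 bits per RGB channel)


def megapixels(width, height):
    """Pixel count in millions."""
    return width * height / 1e6


def raw_rate_gbit_per_s(width, height, refresh_hz, bits_per_pixel):
    """Uncompressed pixel stream in gigabits per second."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


display_mp = megapixels(WIDTH, HEIGHT)
eye_to_display = EYE_MEGAPIXELS / display_mp
rate_gbit = raw_rate_gbit_per_s(WIDTH, HEIGHT, REFRESH_HZ, BITS_PER_PIXEL)
rate_gbyte = rate_gbit / 8

print(f"8K display: {display_mp:.2f} Mp")                   # 33.18 Mp
print(f"Eye / display ratio: {eye_to_display:.1f}x")        # 17.4x
print(f"Raw stream: {rate_gbit:.1f} Gbit/s ({rate_gbyte:.2f} GB/s)")  # 47.8 Gbit/s (5.97 GB/s)
```

Under these assumptions the raw stream is already tens of gigabits per second, and the display still shows about 17 times fewer pixels than the figure cited for the eye.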
This difference will likely put a remote pilot in an inferior position, constantly complaining that the drone's camera never points in the right direction in time. There may be cases where it matters less, but as long as the outcome of combat depends on the ability to see and aim at the enemy first, the remote pilot does not stand much of a chance.
One of my tasks was to observe experimental robots through a remote camera, with the goal of stopping them in time if they did something wrong. The task was easy to manage when watching the robots directly with a finger on the button (a reaction time of about 200 ms). With the camera never properly showing the center of the problem, and with all the added latency, it was mission impossible.