From an algorithmic point of view, time complexity usually just counts the number of memory accesses. By that measure, quicksort is good on average, O(n log n), but it is not the best algorithm.
First, because its worst-case complexity is O(n²), and this case does occur in real life (typically for sorted, reverse-sorted, or constant inputs with a naive pivot choice), while other algorithms are O(n log n) even in the worst case.
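To make the worst case concrete, here is a minimal sketch (my own illustration, not from any particular library): a naive quicksort with a last-element pivot, instrumented to count comparisons. On an already-sorted input it degrades to n(n-1)/2 comparisons, while a shuffled input stays close to n log n.

```python
import random

def quicksort(a, lo=0, hi=None):
    """Naive quicksort (Lomuto partition, last-element pivot).
    Sorts a in place and returns the number of comparisons made."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return 0
    pivot = a[hi]
    i = lo
    comps = 0
    for j in range(lo, hi):
        comps += 1
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return comps + quicksort(a, lo, i - 1) + quicksort(a, i + 1, hi)

n = 200
sorted_comps = quicksort(list(range(n)))   # already sorted: worst case

random.seed(0)
shuffled = list(range(n))
random.shuffle(shuffled)
random_comps = quicksort(shuffled)

print(sorted_comps)   # n*(n-1)//2 = 19900: quadratic behavior
print(random_comps)   # much smaller, roughly n log n
```

Real implementations avoid this by picking a better pivot (median-of-three, or a random element), but the quadratic case is why "quicksort is O(n log n)" always deserves the qualifier "on average".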
What makes quicksort a good algorithm is that on real computers, not all memory accesses take the same time.
Main memory (SDRAM) has a latency that seems very long from the CPU's point of view (typically hundreds of CPU cycles).
Fortunately, large portions of the memory can be requested and put in a smaller, quicker memory: the cache (CPUs often have multiple layers of cache but I'll refer to all of these as "the cache").
For this reason a computer runs faster when it works on data that is already (or still) in the cache. When it fetches a new memory region, the CPU has to wait for main memory to answer, which causes a notable performance hit, called a cache miss.
So an important factor for sorting large arrays quickly is the memory access pattern. Here quicksort does really well and generates very few cache misses.
Why? Because it scans the array sequentially (and repeats on smaller and smaller segments), so most of its memory accesses take place in the cache.
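You can see that sequential pattern by tracing one partition pass. This is a sketch of my own: a Lomuto partition that records the index of every element it examines, showing a single left-to-right scan, exactly what hardware prefetchers love.

```python
def partition_trace(a, lo, hi):
    """Lomuto partition of a[lo..hi] (pivot = a[hi]), sorting in place.
    Returns the list of indices read by the scan, in order."""
    trace = []
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        trace.append(j)          # each iteration touches index j: lo, lo+1, ...
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return trace

trace = partition_trace([5, 2, 8, 1, 9, 3], 0, 5)
print(trace)  # [0, 1, 2, 3, 4] — one strictly sequential scan
```

Every cache line is used completely before the scan moves past it, so a partition over k elements costs roughly k / (elements per cache line) misses, the best you can hope for.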
Heap sort, on the other hand, maintains a binary-tree-like structure (the heap, stored implicitly in the array) and has a much more "random" memory access pattern.
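To illustrate (again a sketch of my own, not a full heapsort): a sift-down in an array-backed binary heap jumps from index i to its child at 2i+1, so the stride between consecutive accesses doubles at every step. For a large heap, almost every step lands on a cold cache line.

```python
def sift_down_trace(n, start=0):
    """Indices touched while following left children during a sift-down
    in a binary heap of n elements stored in an array."""
    trace, i = [], start
    while 2 * i + 1 < n:
        trace.append(i)
        i = 2 * i + 1            # jump to the left child: stride doubles
    trace.append(i)
    return trace

print(sift_down_trace(1_000_000))
# [0, 1, 3, 7, 15, ..., 524287] — after a few steps each access is
# hundreds of thousands of elements away from the previous one
```

Compare this with the quicksort partition trace above: same O(log n) work per element asymptotically, but a radically worse locality profile.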
Even merge sort, which is quite cache-efficient, needs additional memory accesses because it cannot (easily) work in place: either it is recursive and keeps temporary data on the stack along with an auxiliary array, or it works on linked lists (which means pointers, hence indirect memory accesses, hence cache misses).
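The merge step is where that extra traffic comes from. A minimal sketch: merging two sorted runs requires an O(n) output buffer, so every element is read from one array and written to another, roughly doubling the memory traffic per pass compared to an in-place partition.

```python
def merge(left, right):
    """Merge two sorted lists into a new sorted list.
    The output buffer is the extra O(n) memory merge sort needs."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])         # one of these two tails is empty
    out.extend(right[j:])
    return out

print(merge([1, 4, 6], [2, 3, 5]))  # [1, 2, 3, 4, 5, 6]
```

The reads themselves are sequential (which is why merge sort is still cache-friendly), but quicksort gets the same locality without ever leaving the original array.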