Writing your own code in assembly is strongly discouraged now since, in most cases, gcc -O3 does magic. But in the '80s it was believed that compiled C code took 4(?) times or more as long to run as a well-organized assembly equivalent. When and why did coding in C become the accepted first choice for performance? Which compiler first achieved this, and on which architecture?
Are there any high-level language compilers (Ada/COBOL/Fortran/Pascal) outside the C family that generate optimized code outperforming what an average assembly programmer writes?