
Why do optimizing compilers never offer backends in the source language itself? Isn't targeting the source language much easier than targeting assembly? Also, couldn't a source-to-source optimizer perform canonicalizations, as well?

Are there technical limitations for doing so that I am unaware of, or is it just that a lack of demand for such a thing does not justify the effort to implement it?

  • What would be the use of such a thing? The generated code would hardly be intelligible. Commented Sep 10 at 5:23
  • "Isn't targeting the source language much easier than targeting assembly?" — why do you expect that it would be? Commented Sep 10 at 5:30
  • What is the use-case for a source-level optimiser? The generated source code still has to be compiled to a target language eventually, so any optimisations can be performed when that happens. Commented Sep 10 at 11:18
  • What kind of source language / source level are you thinking of? There exist a few JavaScript-to-JavaScript compilers, which can and do perform optimisations and canonicalisation. Commented Sep 10 at 15:00
  • Don Knuth in "Structured Programming With Goto Statements" (1974) writes "For some reason we all (especially me) had a mental block about optimization, namely that we always regarded it as a behind-the-scenes activity, to be done in the machine language, which the programmer isn't supposed to know. This veil was first lifted from my eyes in the Fall of 1973, when I ran across a remark by Hoare [42] that, ideally, a language should be designed so that an optimizing compiler can describe its optimizations in the source language. Of course! Why hadn't I ever thought of it?" Commented Sep 18 at 16:25

6 Answers


A small point that Eldritch Conundrum's and Steve's answers did not mention, which I would like to add:

Many optimizations are target-specific.

There are many things in play here, like cache lines, alignment, or other CPU-specific features. Are you optimizing for an x86 or an ARM processor? Which generation of either? Then there is the question of whether you optimize for speed or for size. Plus, many optimizations happen at the machine-code level and don't translate back into your HLL all that well.

This would mean that your source-level optimizer takes your one source base and turns it into several separate source bases, which you would now have to maintain separately, each spattered with lots of target-specific changes and possibly lots of inline assembly as well, making them less maintainable, as the other two answers pointed out.

Source code primarily serves the needs of the developer. It needs to be readable, maintainable, and understandable. Many optimizations that made sense "in the olden days" have long since become unnecessary, sometimes even counterproductive.

Write code so the next generation of developers can work with it. They will be about as smart as you were when you started. Optimizers, however, will only ever get better at optimizing clearly structured code.

  • A good example of target-specific optimization is an answer I was just updating with some RISC-V testing: Signed saturated add of 64-bit ints? - it doesn't have FLAGS / condition-codes, so even source that uses __builtin_saddll_overflow has to check for overflow "manually". And writing portable ISO C, I don't know an idiom compilers recognize for signed overflow, so you can't get good asm on x86 or ARM, but different strategies are optimal on each, like (sum<a) == (b<0) vs. XORs and checking sign bits. Commented Sep 11 at 3:10
  • Of course a better language like Rust avoids that specific problem entirely by providing a portable i64.checked_add which does a wrapping signed add that produces the sum and a boolean overflow flag. And directly an i64.saturating_add, solving the whole problem and telling the compiler exactly what you want, so it can use target-specific recipes for the whole thing. doc.rust-lang.org/std/primitive.i64.html#method.saturating_add Commented Sep 11 at 3:16
  • If your HLL has a C-like preprocessor, you could abstract the machine-specific stuff behind #if blocks rather than writing separate files — but of course that still has the issue of making the code uglier and less maintainable. Commented Sep 18 at 20:10
  • @dan04: I wasn't referring to things that need to be machine-specific by design. I meant code that would be optimized this way on this machine, and that way on another. Alignment, endianness, instructions supported in hardware / assembly on one platform but not the other. You would lose the abstraction provided by the HLL in the first place and, if you take it to its logical conclusion, end up with completely disjoint trees of assembly source. With all the maintainability that would bring. Commented Sep 19 at 9:17

This is a bit of a frame challenge…

We Do Have High-Level Source-to-Source Optimizers!

We just don't call them that—we call them linters.

For example, ruff is a popular linter for Python. It has multiple lint rules targeting performance, and can automatically rewrite your code to fix many of them. These range from expression-level tweaks like unnecessary-list-cast (PERF101), which removes an unnecessary list() call when the result will just be iterated anyway; to larger transformations like manual-dict-comprehension (PERF403), which rewrites a for-loop as a dictionary comprehension (which is usually faster in Python); and quadratic-list-summation (RUF017), which rewrites a common but inefficient list-flattening expression to a more performant one.
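As a sketch of what these rewrites look like in practice (the rule numbers above are ruff's; the before/after code below is my own illustration, not ruff's literal output):

```python
import functools
import operator

pairs = [("a", 1), ("b", 2), ("c", 3)]
lists = [[1, 2], [3], [4, 5]]

# Before: a manual dict-building loop (the pattern PERF403 flags).
result = {}
for key, value in pairs:
    result[key] = value

# Before: quadratic list flattening (the pattern RUF017 flags) —
# sum() copies the growing accumulator list on every addition.
flat = sum(lists, [])

# After the linter-style rewrites:
result_fixed = {key: value for key, value in pairs}      # dict comprehension
flat_fixed = functools.reduce(operator.iadd, lists, [])  # linear-time flatten

assert result_fixed == result
assert flat_fixed == flat
```

Both rewrites preserve behavior exactly; they only change how the same result is computed.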

  • Yes! Linters with suggestions being the popular form of source-to-source compilers points to the real issue: we need a human to validate and choose the optimizations that are desirable in each context. It's not that source-to-source optimizers don't exist; it's that their UX does not match the usual idea of a compiler. Commented Sep 13 at 10:37

Depends. Do you intend to keep maintaining the generated source code, or not?


If you do, you might run into some problems:

  • Loss of readability

Many optimizations can make source less readable. For example, optimizations on loops, like loop fission, loop fusion, loop inversion, loop interchange, loop reversal, loop unrolling (yes, I'm just going through the list of optimizations article on Wikipedia), etc.

Also, optimizations that need to first desugar language-level constructs can be represented back in source code, but if the result cannot be "re-sugared", it will increase verbosity a lot.

  • Correctness after further changes to the optimized source

Some optimizations are only valid given subtle preconditions in your program that the compiler was able to prove. Representing the result of those optimizations back in source code does not document the fact that these preconditions exist.

If you try to modify the optimized generated code even just a little, by, say, changing the value of a constant, then maybe now the code does something wildly different and unexpected, because the optimized code has been specialized for the previous value of that constant.

The fact that compilation is a one-way process was what guaranteed that all those readability-destroying optimizations were safe to perform. That is no longer true if you want to generate maintainable source code.
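A minimal sketch of both problems at once, using a made-up Python example: the "optimized" version is both harder to read and silently tied to a precondition (N == 4) that nothing in the generated code enforces:

```python
# Original, readable version: sum the first N squares.
N = 4

def sum_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

# What a source-level optimizer might emit: the loop unrolled and
# constant-folded for N == 4. If a maintainer later changes N to 5
# without noticing the specialization, this function is silently wrong.
def sum_squares_optimized():
    return 14  # 0 + 1 + 4 + 9, valid ONLY while N == 4

assert sum_squares(N) == sum_squares_optimized()
```

The assertion holds today, but the precondition it depends on lives only in a comment, which is exactly the documentation problem described above.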


And even if you don't intend to keep maintaining the generated source code, having to preserve source code as a target will severely limit the kinds of optimizations that you can do.

When working on data of any kind, its representation matters a lot, and compilers are no exception. They represent programs in various ways, sometimes with multiple levels of Intermediate Representations (IRs) that are the place of choice for optimizations.

For example, most optimizations that operate on machine code cannot be represented back in source code form. Anything that operates on assembly instructions or registers will just not be available to a source-to-source optimizer.


All that being said, there is a nonzero number of optimizations that you could do:

  • Some transformations that are high-level enough, like common sub-expression elimination, constant folding and constant propagation, and some source-level inlining... (with the big caveat above about readability and the brittleness of correctness w.r.t. change!)

  • And maybe even higher-level transformations that might be specific to your program, which you'll have to hardcode in your optimizing compiler...

    But there are other, better ways of expressing those than writing a source-to-source compiler. Some languages let you write macros, which could play this role instead, while keeping your code more readable.
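For instance, here is a hand-applied sketch of common sub-expression elimination and constant folding (the function names are illustrative, not from any real optimizer):

```python
def price_before(quantity, unit_cost):
    # The sub-expression (quantity * unit_cost) is computed twice,
    # and 0.2 * 1.05 is a compile-time constant.
    return quantity * unit_cost + quantity * unit_cost * 0.2 * 1.05

def price_after(quantity, unit_cost):
    subtotal = quantity * unit_cost    # CSE: compute it once
    return subtotal + subtotal * 0.21  # folding: 0.2 * 1.05 -> 0.21

assert abs(price_before(3, 10.0) - price_after(3, 10.0)) < 1e-9
```

Note that even here the folding of 0.2 * 1.05 to 0.21 is not bit-exact in floating point, which is precisely the kind of subtlety that makes such rewrites risky to apply blindly.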

  • RE: loops - Commented Sep 11 at 3:30
  • There's also the fact that transformed source code could break the "pattern detection" for certain other optimizations. Commented Sep 11 at 20:00

Applying optimisations to a high-level language in order to output the same language is at cross purposes with what HLLs are designed to facilitate.

Firstly, one point of HLLs is to allow programmers to read and write in a manner that is most ergonomic to the programmer.

Purely mechanical optimisations and simplifications are liable to destroy the quality of the source code and limit the extent to which it can be read back or readapted later.

For example, as well as choosing names carefully and laying code out in a particular suggestive way (neither of which has any mechanical bearing on the execution, and both of which are discarded in compilation), I also sometimes write code in a somewhat inefficient or verbose way. I do so because I know there's a good chance it will need to be readapted in a way that relies on what is currently an inefficiency or verbosity, or that particular parts of the code will need to be extracted from their context for bench testing.

In other words, some inefficiencies are present precisely for my own ergonomics, and their justification is known only to me as an expert programmer (and cannot be inferred solely from information available to the compiler). I wouldn't want these "optimised out".

A second point of HLLs is to facilitate optimisation at the compilation stage. For this reason, verbose source code is often no less efficient in execution.

Moreover, the possible optimisations often have no representation in the HLL, because they are not specific in the HLL in the first place - register use is the classic example of something that HLL compilers handle automatically for the programmer, and the manual specification of which has no representation in the HLL.

If an optimising transform were limited to what can be represented in the HLL, there would often be very little room for optimisation, negating the usefulness of making and using such a tool. You want a compiler to be able to analyse across every possible level, and to express the result in the most general way possible (which is usually machine code, occasionally an intermediary code).

Moreover, programmers are often able to optimise HLL code by hand because they have access to implicit information and meaning that is not recorded in the source code (or sometimes is recorded only in a human-readable way). So just because there is a perceived opportunity to optimise HLL source code in its own terms does not necessarily mean the optimisation can be done mechanically by a tool.

  • One interesting use case here is polyfilling transpilers, which are common in JavaScript and exist for Verilog and VHDL. In all cases the commonality is lack of control over the target compilation environment. They usually end up targeting a low-level subset of the HLL that is particularly easy to evaluate. Commented Sep 10 at 13:50
  • @user1937198 you are making me extremely glad that nothing like target triples exists for JavaScript. I'm literally queasy at the thought of having every version of every browser as a separate compile target. Commented Sep 12 at 11:46

Some compilers do offer source to source transforms as a way of optimizing code, but the resulting output is a pure intermediate form, and not something saved to disk for further human editing.

For example, GHC offers rewrite rules, which are source-to-source transforms. The intent behind them is that you write your code as high-level clear code, and GHC knows how to translate this into code that's optimal.

The classic example is list fusion, where rewrite rules are used to turn functions like map (list to list transform), filter (list to list transform) and others into build/foldr form, and then to optimize from there.

This, in turn, lets you write code like sum $ map tax_rate $ map item_value sales_list, which naïvely would generate two intermediate lists for the two maps, and have GHC work out, by following rewrite rules, that this is best implemented as something closer to (it's been a while since I did Haskell seriously, so this may be syntactically wrong) foldr (\item acc -> acc + (tax_rate $ item_value $ item)) 0 sales_list, removing both intermediate lists.
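A rough Python analogue may make the payoff concrete (the item_value/tax_rate functions here are stand-ins I made up to mirror the Haskell above, not anything from GHC):

```python
from functools import reduce

# Hypothetical per-item functions, standing in for the answer's
# item_value and tax_rate.
def item_value(item):
    return item["value"]

def tax_rate(value):
    return value * 1.2

sales_list = [{"value": 100.0}, {"value": 50.0}]

# Naïve pipeline: with list() at each stage (as a strict language would
# evaluate it), every map builds a full intermediate list.
naive = sum(list(map(tax_rate, list(map(item_value, sales_list)))))

# "Fused" form, analogous to the foldr the rewrite rules produce:
# one pass over the data, no intermediate lists.
fused = reduce(lambda acc, item: acc + tax_rate(item_value(item)),
               sales_list, 0.0)

assert abs(naive - fused) < 1e-9
```

The two expressions compute the same total; fusion only changes how much intermediate storage is materialized along the way.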

  • While it is true that GHC's rewrite rules are expressed in source Haskell, they are actually applied to GHC Core, so I'm not sure this really counts. Commented Sep 10 at 14:34
  • There's no particular reason why they couldn't be applied to source Haskell, too - it's just that if you're not going to output source Haskell, you might as well apply them to GHC Core directly, rather than applying them to source and then transforming source to GHC Core. Commented Sep 10 at 15:03
  • GCC's GIMPLE middle-end optimizer can output its tree in C syntax along with the GIMPLE, e.g. godbolt.org/z/G9f6zvnbv (probably not the most interesting example of a function to look at the optimization of, but at least it's short: saturating 64-bit integer add). But it's not fully usable C syntax; there's stuff like _9 = .ADD_OVERFLOW (a_7(D), b_8(D)); so this isn't really source-to-source optimization. Commented Sep 11 at 3:22

Lack of a use case

What do you do with your optimized source code once you've generated it?

Compiler optimizers are usually configured to optimize execution speed, rather than readability or maintainability. (Well, unless you count linters.) So you're probably not going to want to throw away your original source code to replace it with the optimized version (which might, for example, unroll a nice readable for loop into a bunch of repetitive statements).

You could pass the optimized source code to the compiler to make it into an executable. But if you can do that, you can just pass the original source code to the same compiler, and that way you only need to call the compiler once instead of twice (to generate optimized source code and then compile it).

There are other target languages available

Your question asks “Isn't targeting the source language much easier than targeting assembly?” But this is a false dichotomy. Compilers (and interpreters) can convert source code into other things, including:

  • A “tokenized” format that represents keywords, number/string literals, variable references, etc. in a pre-parsed binary form that's easier to deal with than raw source code text.
  • An abstract syntax tree.
  • A bytecode or other intermediate language for a virtual machine. For example, UCSD Pascal P-code, Java bytecode, Microsoft .NET's Common Intermediate Language, Python bytecode, or LLVM Intermediate Representation.
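CPython happens to expose several of these representations directly from the standard library, which makes the progression easy to see (a small illustrative sketch):

```python
import ast
import dis
import io
import tokenize

source = "x = 1 + 2\n"

# Token stream: the "pre-parsed" form of the raw text.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Abstract syntax tree.
tree = ast.parse(source)

# Bytecode for CPython's virtual machine.
code = compile(source, "<example>", "exec")
opnames = [instr.opname for instr in dis.get_instructions(code)]

assert any(tok.string == "=" for tok in tokens)
assert isinstance(tree.body[0], ast.Assign)
assert "STORE_NAME" in opnames
```

As a bonus, the bytecode shows an optimization already applied: CPython constant-folds 1 + 2, so the code object loads the constant 3 rather than performing an addition at runtime.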

Perhaps multiple intermediate representations are used in different passes in the conversion process. Regardless of the exact form it takes, the point of an intermediate representation is to be easier for the compiler to work with (or optimize) than either the original source code or the final machine code. This is especially the case if...

The source language can't directly represent the optimizations

Suppose, for example, that you've taken Dijkstra to heart and banished goto from your language. That's fine, but at some point you have to convert "structured" programming constructs into the jumps and branches that the actual machine language uses. And what if you do this at the optimization stage, converting a while loop or a tail call into a goto? You can't represent that directly in your source language, because it doesn't have goto. You'd have to "pessimize" the code into something that can be represented.
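CPython's bytecode makes this lowering visible: the structured while loop below compiles down to conditional and backward jumps, i.e. gotos, even though Python itself has no goto (illustrative sketch):

```python
import dis

def countdown(n):
    while n > 0:
        n -= 1
    return n

# The structured while loop disappears at the bytecode level: what
# remains is a conditional test plus jump instructions.
opnames = {instr.opname for instr in dis.get_instructions(countdown)}
assert any("JUMP" in name for name in opnames)
```

An optimizer working at this level can rearrange those jumps freely, but the result has no direct equivalent in goto-free Python source.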

And even if your high-level language does have goto, it probably doesn't have built-in syntax for distinguishing register/memory variables, branch prediction hints, or vectorization. These things matter for optimizing machine code, but their low-level and machine-specific nature makes them less useful to have in a high-level language.

