I am working on a new backend for a programming language using LLVM IR. This language distinguishes between basic values and pointers to nodes on the heap, and uses a copying collector for memory management. In an existing backend for this language, basic values are kept on the system stack, but pointers are kept on a separate stack. This makes finding the root nodes of the active set during garbage collection trivial, as we can just traverse the separate stack without inspecting stack frames or stack maps. The existing backend thus reserves (a) the system stack pointer, (b) a separate stack pointer for the pointer stack, and (c) a heap pointer to quickly allocate new nodes.
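To make the existing setup concrete, here is a rough sketch in C of the idea behind the separate pointer stack. It is not the actual backend code: the `Node` layout, the fixed stack size, and all names (`ptr_stack`, `ptr_sp`, `push_ptr`, `pop_ptr`, `visit_roots`, `count_root`) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout; the real layout is not relevant here. */
typedef struct Node {
    struct Node *left;
    struct Node *right;
    long value;
} Node;

#define PTR_STACK_SIZE 1024

/* Separate stack that holds only heap pointers, i.e. the GC roots. */
static Node *ptr_stack[PTR_STACK_SIZE];
static Node **ptr_sp = ptr_stack; /* assumed to live in a reserved register */

static void push_ptr(Node *n) {
    assert(ptr_sp < ptr_stack + PTR_STACK_SIZE);
    *ptr_sp++ = n;
}

static Node *pop_ptr(void) {
    assert(ptr_sp > ptr_stack);
    return *--ptr_sp;
}

/* Root scan: every slot is a pointer, so no frame maps are needed.
   A copying collector would forward *root inside the visitor. */
static void visit_roots(void (*visit)(Node **root, void *ctx), void *ctx) {
    for (Node **slot = ptr_stack; slot < ptr_sp; ++slot) {
        visit(slot, ctx);
    }
}

/* Example visitor that only counts the roots it is shown. */
static void count_root(Node **root, void *ctx) {
    (void)root;
    ++*(size_t *)ctx;
}
```

The point is that `visit_roots` is a plain linear walk from the base of the pointer stack to `ptr_sp`, which is what I would like to preserve in the LLVM backend.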
Ideally I would like to stay as close to the existing backend as possible, so that, at least to start with, I could reuse the existing garbage collector. This would for example be possible if I could instruct LLVM to keep an extra stack to spill ptr values to. However, I have not been able to find anything that would enable me to do this. As an alternative I have looked at LLVM's GC intrinsics. From what I understand, I could use LLVM safepoints, but this would require me to unwind the stack in a more complicated and slower manner than the current backend does. I see that there are benefits as well (e.g., having two stacks and a heap makes it difficult to decide where they should be allocated; keeping pointers in stack frames makes debugging easier), but at this point they are less important. Is there no way to mimic what the current backend does?
A related point concerns memory allocation. To mimic what the current backend does, I now malloc a heap at the start of the program and keep a heap pointer in a register. I'm wondering if there is a more idiomatic way to do this in LLVM (instead of manually spelling out the increment of the heap pointer on each allocation). For example, would it be possible to use alloca with a custom address space instead? The docs say that alloca allocates stack space, but it is not clear to me what the behavior is with a custom address space. Would it be possible to use a custom address space and use alloca for heap allocation? (This may seem like a separate question, but the idea for this comes from the safepoints document, which suggests using a separate address space for pointers that need to be visited by GC, so it may be related.)
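For reference, the allocation scheme I am spelling out by hand amounts to a bump allocator. A minimal sketch in C follows; the names `heap_init` and `heap_alloc`, the 8-byte alignment, and returning NULL on exhaustion are assumptions for illustration (the real runtime would start a collection instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static char *heap_start; /* base of the malloc'ed heap */
static char *heap_ptr;   /* next free byte; kept in a register in the existing backend */
static char *heap_end;   /* one past the last usable byte */

/* Allocate the whole heap once at program start. Returns 1 on success. */
static int heap_init(size_t bytes) {
    heap_start = malloc(bytes);
    if (heap_start == NULL) return 0;
    heap_ptr = heap_start;
    heap_end = heap_start + bytes;
    return 1;
}

/* Bump allocation: round the size up to 8 bytes and advance heap_ptr.
   Returns NULL when the heap is full; a real runtime would collect here. */
static void *heap_alloc(size_t bytes) {
    assert(heap_start != NULL);
    if (bytes > SIZE_MAX - 7u) return NULL; /* rounding would overflow */
    size_t aligned = (bytes + 7u) & ~(size_t)7u;
    if (aligned > (size_t)(heap_end - heap_ptr)) return NULL;
    void *node = heap_ptr;
    heap_ptr += aligned;
    return node;
}
```

In the generated IR, the body of `heap_alloc` is what currently gets emitted at every allocation site.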
I don't think the constraints are very exotic: (1) the need to distinguish basic values and pointers; (2) fast allocation; (3) ideally, fast unwinding of the stack. What is the most idiomatic way to accomplish this in LLVM? And, are there perhaps any toy examples around on the web to illustrate it?
An added complication is that I want to target WebAssembly in addition to x86. I can imagine that implementing a separate address space as outlined above would be doable for x86 by reserving one register for the heap pointer. But WebAssembly has no concept of registers. I think it should be possible to work around this, either by keeping the heap pointer in a global variable or by writing a transformation that turns the heap pointer into a function argument / return value. Nevertheless, please keep this in mind in your answers if possible.
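To illustrate the second workaround, this is roughly what I have in mind when I say the heap pointer could become a function argument and return value. It is only a sketch in C under my own assumptions: `AllocResult` and `alloc_threaded` are hypothetical names, and alignment and garbage collection are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Result of an allocation when the heap pointer is threaded explicitly. */
typedef struct {
    void *node; /* allocated node, or NULL if there was not enough space */
    char *hp;   /* updated heap pointer to pass to the next call */
} AllocResult;

/* No global or reserved register: the caller passes the current heap
   pointer in and receives the new one back together with the node. */
static AllocResult alloc_threaded(char *hp, char *heap_end, size_t bytes) {
    AllocResult r = { NULL, hp };
    assert(hp <= heap_end);
    if (bytes <= (size_t)(heap_end - hp)) {
        r.node = hp;
        r.hp = hp + bytes;
    }
    return r;
}
```

Every function that may allocate would then take and return `hp` in the same way, which is the transformation I would have to write.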