miasm
https://github.com/cea-sec/miasm
This appears to be the most promising concrete solution. According to the project description, the library supports:
- Opening / modifying / generating PE / ELF 32 / 64 LE / BE using Elfesteem
- Assembling / Disassembling X86 / ARM / MIPS / SH4 / MSP430
So it should basically:
- parse the ELF into an internal representation (disassembly)
- modify what you want
- generate a new ELF (assembly)
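To make the parse / modify / generate loop concrete, here is a minimal standard-library sketch. It does not use miasm or Elfesteem and is not their API. It only handles the 64-byte file header of a little-endian ELF64 image and patches the entry point in place, so no bytes move. The names Elf64Header, parse_header and write_header are hypothetical helpers made up for this illustration, and the image is synthetic.

```python
import struct
from dataclasses import astuple, dataclass

# ELF64 little-endian file header layout (64 bytes), following the ELF spec.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


@dataclass
class Elf64Header:  # hypothetical helper type, not part of any library
    e_ident: bytes
    e_type: int
    e_machine: int
    e_version: int
    e_entry: int
    e_phoff: int
    e_shoff: int
    e_flags: int
    e_ehsize: int
    e_phentsize: int
    e_phnum: int
    e_shentsize: int
    e_shnum: int
    e_shstrndx: int


def parse_header(data: bytes) -> Elf64Header:
    """Step 1: parse the raw bytes into an internal representation."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian")
    return Elf64Header(*ELF64_HEADER.unpack_from(data, 0))


def write_header(data: bytes, header: Elf64Header) -> bytes:
    """Step 3: serialize the header again and keep the rest of the image."""
    return ELF64_HEADER.pack(*astuple(header)) + data[ELF64_HEADER.size:]


# Synthetic image: a valid-looking header followed by four NOP bytes.
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
image = ELF64_HEADER.pack(
    ident, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 0, 64, 0, 0
) + b"\x90" * 4

header = parse_header(image)           # parse
header.e_entry = 0x401080              # step 2: modify what you want
patched = write_header(image, header)  # generate
```

This only works because the edit keeps every size and offset unchanged. Inserting or removing bytes needs the disassembly-level view that a library like miasm is meant to provide.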
I don't think miasm generates a textual disassembly representation; instead, you will likely have to walk through its Python data structures.
TODO: find a minimal example of how to do all of that using the library. A good starting point seems to be example/disasm/full.py, which parses a given ELF file. The key top-level structure is Container, which reads the ELF file with Container.from_stream. TODO: how to reassemble it afterwards? This article seems to do it: http://www.miasm.re/blog/2016/03/24/re150_rebuild.html
This question asks if there are any other such libraries: https://reverseengineering.stackexchange.com/questions/1843/what-are-the-available-libraries-to-statically-modify-elf-executables
Related questions:
- https://reverseengineering.stackexchange.com/questions/185/how-do-i-add-functionality-to-an-existing-binary-executable
- https://askubuntu.com/questions/617441/how-can-edit-a-executable-file-linux
I think this problem is not automatable
I think the general problem is not fully automatable: a general solution is basically equivalent to reverse engineering the binary.
In order to insert or remove bytes in a meaningful way, we would have to ensure that all possible jumps keep jumping to the same locations.
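As a rough illustration of what "keep jumping to the same locations" means, here is a toy sketch. It is not real machine code: Insn is a hypothetical instruction model in which a jump stores a displacement relative to the end of the instruction, as x86 relative jumps do. After inserting an instruction, every displacement is recomputed so that each jump still lands on the same instruction as before.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Insn:  # hypothetical toy instruction, not a real encoding
    name: str
    size: int
    rel: Optional[int] = None  # jump displacement from the end of this insn


def addresses(prog: List[Insn]) -> List[int]:
    """Start address of every instruction, laid out back to back from 0."""
    addrs, addr = [], 0
    for insn in prog:
        addrs.append(addr)
        addr += insn.size
    return addrs


def insert_and_fix(prog: List[Insn], index: int, new_insn: Insn) -> List[Insn]:
    """Insert new_insn before prog[index] and repair all relative jumps."""
    old_addrs = addresses(prog)
    addr_to_index = {a: i for i, a in enumerate(old_addrs)}

    # Resolve every jump to the index of the instruction it targets.
    # A jump into the middle of an instruction raises KeyError here.
    targets = {}
    for i, insn in enumerate(prog):
        if insn.rel is not None:
            targets[i] = addr_to_index[old_addrs[i] + insn.size + insn.rel]

    def shift(i: int) -> int:
        # Instructions at or after the insertion point move down by one slot.
        return i + 1 if i >= index else i

    new_prog = prog[:index] + [new_insn] + prog[index:]
    new_addrs = addresses(new_prog)
    for old_src, old_dst in targets.items():
        src, dst = shift(old_src), shift(old_dst)
        insn = new_prog[src]
        new_rel = new_addrs[dst] - (new_addrs[src] + insn.size)
        new_prog[src] = replace(insn, rel=new_rel)
    return new_prog


prog = [
    Insn("jmp", 2, rel=1),    # forward jump over the nop, to "add"
    Insn("nop", 1),
    Insn("add", 1),
    Insn("jmp", 2, rel=-6),   # backward jump to the first instruction
]
fixed = insert_and_fix(prog, 1, Insn("pad", 4))
```

The sketch assumes that all jump targets are known and that a jump never changes size. In real x86 code a larger displacement may no longer fit a short jump encoding, which changes the instruction size and shifts addresses again.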
In formal terms, we need to extract the control flow graph of the binary.
However, with indirect branches (https://en.wikipedia.org/wiki/Indirect_branch), for example, that graph is not easy to determine. See also: Indirect jump destination calculation
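A toy sketch of why indirect branches get in the way: the hypothetical build_cfg below derives successor edges from a made-up instruction list of (address, size, kind, target) tuples. Direct jumps yield exact edges, but for an indirect jump the target lives in a register or in memory at run time, so a purely static pass can only record it as unknown. The instruction kinds and the UNKNOWN marker are assumptions made for this illustration.

```python
UNKNOWN = None  # marker for a successor that static analysis cannot resolve


def build_cfg(prog):
    """Map each instruction address to the set of its successor addresses."""
    edges = {}
    for addr, size, kind, target in prog:
        fallthrough = addr + size
        if kind == "jmp":        # direct unconditional jump
            succ = {target}
        elif kind == "jcc":      # direct conditional jump: taken or not taken
            succ = {target, fallthrough}
        elif kind == "ijmp":     # indirect jump: target only known at run time
            succ = {UNKNOWN}
        elif kind == "ret":      # treated as a function exit in this toy model
            succ = set()
        else:                    # any other instruction just falls through
            succ = {fallthrough}
        edges[addr] = succ
    return edges


prog = [
    (0, 2, "jcc", 5),
    (2, 3, "ijmp", None),
    (5, 1, "ret", None),
]
cfg = build_cfg(prog)
unresolved = [addr for addr, succ in cfg.items() if UNKNOWN in succ]
```

Every address in unresolved is a place where inserting or removing bytes could silently break the program, because we cannot tell which locations that branch may reach.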