The code works?
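Except when the child object happens to be Nothing. In that case the jump skips the sub rsp, 0x28 instruction, so no shadow space has been allocated by the time execution reaches the CoTaskMemFree section. There, mov rcx, [rsp + 0x30] no longer reads the cached RCX, CoTaskMemFree is called without shadow space, and add rsp, 0x28 leaves RSP 0x28 bytes above the return address, so the final ret does not return to the caller.
The solution is to move the sub rsp, 0x28 instruction to just before the test for Nothing. Don't forget to adjust the hardcoded jump offset!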
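To make the offset adjustment concrete, here is a small Python sketch (not part of the reviewed code) that recomputes the displacement of the short jump over the child-release code. A short jump's displacement is counted from the end of the jump instruction, so it equals the total size of the instructions being skipped. The byte strings are the usual x86-64 encodings of these instructions; the helper name rel8 is made up for illustration.

```python
# Encodings of the instructions that the "child is Nothing" jump skips over.
MOV_RAX_VTABLE = bytes.fromhex("488B01")      # mov rax, [rcx]
MOV_RAX_RELEASE = bytes.fromhex("488B4010")   # mov rax, [rax + 0x10]
SUB_RSP_SHADOW = bytes.fromhex("4883EC28")    # sub rsp, 0x28
CALL_RAX = bytes.fromhex("FFD0")              # call rax


def rel8(skipped):
    """Return the displacement needed to jump over `skipped` (hypothetical helper)."""
    distance = sum(len(insn) for insn in skipped)
    if distance > 127:
        raise ValueError("target is out of range for a short jump")
    return distance


# Original layout: sub rsp, 0x28 sits between the jump and the CoTaskMemFree section.
before_fix = rel8([MOV_RAX_VTABLE, MOV_RAX_RELEASE, SUB_RSP_SHADOW, CALL_RAX])

# Fixed layout: sub rsp, 0x28 now precedes the test, so it is no longer skipped.
after_fix = rel8([MOV_RAX_VTABLE, MOV_RAX_RELEASE, CALL_RAX])

print(hex(before_fix), hex(after_fix))
```

Under these assumptions the jump that was written as +0xD becomes +0x9 once the sub rsp, 0x28 has moved.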
Assembly style
You write QWORD PTR in cases where the assembler can already infer the operand size from the register involved. For example, mov rax, QWORD PTR [rcx+0x8] is better written as mov rax, [rcx + 8].
You have written comments about things that are obvious:
mov rax, QWORD PTR [rcx+0x8] # gets the refcount
dec rax # decrement refCount
mov QWORD PTR [rcx+0x8], rax # update refCount in obj
cmp rax, 0x0 # compare refCount with 0
It does not add anything useful to comment about decrementing, updating, and comparing the refcount as that is already obvious from looking at these dec, mov, and cmp instructions.
All in all, the program text looks heavy! You can make it a lot easier for the reviewer to read if you:
- drop some redundant comments
- insert some more blank lines
- use a tabular format where the instruction's operands get aligned to their own column
- not mention QWORD PTR where it is not strictly needed
- have whitespace around the '+' operator
  mov    rax, [rcx + 0x8]         # gets the refcount
  dec    rax
  mov    [rcx + 0x8], rax
  cmp    rax, 0x0
  je     +0x1                     # jump if refCount is zero to release section
  ret                             # early return from function

# jumps to here - IUnknown::Release of child object if present
  mov    [rsp + 0x08], rcx        # cache RCX
  mov    rcx, [rcx + 0x10]        # read the child object into RCX
  cmp    rcx, 0x0                 # ensure the child object hasn't been set to Nothing
  je     +0xD                     # jump if child object is Nothing
  mov    rax, [rcx]               # get vtable pointer from [RCX]
  mov    rax, [rax + 0x10]        # get IUnknown::Release function pointer (3rd vtable entry = (3-1) * 0x08 = 0x10) (a)
  sub    rsp, 0x28                # allocate shadow space
  call   rax

# jumps to here - CoTaskMemFree section
  mov    rcx, [rsp + 0x30]        # restore RCX to the parent object
  movabs rax, 0x8877665544332211  # move addrCoTaskMemFree into RAX (b)
  call   rax
  add    rsp, 0x28                # deallocate shadow space
  xor    rax, rax
  ret                             # return from function
(a) Have mercy on the reader who does not have a super wide screen. Split extra-long comments over more than one line, without disrupting the nice tabular format.
(b) I would improve this comment further by emphasizing that 0x8877665544332211 is just a placeholder.
Assembly do's
The xor rax, rax can be replaced by the 1-byte shorter xor eax, eax, which also clears the full 64 bits of RAX because writing a 32-bit register zero-extends into the upper half.
The cmp rcx, 0x0 (which ensures the child object hasn't been set to Nothing) is better written as test rcx, rcx (shorter code), followed by jz +0xD instead of je +0xD, since jz is more idiomatic after a test instruction.
The cmp rax, 0x0 on the 4th line can be omitted: the dec rax on the 2nd line already sets the zero flag that your je +0x1 needs, and the mov in between does not modify any flags. Again, it is more idiomatic to use jz +0x1 following that dec instruction.
  mov    rax, [rcx + 0x8]         # gets the refcount
  dec    rax
  mov    [rcx + 0x8], rax
  jz     +0x1                     # jump if refCount is zero to release section
  ret                             # early return from function

# jumps to here - IUnknown::Release of child object if present
  mov    [rsp + 0x08], rcx        # cache RCX
  mov    rcx, [rcx + 0x10]        # read the child object into RCX
  test   rcx, rcx                 # ensure the child object hasn't been set to Nothing
  jz     +0xD                     # jump if child object is Nothing
  mov    rax, [rcx]               # get vtable pointer from [RCX]
  mov    rax, [rax + 0x10]        # get IUnknown::Release function pointer
                                  # (3rd vtable entry = (3-1) * 0x08 = 0x10)
  sub    rsp, 0x28                # allocate shadow space
  call   rax

# jumps to here - CoTaskMemFree section
  mov    rcx, [rsp + 0x30]        # restore RCX to the parent object
  movabs rax, 0x8877665544332211  # move placeholder for addrCoTaskMemFree into RAX
  call   rax
  add    rsp, 0x28                # deallocate shadow space
  xor    eax, eax
  ret                             # return from function
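As a rough tally of what the three instruction-level suggestions buy, the following Python sketch compares the sizes of the typical x86-64 encodings before and after each change. The encodings are what an assembler would normally emit for these forms; the dictionary layout and the function name bytes_saved are only illustrative.

```python
# Typical x86-64 encodings (old form, new form) for each suggested change.
CHANGES = {
    "xor rax, rax -> xor eax, eax": (bytes.fromhex("4831C0"), bytes.fromhex("31C0")),
    "cmp rcx, 0x0 -> test rcx, rcx": (bytes.fromhex("4883F900"), bytes.fromhex("4885C9")),
    "cmp rax, 0x0 -> (dropped)": (bytes.fromhex("4883F800"), b""),
}


def bytes_saved(changes):
    """Map each change to the number of code bytes it saves (hypothetical helper)."""
    return {name: len(old) - len(new) for name, (old, new) in changes.items()}


savings = bytes_saved(CHANGES)
total = sum(savings.values())

for name, saved in savings.items():
    print(f"{name}: {saved} byte(s) saved")
print("total:", total)
```

None of the changed instructions lies between a jump and its target, which is why the +0x1 and +0xD offsets stay the same.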
"since real assemblers don't preserve my relative jumps, they assume a linker step coming up..."
FASM is the ideal tool for what you are coding
It does not require a separate linker. By default, when there is no format directive in the source file, the flat assembler simply puts the generated instruction codes into the output, creating a flat binary file this way. By default FASM generates 16-bit code, but you can switch it to 64-bit mode with the use64 directive. All output code is always in the order in which it was entered into the source file.
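You no longer need to hardcode the jump offsets. Just use regular labels. The other changes are replacing # with ; for the code comments and writing movabs as a plain mov (FASM picks the 64-bit immediate encoding by itself, as the 48 B8 bytes in the listing show).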
Compiling from within the FASM IDE produces a binary file of 65 bytes.
SOURCE
  use64

  mov    rax, [rcx + 0x8]         ; gets the refcount
  dec    rax
  mov    [rcx + 0x8], rax
  jz     Release                  ; jump if refCount is zero to release section
  ret                             ; early return from function

Release:                          ; IUnknown::Release of child object if present
  mov    [rsp + 0x08], rcx        ; cache RCX
  mov    rcx, [rcx + 0x10]        ; read the child object into RCX
  sub    rsp, 0x28                ; allocate shadow space
  test   rcx, rcx                 ; ensure the child object hasn't been set to Nothing
  jz     Nothing                  ; jump if child object is Nothing
  mov    rax, [rcx]               ; get vtable pointer from [RCX]
  mov    rax, [rax + 0x10]        ; get IUnknown::Release function pointer
  call   rax                      ; (3rd vtable entry = (3-1) * 0x08 = 0x10)

Nothing:                          ; CoTaskMemFree section
  mov    rcx, [rsp + 0x30]        ; restore RCX to the parent object
  mov    rax, 0x8877665544332211  ; move placeholder for addrCoTaskMemFree into RAX
  call   rax
  add    rsp, 0x28                ; deallocate shadow space
  xor    eax, eax
  ret                             ; return from function
LISTING
FASM also comes with a listing tool that you run from a CMD prompt:
C:\FASMW>fasm inject.asm inject.bin -s inject.fas
C:\FASMW>listing inject.fas inject.lst
The offset addresses make it very easy to locate the placeholder for addrCoTaskMemFree.
use64
00000000: 48 8B 41 08 mov rax, [rcx + 0x8] ; gets the refcount
00000004: 48 FF C8 dec rax
00000007: 48 89 41 08 mov [rcx + 0x8], rax
0000000B: 74 01 jz Release ; jump if refCount is zero to release section
0000000D: C3 ret ; early return from function
Release: ; IUnknown::Release of child object if present
0000000E: 48 89 4C 24 08 mov [rsp + 0x08], rcx ; cache RCX
00000013: 48 8B 49 10 mov rcx, [rcx + 0x10] ; read the child object into RCX
00000017: 48 83 EC 28 sub rsp, 0x28 ; allocate shadow space
0000001B: 48 85 C9 test rcx, rcx ; ensure the child object hasn't been set to Nothing
0000001E: 74 09 jz Nothing ; jump if child object is Nothing
00000020: 48 8B 01 mov rax, [rcx] ; get vtable pointer from [RCX]
00000023: 48 8B 40 10 mov rax, [rax + 0x10] ; get IUnknown::Release function pointer
00000027: FF D0 call rax ; (3rd vtable entry = (3-1) * 0x08 = 0x10)
Nothing: ; CoTaskMemFree section
00000029: 48 8B 4C 24 30 mov rcx, [rsp + 0x30] ; restore RCX to the parent object
0000002E: 48 B8 11 22 33 44 55 66 77 88 mov rax, 0x8877665544332211 ; move placeholder for addrCoTaskMemFree into RAX
00000038: FF D0 call rax
0000003A: 48 83 C4 28 add rsp, 0x28 ; deallocate shadow space
0000003E: 31 C0 xor eax, eax
00000040: C3 ret ; return from function
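As a hedged illustration of how the listing helps, the following Python sketch rebuilds the 65-byte image from the bytes in the listing, checks where the placeholder sits, and overwrites it with an address. The listing puts the mov rax at offset 0x2E and its opcode takes 2 bytes (48 B8), so the 8-byte immediate starts at 0x30. The function name patch_address and the sample address are made up for this example; the real host program would perform the equivalent patch in its own language.

```python
import struct

# Machine code copied from the listing above (65 bytes).
THUNK = bytes.fromhex(
    "488B4108" "48FFC8" "48894108" "7401" "C3"    # refcount decrement, early return
    "48894C2408" "488B4910" "4883EC28"            # cache RCX, load child, shadow space
    "4885C9" "7409" "488B01" "488B4010" "FFD0"    # child Release unless Nothing
    "488B4C2430" "48B81122334455667788" "FFD0"    # CoTaskMemFree(parent)
    "4883C428" "31C0" "C3"                        # clean up, return 0
)

PLACEHOLDER = struct.pack("<Q", 0x8877665544332211)  # little-endian 64-bit immediate
IMM_OFFSET = 0x2E + 2                                # skip the 2-byte opcode 48 B8


def patch_address(code, address):
    """Return a copy of `code` with the placeholder replaced (hypothetical helper)."""
    if code[IMM_OFFSET:IMM_OFFSET + 8] != PLACEHOLDER:
        raise ValueError("placeholder not found at the expected offset")
    patched = bytearray(code)
    struct.pack_into("<Q", patched, IMM_OFFSET, address)
    return bytes(patched)


# 0x00007FFA12345678 is an arbitrary sample value, not a real CoTaskMemFree address.
patched = patch_address(THUNK, 0x00007FFA12345678)
print(len(THUNK), hex(THUNK.find(PLACEHOLDER)), patched[IMM_OFFSET:IMM_OFFSET + 8].hex())
```

Checking the placeholder bytes before patching is a cheap guard against the offset drifting when the source is edited and reassembled.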