I'm reading the paper Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem. In section 2.4, they write:
We can symmetrize an algorithm by composing it with its adjoint.
They illustrate this with the "symmetrized Langevin algorithm". Slightly generalizing (the paper's Langevin case corresponds to $b=-\nabla f$ and $\sigma=\sqrt2\,I_d$), say we are dealing with $${\rm d}X_t=b(X_t)\,{\rm d}t+\sigma\,{\rm d}W_t\tag1,$$ where $b:\mathbb R^d\to\mathbb R^d$ and $\sigma\in\mathbb R^{d\times d}$. Then their scheme becomes $$M_n=\left(I_d-\Delta t\,b\right)^{-1}\left(M_{n-1}+\Delta t\,b(M_{n-1})+\sqrt{\Delta t}\,\sigma Z_n\right)\tag2,$$ where $Z_n\sim\mathcal N(0,I_d)$ are i.i.d.
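To fix ideas about what a single update of $(2)$ does mechanically, here is a minimal sketch in which the resolvent $(I_d-\Delta t\,b)^{-1}$ is applied by fixed-point iteration; the toy drift, step size, and solver are my own illustrative choices, not taken from the paper.

```python
import numpy as np

def symmetrized_step(x, b, sigma, dt, rng, n_fixed_point=50):
    """One update of (2): an explicit Euler-Maruyama-type part with noise,
    followed by the map (I_d - dt*b)^{-1}, resolved here by fixed-point
    iteration (a contraction for small dt and Lipschitz b)."""
    z = rng.standard_normal(x.shape[0])
    # explicit part of the argument: x + dt*b(x) + sqrt(dt)*sigma @ z
    y = x + dt * b(x) + np.sqrt(dt) * sigma @ z
    # implicit part: find m with m - dt*b(m) = y, i.e. m = (I_d - dt*b)^{-1}(y)
    m = y.copy()
    for _ in range(n_fixed_point):
        m = y + dt * b(m)
    return m

# toy example: Ornstein-Uhlenbeck drift b(x) = -x (my choice, purely illustrative)
rng = np.random.default_rng(0)
b = lambda x: -x
sigma = np.eye(2)
x = np.zeros(2)
for _ in range(1000):
    x = symmetrized_step(x, b, sigma, dt=0.01, rng=rng)
```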
I have real trouble understanding how this is a composition of the ordinary Euler–Maruyama scheme, which is $$M_n=M_{n-1}+\Delta t\,b(M_{n-1})+\sqrt{\Delta t}\,\sigma Z_n\tag3,$$ and its "adjoint". The transition kernel of $(3)$ is given by $$\kappa(x,\;\cdot\;):=\mathcal N\!\left(x+\Delta t\,b(x),\,\Delta t\,\Sigma\right),\qquad\Sigma:=\sigma\sigma^\top\tag4.$$ So, in my understanding, the adjoint is $$\kappa^\ast(y,B):=\int_B\varphi_{\Delta t\Sigma}\bigl(x+\Delta t\,b(x)-y\bigr)\;{\rm d}x,\tag5$$ where $\varphi_\Sigma$ denotes the density of $\mathcal N(0,\Sigma)$ with respect to the $d$-dimensional Lebesgue measure. At this point I don't see how we could sample from $\kappa^\ast(y,\;\cdot\;)$, and hence I don't see how $(2)$ is a composition of first sampling $Y\sim\kappa(M_{n-1},\;\cdot\;)$ and then $M_n\sim\kappa^\ast(Y,\;\cdot\;)$, which is how I understand their claim.
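To make $(4)$ and $(5)$ concrete, here is a small sketch of the two densities (drift, diffusion and all numerical inputs are placeholders of mine): the forward kernel is a Gaussian in $y$ for fixed $x$, while the integrand of the adjoint is the very same expression read as a function of $x$ for fixed $y$, which for nonlinear $b$ is not a Gaussian in $x$; that is exactly why I don't see how to sample from it.

```python
import numpy as np
from scipy.stats import multivariate_normal

def forward_kernel_density(y, x, b, sigma, dt):
    """Density of kappa(x, .) from (4): N(x + dt*b(x), dt*Sigma), Sigma = sigma @ sigma.T,
    evaluated at y; a Gaussian in y for fixed x, so sampling from it is easy."""
    Sigma = sigma @ sigma.T
    return multivariate_normal.pdf(y, mean=x + dt * b(x), cov=dt * Sigma)

def adjoint_integrand(x, y, b, sigma, dt):
    """Integrand of kappa*(y, B) from (5): phi_{dt*Sigma}(x + dt*b(x) - y),
    i.e. the same Gaussian density, but viewed as a function of the starting
    point x for fixed y; for nonlinear b this is not a Gaussian in x."""
    Sigma = sigma @ sigma.T
    return multivariate_normal.pdf(x + dt * b(x) - y, mean=np.zeros_like(y), cov=dt * Sigma)
```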
However, if we consider the (semi-)implicit (or "drift-implicit", whichever name you prefer) Euler–Maruyama method $$M_n=\left(I_d-\Delta t\,b\right)^{-1}\left(M_{n-1}+\sqrt{\Delta t}\,\sigma Z_n\right)\tag6,$$ I clearly see a similarity to $(2)$, which makes me wonder whether $\kappa^\ast$ is actually the transition kernel of $(6)$.
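For comparison, here is the analogous sketch for one step of $(6)$, again with the resolvent applied by fixed-point iteration (all numerical choices are mine). Reading the two formulas side by side, the only structural difference I see is that the argument of the resolvent in $(2)$ additionally contains the explicit drift term $\Delta t\,b(M_{n-1})$.

```python
import numpy as np

def drift_implicit_em_step(x, b, sigma, dt, rng, n_fixed_point=50):
    """One step of (6): M_n = (I_d - dt*b)^{-1}(M_{n-1} + sqrt(dt)*sigma*Z_n).
    The resolvent is applied by fixed-point iteration, as in the sketch above."""
    z = rng.standard_normal(x.shape[0])
    rhs = x + np.sqrt(dt) * sigma @ z   # argument of the resolvent: no explicit drift term, unlike (2)
    m = rhs.copy()
    for _ in range(n_fixed_point):
        m = rhs + dt * b(m)             # iterate m <- rhs + dt*b(m) to solve m - dt*b(m) = rhs
    return m
```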
As a final remark: clearly, the inverse has to be understood in a generalized sense, namely $(I_d-\Delta t\,b)^{-1}(y)$ denotes a solution $m$ of $m-\Delta t\,b(m)=y$ (assuming one exists). In case it isn't clear, my question is: How do we put these pieces together? How is $(2)$ a composition with the "adjoint algorithm", and what is meant by that in the first place?