I'm reading about pattern recognition, and in the appendix of my book I came across the following derivation:
$J(\theta)$ is a cost function with parameter $\theta = (\theta_1, \ldots, \theta_d)$. If $J(\theta) = c$ then:
$$dc = 0 = \frac{\partial J(\theta)^T}{\partial \theta}d\theta \Rightarrow \frac{\partial J(\theta)}{\partial \theta} \perp d\theta$$
This may be a very easy question, but the derivation above confuses me. Could someone write out more explicitly what the author did? :) I attached a picture showing more information taken from the book, with the area I found confusing highlighted. What happened in the red area?
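
For reference, here is my own attempt at expanding the first equality component-wise. I am assuming that $d\theta$ is a small displacement that keeps $\theta$ on the level set $J(\theta) = c$, and that higher-order terms are dropped; neither assumption is stated explicitly in the excerpt, so this may not be what the author intended:

$$dc = J(\theta + d\theta) - J(\theta) \approx \sum_{i=1}^{d} \frac{\partial J(\theta)}{\partial \theta_i}\, d\theta_i = \left(\frac{\partial J(\theta)}{\partial \theta}\right)^T d\theta = 0$$

If that reading is right, the sum is just the inner product of the gradient with $d\theta$, and a zero inner product would be what the $\perp$ symbol expresses. I'm not sure this is the reasoning the book has in mind, though.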

Thanks for any guidance :)