Will's suggestion to do this with cases does work. Crucially though, this case splitting depends on the entire set, which means that this doesn't really look like a typical set-theoretic ordered pair/tuple coding which is defined in an element-by-element way (some of which are actually natural transformations from some iterated covariant power set functor, as I discussed in this answer).
Let $\langle x, y \rangle$ be any standard coding of ordered pairs (such as any of the ones in your question). We'll take a function to be a set of elements of the form $\langle x , y \rangle$. Recall that one can code natural numbers as von Neumann ordinals: $n = \{0,1,2,\dots,n-1\}$.
Given two functions $f$ and $g$ with domains $n$ and $m$, respectively, the concatenation of $f$ and $g$, written $f\frown g$, is the unique function $h$ with domain $n + m$ satisfying that $h(i) = f(i)$ for $i < n$ and $h(i) = g(i-n)$ for $i \geq n$. It's easy to show that concatenation of functions with natural number domains is associative (i.e., satisfies that $f \frown (g \frown h) = (f \frown g) \frown h$).
For any $n \geq 0$, say that a set $Y$ is an $n$-fold product if there are sets $X_0,\dots,X_{m-1}$ such that $Y$ is the set of all functions $f$ with domain $n$ satisfying that $f(i) \in X_i$ for each $i < n$. (Note that the only $0$-fold product is $1 = \{0\}$.) It is an easy exercise to show that any given non-empty set is an $n$-fold product for at most one $n$ and one sequence $(X_i)_{i < n}$ (but this requires the set to be non-empty).
Say that a set $Y$ is a raw set if it is not an $n$-fold product for any $n$. (In particular, a raw set is always non-empty.) Given any set $X$, let $\iota(X) = \{\langle 0, x \rangle : x \in X\}$. (In other words, $\iota$ is the set of functions from $1 =\{0\}$ to $X$.) Note that $\iota(X)$ is always a $1$-fold product.
Now we can define a strictly associative product by cases:
- If $X$ and $Y$ are $n$- and $m$-fold products respectively, then $X \times Y = \{f \frown g : f \in X,~g \in Y\}$.
- If $X$ is a raw set and $Y$ is an $m$-fold product, then $X \times Y = \iota(X) \times Y$ in the above sense.
- If $X$ is an $n$-fold product and $Y$ is a raw set, then $X \times Y = X \times \iota(Y)$ in the above sense.
- If $X$ and $Y$ are both raw sets, then $X \times Y = \iota(X) \times \iota(Y)$.
Now, as mentioned by Will, in order to specify the desired functor from $\mathrm{Set}\times \mathrm{Set}$ to $\mathrm{Set}$, it's enough to specify, for each pair of sets $X$ and $Y$, projection maps $\pi_0 : X \times Y \to X$ and $\pi_1 : X \times Y \to Y$. This can be done in an analogous case-by-case basis.
Finally, it's straightforward to prove that with this definition we have $X \times (Y \times Z) = (X\times Y) \times Z$ literally for all sets $X$, $Y$, and $Z$.
We also almost have a strict identity element. Specifically, $X \times 1 = 1 \times X = X$ if $X$ is an $n$-fold product, but $X \times 1 = 1 \times X = \iota(X)$ for a raw set $X$. I believe it should be possible (with more cases) to make the identity element strict.
As I already said, this is a bit unsatisfying because it isn't given by a simple element-by-element formula. I'm not completely sure how to formalize this question, but I suspect this isn't possible assuming the axiom of foundation (as TLo alluded to in the comments). Relatedly, I suspect that something like this isn't possible constructively (as addressed in Naïm's comment).
On some level the issue is with the way the desired behavior interacts with unions. What should $X \times ((Y \times Z) \cup W)$ look like? Note that the union of an $n$-fold product and an $m$-fold product is generally a raw set.