One procedure that works here is to average your function. This is generalized by the notion of convolution. Let $f \in C[0,1]$ and extend $f$ continuously to $\mathbb{R}$ by setting
$$ \tilde{f}(x) = \begin{cases} f(0) & -\infty < x \leq 0, \\
f(x) & 0 \leq x \leq 1, \\
f(1) & 1 < x < +\infty.
\end{cases} $$
Given $\delta > 0$, we define $f_{\delta} \colon [0,1] \rightarrow \mathbb{R}$ by
$$ f_{\delta}(x) = \frac{1}{2\delta} \int_{x - \delta}^{x + \delta} \tilde{f}(t) \, dt, $$
where the integration is done using the standard Lebesgue measure. Then we have the following properties:
- By the fundamental theorem of calculus, the function $f_{\delta}$ is continuously differentiable with derivative $f_{\delta}'(x) = \frac{\tilde{f}(x+\delta) - \tilde{f}(x-\delta)}{2\delta}$. For $0 < \delta \leq \frac{1}{2}$ this reads
$$ f_{\delta}'(x) = \begin{cases} \frac{f(x+\delta) - f(0)}{2\delta} & 0 \leq x \leq \delta, \\
\frac{f(x + \delta) - f(x - \delta)}{2\delta} & \delta \leq x \leq 1 - \delta, \\
\frac{f(1) - f(x - \delta)}{2\delta} & 1 - \delta \leq x \leq 1. \end{cases}$$
- The estimate
$$ \left| f(x) - f_{\delta}(x) \right| = \left| \frac{1}{2\delta} \int_{x - \delta}^{x + \delta} (\tilde{f}(t) - \tilde{f}(x)) \, dt\right| \leq \sup_{|t - x| \leq \delta} |\tilde{f}(t) - \tilde{f}(x)| $$
and the uniform continuity of $\tilde{f}$ show that $f_{\delta} \to f$ uniformly on $[0,1]$ as $\delta \to 0$. In particular, $f_{\delta} \to f$ in $L^2([0,1],dx)$.
- If $f$ is left and right differentiable at $x \in (0,1)$ (the derivatives don't have to agree) then
$$ f_{\delta}'(x) \rightarrow \frac{f'_{+}(x) + f'_{-}(x)}{2}. $$
Similarly if $x = 0$ or $x = 1$.
- If $\tilde{f}$ is linear on $[x - \delta, x + \delta]$ then $f_{\delta}(x) = f(x)$. That is, our smoothing procedure doesn't change linear functions. If $f$ is piecewise linear on $0 = x_0 < x_1 < \dots < x_n = 1$ (that is, $f$ is linear on each $[x_i,x_{i+1}]$) and $\delta$ is small enough, then $f_{\delta} = f$ outside neighborhoods of the form $[x_i - \delta, x_i + \delta]$ and on such neighborhoods, $f_{\delta}$ replaces $f$ by a quadratic function that patches up the linear pieces.
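The properties listed above can be sanity-checked numerically. The following is a minimal sketch, not part of the argument: the helper names (`extend`, `smooth`, `smooth_derivative`, `sup_error`) and the test function `kink` are illustrative choices, and the integral is only approximated by a midpoint rule.

```python
def extend(f):
    """Constant extension of f from [0, 1] to the real line (the function f-tilde)."""
    return lambda t: f(min(max(t, 0.0), 1.0))


def smooth(f, delta, n=2000):
    """Return f_delta, the average of f-tilde over [x - delta, x + delta].

    The integral is approximated with an n-point midpoint rule (an assumption
    of this sketch; any quadrature rule would do).
    """
    f_ext = extend(f)

    def f_delta(x):
        step = 2.0 * delta / n
        samples = (f_ext(x - delta + (k + 0.5) * step) for k in range(n))
        return sum(samples) / n  # (1 / (2 delta)) * step * sum equals sum / n

    return f_delta


def smooth_derivative(f, delta):
    """Return f_delta', the symmetric difference quotient of f-tilde."""
    f_ext = extend(f)
    return lambda x: (f_ext(x + delta) - f_ext(x - delta)) / (2.0 * delta)


def sup_error(f, delta, grid=100):
    """Largest |f(x) - f_delta(x)| over the grid points x = k / grid."""
    f_delta = smooth(f, delta)
    return max(abs(f(k / grid) - f_delta(k / grid)) for k in range(grid + 1))


def kink(x):
    """Hypothetical example with one-sided derivatives 0 and 2 at x = 0.5."""
    return 2.0 * max(x - 0.5, 0.0)


if __name__ == "__main__":
    for delta in (0.1, 0.01):
        # the sup error shrinks with delta; the derivative at the kink stays near 1
        print(delta, sup_error(kink, delta), smooth_derivative(kink, delta)(0.5))
    # kink is linear on [0.65, 0.85], so smoothing leaves its value at 0.75 unchanged
    print(smooth(kink, 0.1)(0.75), kink(0.75))
```

For this example the one-sided derivatives at $x = \frac{1}{2}$ are $0$ and $2$, so the limit of $f_{\delta}'(\frac{1}{2})$ predicted by the third property is their average $1$.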
How can we apply the procedure above to our problem?
- First, plugging $f(x) = x$ (so that $f' \equiv 1$) into the inequality gives $\mu([0,1]) \leq \frac{1}{\sqrt{3}}$, so $\mu$ is finite.
- Let $0 \leq a < b < 1$ and let $h > 0$ be small enough that $b + h \leq 1$. Consider the piecewise linear function
$$ f(x) = \begin{cases}
0 & 0 \leq x \leq a, \\
x - a & a \leq x \leq b, \\
\frac{a-b}{h}(x - (b + h)) & b \leq x \leq b + h, \\
0 & b + h \leq x \leq 1.
\end{cases} $$
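By plugging $f_{\delta}$ into the inequality, we get
$$ \left| \int_0^1 f_{\delta}'(x) \, d\mu(x) \right| \leq \left( \int_0^1 f_{\delta}^2(x) \, dx \right)^{\frac{1}{2}}. $$
By the second property above, the right hand side converges to $\left( \int_0^1 f^2(x) \, dx \right)^{\frac{1}{2}} = \left( \frac{(b-a)^3}{3} + \frac{(b-a)^2}{3}h \right)^{\frac{1}{2}}$. By the third property, and because $f$ has slope $-\frac{b-a}{h}$ on $[b, b+h]$, we see that $f_{\delta}'(x)$ converges pointwise to
$$ \lim_{\delta \to 0} f_{\delta}'(x) = \begin{cases}
0 & 0 \leq x < a \text{ or } b + h < x \leq 1, \\
\frac{1}{2} & x = a, \\
1 & a < x < b, \\
\frac{1}{2} \left( 1 - \frac{b-a}{h} \right) & x = b, \\
-\frac{b-a}{h} & b < x < b + h, \\
-\frac{b-a}{2h} & x = b + h.
\end{cases} $$
Since $|f_{\delta}'| \leq \max \left( 1, \frac{b-a}{h} \right)$ and $\mu$ is finite, the dominated convergence theorem (for $([0,1], \mathcal{B}, \mu)$) shows that the left hand side converges to $\left| \int_0^1 \left( \lim_{\delta \to 0} f_{\delta}'(x) \right) \, d\mu(x) \right|$, and we get
$$ \left| \frac{1}{2} \mu \left( \{ a \} \right) + \mu((a,b)) + \frac{1}{2} \left( 1 - \frac{b-a}{h} \right) \mu \left( \{ b \} \right) - \frac{b-a}{h} \mu((b,b+h)) - \frac{b-a}{2h} \mu \left( \{ b + h \} \right) \right| \leq \left( \frac{(b-a)^3}{3} + \frac{(b-a)^2}{3}h\right)^{\frac{1}{2}}. $$
Now fix $a \in [0,1)$ and $0 < h < 1 - a$, and let $b \to a^{+}$. The sets $(a,b)$ and $\{ b \}$ are contained in $(a,b]$, which decreases to the empty set, so $\mu((a,b)) \to 0$ and $\mu(\{ b \}) \to 0$. The three terms carrying the factor $\frac{b-a}{h}$ tend to $0$ because $\mu$ is finite, and the right hand side tends to $0$ as well. Hence $\mu \left( \{ a \} \right) = 0$ for every $a \in [0,1)$. The mirror image $x \mapsto f(1 - x)$ handles $a = 1$ in the same way, so $\mu$ is atomless.
- Given a finite collection of disjoint subintervals $[a_i, b_i]$ of $[0,1]$ with union $E$ and total length $\ell = \sum_i (b_i - a_i)$, consider the piecewise linear function
$$ f(x) = \int_0^x \left( \chi_E(t) - \ell \right) \, dt, $$
which has slope $1 - \ell$ on $E$ and slope $-\ell$ off $E$, and which satisfies $f(0) = f(1) = 0$ and $|f| \leq \ell$ on $[0,1]$. Here $|f_{\delta}'| \leq 1$, and $f_{\delta}'$ converges to $\chi_E - \ell$ except possibly at the finitely many breakpoints and at $0, 1$, which carry no mass because $\mu$ is atomless. Plugging $f_{\delta}$ into the inequality and letting $\delta \to 0$ as before, we obtain
$$ \left| \mu(E) - \ell \mu([0,1]) \right| \leq \left( \int_0^1 f^2(x) \, dx \right)^{\frac{1}{2}} \leq \ell, $$
and therefore
$$ \left| \sum_{i} \mu((a_i, b_i)) \right| = \left| \mu(E) \right| \leq \left( 1 + \mu([0,1]) \right) \sum_i (b_i - a_i), $$
which shows that $\mu$ is absolutely continuous with respect to the Lebesgue measure.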
- Denote by $g$ the Radon-Nikodym derivative $g(x) = \frac{d\mu}{dx}$. If $0 < a < b < 1$ are Lebesgue points of $g$ and $h> 0$ is small enough, we have
$$ \left| g(a) - g(b) \right| \leq \frac{1}{2h} \left| \int_{a - h}^{a + h} (g(a) - g(z)) \, dz \right| + \frac{1}{2h} \left| \int_{a - h}^{a + h} g(z) \, dz - \int_{b - h}^{b + h} g(z) \, dz \right| + \frac{1}{2h} \left| \int_{b - h}^{b + h} (g(z) - g(b)) \, dz \right|. $$
The first and the third terms go to zero as $h \to 0$ since $a,b$ are Lebesgue points while for the middle term we have
$$ \begin{aligned} \frac{1}{2h} \left| \int_{a - h}^{a + h} g(z) \, dz - \int_{b - h}^{b + h} g(z) \, dz \right| &= \frac{1}{2h} \left| \int_{[0,1]} (\chi_{[a-h,a+h]}(z) - \chi_{[b - h,b + h]}(z)) g(z) \, dz \right| \\
&= \frac{1}{2h} \left| \int_{[0,1]} (\chi_{[a-h,a+h]}(z) - \chi_{[b - h,b + h]}(z)) \, d\mu(z) \right| = \frac{1}{2h} \left| \mu([a - h, a + h]) - \mu([b - h, b + h]) \right|. \end{aligned} $$
To estimate the last term, consider the trapezoid function
$$ f(x) = \begin{cases}
0 & 0 \leq x \leq a - h, \\
x - (a - h) & a - h \leq x \leq a + h, \\
2h & a + h \leq x \leq b - h, \\
-(x - (b + h)) & b - h \leq x \leq b + h, \\
0 & b + h \leq x \leq 1.
\end{cases} $$
Replacing $f$ by $f_{\delta}$ and arguing as before, we get the inequality
$$ |\mu([a - h, a + h]) - \mu([b - h, b + h])| \leq \sqrt{\frac{16h^3}{3} + 4h^2(b - a - 2h)}. $$
Dividing by $2h$, we get
$$\frac{1}{2h}|\mu([a - h, a + h]) - \mu([b - h, b + h])| \leq \sqrt{\frac{4h}{3} + (b - a - 2h)}. $$
Taking $h \to 0$ and combining everything, we finally get
$$ |g(a) - g(b)| \leq \sqrt{b - a} = \sqrt{|a - b|}. $$
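The closed form $\frac{16h^3}{3} + 4h^2(b - a - 2h)$ for $\int_0^1 f^2(x) \, dx$ of the trapezoid function, and the bound obtained after dividing by $2h$, can be checked numerically. The sketch below is only a sanity check under assumed sample values of $a$, $b$, $h$; the function names are illustrative and the integral is approximated with a midpoint rule.

```python
import math


def trapezoid(a, b, h):
    """The trapezoid test function: ramps of slope +1 and -1, plateau of height 2h."""
    def f(x):
        if x <= a - h:
            return 0.0
        if x <= a + h:
            return x - (a - h)
        if x <= b - h:
            return 2.0 * h
        if x <= b + h:
            return (b + h) - x
        return 0.0
    return f


def l2_norm_squared(f, n=100000):
    """Midpoint-rule approximation of the integral of f^2 over [0, 1]."""
    step = 1.0 / n
    return sum(f((k + 0.5) * step) ** 2 for k in range(n)) * step


def closed_form(a, b, h):
    """Two ramps contribute (2h)^3 / 3 each, the plateau (2h)^2 * (b - a - 2h)."""
    return 16.0 * h ** 3 / 3.0 + 4.0 * h ** 2 * (b - a - 2.0 * h)


def scaled_bound(a, b, h):
    """Right hand side of the estimate after dividing by 2h."""
    return math.sqrt(closed_form(a, b, h)) / (2.0 * h)


if __name__ == "__main__":
    a, b = 0.3, 0.7  # assumed sample points with a + h <= b - h
    print(l2_norm_squared(trapezoid(a, b, 0.05)), closed_form(a, b, 0.05))
    for h in (0.05, 0.005, 0.0005):
        # the scaled bound approaches sqrt(b - a) as h shrinks
        print(h, scaled_bound(a, b, h), math.sqrt(b - a))
```

As $h$ shrinks, `scaled_bound(a, b, h)` approaches $\sqrt{b - a}$, which is the constant appearing in the final estimate.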