Questions tagged [variational-bayes]
Variational Bayesian methods approximate intractable integrals that arise in Bayesian inference and machine learning. Primarily, these methods serve one of two purposes: approximating the posterior distribution, or bounding the marginal likelihood of the observed data.
399 questions
1 vote · 1 answer · 58 views
Bayes-by-backprop - meaning of partial derivative
The Google DeepMind paper "Weight Uncertainty in Neural Networks" features the following algorithm:
Note that the $\frac{\partial f(w, \theta)}{\partial w}$ term of the gradients for the mean and standard ...
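For context, the reason that shared term appears (a sketch of the standard chain-rule argument, not a quote from the paper): under the reparameterization $w = \mu + \log(1 + e^{\rho}) \circ \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$,

$$
\Delta_\mu = \frac{\partial f(w, \theta)}{\partial w} + \frac{\partial f(w, \theta)}{\partial \mu},
\qquad
\Delta_\rho = \frac{\partial f(w, \theta)}{\partial w} \cdot \frac{\varepsilon}{1 + e^{-\rho}} + \frac{\partial f(w, \theta)}{\partial \rho},
$$

so the same $\frac{\partial f(w, \theta)}{\partial w}$ factor feeds both the mean and the standard-deviation updates via the chain rule.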
3 votes · 1 answer · 129 views
Bayesian Clustering with a Finite Gaussian Mixture Model with Missing Data
I would like to perform clustering with a finite Gaussian mixture model; however, I have missing data (some features are missing at random). I am using variational inference to fit my Bayesian GMM. Is ...
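A minimal baseline for this setup, assuming scikit-learn and features missing at random (a two-stage impute-then-fit sketch, not a principled treatment that marginalizes missing dimensions inside the variational updates):

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[rng.random(X.shape) < 0.1] = np.nan        # knock out 10% of entries at random

# Stage 1: mean-impute the missing features.
X_imp = SimpleImputer(strategy="mean").fit_transform(X)

# Stage 2: fit a variational Bayesian GMM on the completed data.
gmm = BayesianGaussianMixture(n_components=3, max_iter=500, random_state=0).fit(X_imp)
labels = gmm.predict(X_imp)
```

Mean imputation shrinks within-cluster variance along imputed dimensions, so treat this only as a baseline to compare a proper missing-data treatment against.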
3 votes · 1 answer · 132 views
Simple question about VAEs
I have trouble understanding the minimization of the KL divergence.
At this link, https://www.ibm.com/think/topics/variational-autoencoder, they say, "One obstacle to using KL divergence for ...
2 votes · 0 answers · 130 views
Found negative Posterior Conditional Variance when applying Variational Inference to State-space model. Why?
This paper proposes a computationally efficient parametrization of joint Gaussian approximate posteriors.
I am trying to apply the methods in it to the state-space model below:
$$ x_0 \sim \delta(x_0)\,\ldots $$
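One generic safeguard worth checking, independent of that paper's specific parametrization (a sketch under the assumption that the covariance is built from unconstrained parameters): construct the posterior covariance through a Cholesky factor, so it is positive semi-definite by construction and negative conditional variances cannot arise from unconstrained gradient updates.

```python
import numpy as np

def cov_from_unconstrained(theta, d):
    """Map an unconstrained vector theta (length d*(d+1)/2, hypothetical
    layout) to a valid covariance matrix via a Cholesky factor L."""
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = theta
    # Exponentiate the diagonal so it is strictly positive.
    L[np.diag_indices(d)] = np.exp(np.diag(L))
    return L @ L.T   # positive semi-definite by construction
```

If the method you are implementing instead parametrizes precision or covariance blocks directly, a negative conditional variance usually points to an update applied outside the feasible region rather than to the model itself.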
5 votes · 2 answers · 329 views
Backpropagating regularization term in variational autoencoders
Setup
The variational autoencoder (VAE) loss is given by the following (see here, for example):
$$L = - \sum_{j = 1}^J \frac{1}{2} \left(1 + \log (\sigma_j^2) - \sigma_j^2 - \mu_j^2 \right) - \frac{1}{...
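Differentiating that regularization term directly shows why no special handling is needed during backpropagation (a short sketch, using $j$ for the latent index as in the sum):

$$
\frac{\partial}{\partial \mu_j}\left[-\tfrac{1}{2}\left(1 + \log \sigma_j^2 - \sigma_j^2 - \mu_j^2\right)\right] = \mu_j,
\qquad
\frac{\partial}{\partial \sigma_j}\left[-\tfrac{1}{2}\left(1 + \log \sigma_j^2 - \sigma_j^2 - \mu_j^2\right)\right] = \sigma_j - \frac{1}{\sigma_j},
$$

so the KL part backpropagates through the encoder outputs $\mu_j$ and $\sigma_j$ like any other differentiable loss component, with no path through the decoder.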
1 vote · 0 answers · 81 views
Why can we assume Likelihood and Prior, but not Evidence?
I’ve been studying Variational Inference, and there’s a part I’ve been struggling to understand for the past few days, so I decided to write this post.
As you can see from the title, my question is: ...
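The short version of the standard answer: the likelihood $p(x \mid \theta)$ and the prior $p(\theta)$ are modeling choices we write down explicitly, whereas the evidence is derived from them,

$$
p(x) = \int p(x \mid \theta)\, p(\theta)\, d\theta,
$$

and for most models this integral has no closed form and the parameter space is too high-dimensional for numerical quadrature. That intractability is exactly what variational inference sidesteps by maximizing a lower bound on $\log p(x)$ instead of computing it.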
1 vote · 1 answer · 113 views
Linear VAE and pPCA
I am looking into the relationship between the linear Variational Autoencoder (VAE) and probabilistic PCA (pPCA) presented by Lucas et al. (2019) in the "Don't Blame the ELBO!" paper.
In the official ...
0 votes · 0 answers · 87 views
How to calculate the KL divergence between two multivariate complex Gaussian distributions?
I am reading the paper "Complex-Valued Variational Autoencoder: A Novel Deep Generative Model for Direct Representation of Complex Spectra".
In this paper, the authors calculate the KL ...
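For orientation, assuming both distributions are circularly-symmetric (proper) complex Gaussians, the KL divergence matches the real-Gaussian formula without the $\tfrac{1}{2}$ factor. A small numerical sketch of that identity:

```python
import numpy as np

def kl_complex_gaussian(mu1, S1, mu2, S2):
    """KL( CN(mu1, S1) || CN(mu2, S2) ) for circularly-symmetric complex
    Gaussians of dimension k; note the missing 1/2 vs. the real case."""
    k = mu1.shape[0]
    S2_inv = np.linalg.inv(S2)
    d = mu2 - mu1
    return np.real(
        np.trace(S2_inv @ S1)            # tr(S2^{-1} S1)
        + d.conj() @ S2_inv @ d          # Mahalanobis term
        - k
        + np.log(np.linalg.det(S2) / np.linalg.det(S1))
    )
```

If the paper's encoder is not circularly symmetric (nonzero pseudo-covariance), the augmented real representation of the complex vector is needed instead, and this shortcut does not apply.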
2 votes · 1 answer · 129 views
Why do VAEs work?
I am currently reading about Variational Autoencoders, and although I kind of understand the mathematical background described in the original paper (Auto-Encoding Variational Bayes), I am struggling ...
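The one-identity answer most explanations build on (standard, from the same Auto-Encoding Variational Bayes setting):

$$
\log p_\theta(x) = \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]}_{\text{ELBO}} \; + \; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\right).
$$

Since the KL term is nonnegative, maximizing the ELBO simultaneously raises a lower bound on the data likelihood and pulls the encoder $q_\phi$ toward the intractable true posterior, which is the core reason the training objective works.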
3 votes · 1 answer · 144 views
Variational autoencoders - handcrafted example
In learning about variational autoencoders (VAEs), I would like to come up with a nice little handcrafted example to help make sense of them thoroughly. For this, suppose I know that my samples are ...
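In the same handcrafted spirit, here is a deliberately tiny VAE sketch in PyTorch (all dimensions are made up and chosen so the pieces can be traced by hand; an illustration, not a reference implementation):

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, x_dim=2, z_dim=1):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs [mu, log_var]
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, log_var = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)   # reparameterization
        x_hat = self.dec(z)
        recon = ((x - x_hat) ** 2).sum(-1)                         # Gaussian reconstruction
        kl = 0.5 * (mu**2 + log_var.exp() - log_var - 1).sum(-1)   # closed-form KL
        return (recon + kl).mean()

loss = TinyVAE()(torch.randn(8, 2))
loss.backward()
```

With one latent dimension and linear maps, every quantity (means, variances, the KL term) can be written out on paper, which makes this a useful sanity check for a handcrafted example.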
2 votes · 2 answers · 176 views
VAEs - Two questions regarding the posterior and prior distribution derivations
I'm deeply failing to understand the first step in the ELBO derivation in VAEs.
When asking my questions I'll also try to clearly state my assumptions and perhaps some of them are wrong to begin with:
...
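For orientation, the first step most derivations take (a standard manipulation, stated here because it is where the question stalls): multiply and divide by $q(z \mid x)$ inside the marginal, then apply Jensen's inequality:

$$
\log p(x) = \log \int q(z \mid x)\, \frac{p(x, z)}{q(z \mid x)}\, dz \;\ge\; \mathbb{E}_{q(z \mid x)}\!\left[\log \frac{p(x, z)}{q(z \mid x)}\right] = \mathrm{ELBO}.
$$

The only assumption needed is that $q(z \mid x) > 0$ wherever $p(x, z) > 0$; no property of the prior or posterior is used at this step.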
1 vote · 0 answers · 92 views
How to speed up the following ELBO evaluation?
I have an estimation problem where I need to maximize the evidence lower bound:
$$ \mathrm{ELBO} = -\frac{1}{2} \Bigg( \mathbb{E}_{q(\theta)} \left[ \mathrm{vec}(\mathbf{Z})^{\mathrm{H}} \mathbf{C}^{-...
2 votes · 1 answer · 310 views
Why do we say that we're "predicting" the mean/noise in diffusion models?
In DDPM, ${\tilde\mu}_t$ is the mean of the conditional distribution $q(x_{t-1}|x_t,x_0)$ while the neural network $\mu_\theta$ is modeling a different conditional distribution $p_\theta(x_{t-1}|x_t)$....
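The sense in which the network "predicts" the mean, in one equation (standard DDPM algebra with $\alpha_t = 1 - \beta_t$ and $\bar\alpha_t = \prod_{s \le t} \alpha_s$): parametrizing $\mu_\theta$ through a noise predictor $\epsilon_\theta$ gives

$$
\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{\beta_t}{\sqrt{1 - \bar\alpha_t}}\, \epsilon_\theta(x_t, t) \right),
$$

so "predicting the mean" and "predicting the noise" are reparameterizations of the same regression target, with $\mu_\theta$ trained to match $\tilde\mu_t$ of the $q$-conditional.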
2 votes · 1 answer · 150 views
Why is the forward process referred to as the "ground truth" in diffusion models?
I've seen many tutorials on diffusion models refer to the distribution of the latent variables induced by the forward process as the "ground truth". I wonder why. What we can actually see is ...
9 votes · 2 answers · 640 views
Theoretical justification for minimizing $KL(q_\phi \,\|\, p)$ rather than $KL(p \,\|\, q_\phi)$?
Suppose we have a true but unknown distribution $p$ over some discrete set (i.e. assume no structure or domain knowledge), and a parameterized family of distributions $q_\phi$.
In general it makes ...
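A tiny numerical illustration of the asymmetry (toy, made-up distributions): when $q_\phi$ puts mass where $p$ has none, $KL(q_\phi \,\|\, p)$ blows up while the reverse direction stays finite, which is the zero-forcing / mass-covering distinction in miniature.

```python
import numpy as np

def kl(a, b):
    """Discrete KL(a || b); terms with a == 0 contribute zero by convention."""
    mask = a > 0
    with np.errstate(divide="ignore"):
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

p = np.array([0.5, 0.5, 0.0])    # "true" distribution with an empty bin
q = np.array([0.6, 0.3, 0.1])    # approximation leaking mass into that bin

print(kl(q, p))   # inf: q is punished without limit for mass where p has none
print(kl(p, q))   # finite: the reverse direction tolerates the leak
```

Minimizing $KL(q_\phi \,\|\, p)$ therefore drives $q_\phi$ to avoid regions where $p$ assigns zero probability (mode-seeking), while minimizing $KL(p \,\|\, q_\phi)$ forces $q_\phi$ to cover all of $p$'s support (mass-covering); tractability of the expectation under $q_\phi$ is the usual practical reason for the first choice.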