
All Questions

0 votes
0 answers
40 views

Suppose a hypernetwork $\mathcal{H}$ takes a latent variable $z \sim p_z(z)$, where $p_z$ is Gaussian, and outputs the parameters of another neural network $f$. In particular, each weight $w_i$ of $f$ ...
asked by rando (360)
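A minimal sketch of the setup this question describes: a hypernetwork $\mathcal{H}$ maps a Gaussian latent $z$ to the parameters of a small network $f$. The dimensions and the linear form of $\mathcal{H}$ are illustrative assumptions, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the question): latent dim 4; the target
# network f is a single linear layer 3 -> 1, so 3 weights + 1 bias.
Z_DIM, N_PARAMS = 4, 4

# Hypernetwork H: here just a linear map z -> theta (the weights of f).
H = rng.standard_normal((N_PARAMS, Z_DIM)) * 0.1

def sample_f_params(rng):
    """Draw z ~ N(0, I) and map it to the parameters of f."""
    z = rng.standard_normal(Z_DIM)
    return H @ z  # each weight w_i of f is a (linear) function of z

def f(x, theta):
    """Target network f: a linear model with hypernet-generated params."""
    w, b = theta[:3], theta[3]
    return x @ w + b

theta = sample_f_params(rng)
y = f(np.ones(3), theta)
```

Because this $\mathcal{H}$ is linear, each $w_i$ is itself Gaussian; with a nonlinear hypernetwork the induced weight distribution is generally not.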
0 votes
0 answers
19 views

I'm reading the Deep Learning book by Goodfellow, Bengio, and Courville (Chapter 8 section 8.7.1 on Batch Normalization, page 315). The authors use a simple example of a deep linear network without ...
asked by spierenb
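The book's example referenced here is a deep *linear* network with scalar weights, $\hat{y} = x\, w_1 w_2 \cdots w_l$, where small changes to early weights are amplified multiplicatively through later layers. A quick numpy sketch of that example and of batch-normalizing the final activations (the depth and weight scale below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

w = rng.normal(1.0, 0.2, size=10)   # 10 scalar "layers" (illustrative)
x = rng.standard_normal(1000)       # a batch of scalar inputs

h = x.copy()
for wi in w:
    h = wi * h                      # scale drifts by prod(w_i) through depth

# Batch normalization: subtract the batch mean and divide by the batch
# std, which removes the accumulated scale prod(w_i) (up to sign).
h_bn = (h - h.mean()) / h.std()
```

After normalization the statistics of the layer no longer depend on the product of the earlier weights, which is the point of the book's example.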
1 vote
0 answers
24 views

I really want to play around with RNNs. I'm trying to build an RNN-based AI assistant to run on my machine, as I'm fascinated by RNN models... To get good performance, I think I need to do some ...
asked by jupyter (111)
0 votes
0 answers
9 views

What other optimization tools are powerful enough to improve a neural network model's accuracy? Please suggest recent, powerful tools.
asked by bbadyalina
0 votes
0 answers
17 views

I’m using an ensemble of M = 5 deep neural networks, each evaluated with T = 100 Monte Carlo dropout samples at test time to estimate predictive uncertainty. The model performs binary classification (...
asked by Solomon123
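For the $M = 5$ ensemble with $T = 100$ MC-dropout samples described here, a common way to turn the $M \times T$ predicted probabilities into uncertainty estimates is the entropy decomposition total = aleatoric + epistemic (mutual information). A sketch, with random placeholder probabilities standing in for the real forward passes:

```python
import numpy as np

rng = np.random.default_rng(0)

M, T, N = 5, 100, 8   # ensemble members, MC-dropout samples, test points
EPS = 1e-12

# Placeholder for the real models: probs[m, t, n] is the positive-class
# probability from member m, dropout sample t, on input n.
probs = rng.uniform(0.05, 0.95, size=(M, T, N))

# Predictive probability: average over all M*T stochastic forward passes.
p_mean = probs.mean(axis=(0, 1))                      # shape (N,)

def entropy(p):
    """Binary entropy, elementwise."""
    return -(p * np.log(p + EPS) + (1 - p) * np.log(1 - p + EPS))

total = entropy(p_mean)                   # entropy of the mean prediction
aleatoric = entropy(probs).mean(axis=(0, 1))  # mean per-sample entropy
epistemic = total - aleatoric             # mutual information, >= 0
```

By Jensen's inequality (entropy is concave), the epistemic term is nonnegative.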
0 votes
0 answers
25 views

I’m working on a video classification task with a long-tailed dataset where a few classes have many samples while most classes have very few. More specifically, my dataset has around 9k samples and 3....
asked by Olivia (191)
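For a long-tailed label distribution like the one described, a common first remedy is inverse-frequency class weighting in the loss. A sketch with made-up class counts (not the asker's actual 9k-sample dataset):

```python
import numpy as np

# Illustrative long-tailed class counts: a few head classes, many tail ones.
counts = np.array([4000, 2500, 1200, 500, 150, 60, 25, 10])

# Inverse-frequency class weights, normalized so they average to 1;
# these can be passed to a weighted cross-entropy loss.
weights = 1.0 / counts
weights = weights * len(counts) / weights.sum()
```

Rare classes receive the largest weights, so their gradient contributions are upscaled relative to head classes.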
1 vote
0 answers
26 views

The paper "Deep Quantile Regression: Mitigating the Curse of Dimensionality Through Composition" makes the following claim (top of page 4): It is clear that smoothness is not the right ...
asked by Chris (322)
2 votes
0 answers
21 views

In the paper "Deep Residual Learning for Image Recognition", it's been mentioned that "When deeper networks are able to start converging, a degradation problem has been exposed: with ...
asked by Vignesh N
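The degradation problem quoted here is what residual connections address: with a skip path $y = \mathrm{relu}(x + F(x))$, the block can fall back to the identity when the residual branch $F$ is driven to zero, so a deeper stack can at least match a shallower one. A small numpy sketch (dimensions illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """y = relu(x + F(x)) with F(x) = W2 relu(W1 x).
    If W1, W2 -> 0 the branch vanishes and the block is (relu of) the
    identity, which is the paper's answer to the degradation problem."""
    return relu(x + W2 @ relu(W1 @ x))

d = 4
x = rng.standard_normal(d)
# Zero weights: the block reduces to relu(x), i.e. the identity on the
# nonnegative part of x.
y = residual_block(x, np.zeros((d, d)), np.zeros((d, d)))
```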
0 votes
0 answers
24 views

I am new to GAIN (generative adversarial imputation network). I am trying to use GAIN to impute missing values. I have a question about the values of the losses for the discriminator. Are the values ...
asked by JonathonSoong
0 votes
0 answers
40 views

A key element in Bayesian neural networks is finding the probability of a set of weights, so that it can be applied in Bayes' rule. I cannot think of many ways of doing this for P(w) (also sometimes ...
asked by user494234
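One standard concrete choice for the prior $P(w)$ asked about here is an isotropic Gaussian $\mathcal{N}(0, \sigma^2 I)$ over all weights, which gives a closed-form log-density. A sketch (the unit variance is an illustrative choice):

```python
import numpy as np

SIGMA = 1.0  # prior std; a hyperparameter, assumed 1 here

def log_prior(w, sigma=SIGMA):
    """log N(w | 0, sigma^2 I), summed over all weights."""
    w = np.asarray(w)
    n = w.size
    return (-0.5 * n * np.log(2 * np.pi * sigma**2)
            - 0.5 * np.sum(w**2) / sigma**2)

# log P(w) enters Bayes' rule additively:
#   log P(w | D) = log P(D | w) + log P(w) - log P(D)
lp = log_prior(np.zeros(3))
```

Other common choices include scale mixtures of Gaussians (as in the "Weight Uncertainty in Neural Networks" paper this user asks about elsewhere) and Laplace priors.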
1 vote
1 answer
58 views

The Google Deepmind paper "Weight Uncertainty in Neural Networks" features the following algorithm: Note that the $\frac{∂f(w,θ)}{∂w}$ term of the gradients for the mean and standard ...
asked by user494234
1 vote
1 answer
113 views

From the above, I am trying to derive the below: However, I do not see why the $q_\theta(w)$ has been omitted from $\log p(D)$, in equation 17 and 18. Here is my attempt to derive the above: $$\begin{...
asked by user494234
3 votes
0 answers
60 views

I am modelling the sequence $\{(a_t,y_t)\}_t$ as follows: $$ \begin{cases} Y_{t+1} &= g_\nu(X_{t+1}) + \alpha V_{t+1}\\ X_{t+1} &= X_t + \mu_\xi(a_t) + \sigma_\psi(a_t)Z_{t+1}\\ X_0 &= ...
asked by Uomond (51)
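The controlled state-space model in this excerpt can be simulated directly once $g_\nu$, $\mu_\xi$, and $\sigma_\psi$ are fixed. The simple function choices and the initial condition below are placeholder assumptions (the question's $X_0$ is truncated; these maps would normally be learned networks):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder maps standing in for the learned g_nu, mu_xi, sigma_psi:
g = np.tanh                            # observation map g_nu
mu = lambda a: 0.1 * a                 # action-dependent drift mu_xi
sigma = lambda a: 0.2 + 0.05 * abs(a)  # action-dependent noise sigma_psi
alpha = 0.05                           # observation-noise scale
T = 50

x = 0.0        # X_0 (truncated in the question; assumed 0 here)
ys = []
for t in range(T):
    a = rng.uniform(-1, 1)                             # action a_t
    x = x + mu(a) + sigma(a) * rng.standard_normal()   # X_{t+1}
    y = g(x) + alpha * rng.standard_normal()           # Y_{t+1}
    ys.append(y)
```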
0 votes
0 answers
65 views

Basically, the question above: in RL, people typically encode the state as a tensor consisting of a plane with "channels", e.g. the original AlphaZero paper. These channels are typically one-...
asked by FriendlyLagrangian
0 votes
0 answers
38 views

I am currently learning about flow matching models and wanted to test whether we could train a flow matching model on just two time steps, 0 and 0.5, and sample at only those two time steps to ...
asked by Bill Wang
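Under the common linear-interpolation path $x_t = (1-t)x_0 + t\,x_1$, the flow-matching regression target is the constant velocity $v = x_1 - x_0$, so restricting training to $t \in \{0, 0.5\}$ is easy to prototype. The linear path and two-step Euler sampler below are standard choices assumed here, not stated in the excerpt:

```python
import numpy as np

rng = np.random.default_rng(0)

TIMESTEPS = np.array([0.0, 0.5])  # the two time steps in the question

def make_training_pair(x0, x1, t):
    """Interpolant and velocity target for one (x0, x1, t) triple."""
    x_t = (1 - t) * x0 + t * x1
    v_target = x1 - x0            # constant in t for the linear path
    return x_t, v_target

x0 = rng.standard_normal(2)       # noise sample
x1 = np.array([1.0, -1.0])        # data sample (illustrative)
t = rng.choice(TIMESTEPS)         # train only at t in {0, 0.5}
x_t, v = make_training_pair(x0, x1, t)

# Sampling with two Euler steps of size 0.5 evaluates the (learned)
# velocity only at t = 0 and t = 0.5:
#   x_{0.5} = x_0 + 0.5 * v(x_0, 0);  x_1 = x_{0.5} + 0.5 * v(x_{0.5}, 0.5)
```

With the exact conditional velocity, the two Euler steps reach $x_1$ exactly; with a learned model the question becomes how much the coarse two-step discretization and restricted training grid degrade the samples.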
