487 votes
14 answers
285k views

What is the difference between epoch and iteration when training a multi-layer perceptron?
mohammad's user avatar
  • 5,025
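
A minimal sketch of the usual convention behind this question, assuming that one iteration means one parameter update on a single mini-batch and one epoch means one full pass over the training set. The function name and the sample numbers are hypothetical, chosen only for illustration.

```python
import math


def iterations_per_epoch(num_samples: int, batch_size: int) -> int:
    """Mini-batch updates needed to see every sample once.

    Assumes the last, possibly smaller, batch is kept rather than dropped.
    """
    return math.ceil(num_samples / batch_size)


# Hypothetical numbers: 1000 training samples, batches of 100, 3 epochs.
per_epoch = iterations_per_epoch(1000, 100)
total_iterations = per_epoch * 3
print(per_epoch, total_iterations)
```

Under these assumptions, the epoch count says how many times the data is revisited, while the iteration count says how many weight updates were made.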
470 votes
10 answers
321k views

In the following TensorFlow function, we must feed the activation of artificial neurons in the final layer. That I understand. But I don't understand why it is called logits? Isn't that a mathematical ...
Milad P.'s user avatar
  • 5,107
446 votes
16 answers
448k views

What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool in TensorFlow? In my opinion, 'VALID' means there will be no zero padding outside the edges when we do max pooling. ...
karl_TUM's user avatar
  • 5,929
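
A rough sketch of how the two padding modes affect the output size along one spatial dimension, using the output-shape formulas commonly given for TensorFlow pooling and convolution. The helper names `pooled_length` and `same_total_padding` are hypothetical and not part of TensorFlow; the code only does the arithmetic and does not call the library.

```python
import math


def pooled_length(n: int, k: int, s: int, padding: str) -> int:
    """Output length for input length n, window size k, stride s."""
    if padding == "VALID":
        # No padding: only windows that fit entirely inside the input.
        return math.ceil((n - k + 1) / s)
    if padding == "SAME":
        # Padded so that the output length depends only on n and s.
        return math.ceil(n / s)
    raise ValueError(f"unknown padding mode: {padding!r}")


def same_total_padding(n: int, k: int, s: int) -> int:
    """Total padded elements that 'SAME' needs along this dimension."""
    out = pooled_length(n, k, s, "SAME")
    return max((out - 1) * s + k - n, 0)


# Input of length 5, window 2, stride 2.
print(pooled_length(5, 2, 2, "VALID"))  # the trailing element is dropped
print(pooled_length(5, 2, 2, "SAME"))   # one padded element is added
print(same_total_padding(5, 2, 2))
```

With 'VALID', input elements that do not fill a complete window are discarded; with 'SAME', enough padding is added that every input element is covered.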
435 votes
3 answers
382k views

For any Keras layer (Layer class), can someone explain how to understand the difference between input_shape, units, dim, etc.? For example the doc says units specify the output shape of a layer. ...
scarecrow's user avatar
  • 6,874
409 votes
4 answers
89k views

While trying to reconcile my understanding of LSTMs, as described in this post by Christopher Olah, with their implementation in Keras, and following the blog written by Jason Brownlee for the Keras tutorial, I am ...
sachinruk's user avatar
  • 10k
401 votes
11 answers
469k views

How do I save a trained model in PyTorch? I have read that: torch.save()/torch.load() is for saving/loading a serializable object. model.state_dict()/model.load_state_dict() is for saving/loading ...
Wasi Ahmad's user avatar
  • 38.1k
387 votes
9 answers
324k views

Why does zero_grad() need to be called during training? | zero_grad(self) | Sets gradients of all model parameters to zero.
user1424739's user avatar
  • 14.2k
358 votes
11 answers
503k views

How do I print the summary of a model in PyTorch like what model.summary() does in Keras: Model Summary: ...
Wasi Ahmad's user avatar
  • 38.1k
303 votes
6 answers
369k views

When should I use .eval()? I understand it is supposed to allow me to "evaluate my model". How do I turn it back off for training? Example training code using .eval().
Gulzar's user avatar
  • 29k
282 votes
11 answers
527k views

How do I initialize weights and biases of a network (via e.g. He or Xavier initialization)?
Fábio Perez's user avatar
  • 26.7k
279 votes
3 answers
309k views

When I train my neural network with Theano or TensorFlow, it reports a variable called "loss" per epoch. How should I interpret this variable? Is a higher loss better or worse, or what does it ...
mamatv's user avatar
  • 3,661
241 votes
6 answers
278k views

Does it call forward() in nn.Module? I thought that when we call the model, the forward method is used. Why do we need to specify train()?
aerin's user avatar
  • 23.1k
235 votes
13 answers
327k views

I have trained a binary classification model with CNN, and here is my code model = Sequential() model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode=...
GoingMyWay's user avatar
  • 17.6k
221 votes
12 answers
257k views

I'm trying to train a CNN to categorize text by topic. When I use binary cross-entropy I get ~80% accuracy, with categorical cross-entropy I get ~50% accuracy. I don't understand why this is. It's a ...
Daniel Messias's user avatar
214 votes
10 answers
190k views

In most of the models, there is a steps parameter indicating the number of steps to run over the data. Yet in most practical usage, I see that we also execute the fit function for N epochs. What is the ...
Yang's user avatar
  • 6,972
