
Unanswered Questions

254 questions with no upvoted or accepted answers
6 votes
0 answers
250 views

Connect output node to next hidden node in RNN

I'm trying to build a neural network with an unconventional architecture and am having trouble figuring out how. Usually we have connections like so, where $X=$ input, $H=$ hidden layer, $Y=$ output ...
5 votes
1 answer
2k views

Difference Between Attention and Fully Connected Layers in Deep Learning

There have been several papers in the last few years on the so-called "Attention" mechanism in deep learning (e.g. 1 2). The concept seems to be that we want the neural network to focus on ...
5 votes
1 answer
192 views

LSTM Long Term Dependencies Keras

I am familiar with the LSTM unit (memory cell, forget gate, output gate, etc.), but I am struggling to see how this links to the LSTM implementation in Keras. In Keras, the input data structure for X ...
3 votes
0 answers
58 views

Do i.i.d. assumptions extend to datasets of independently generated sequences in modern sequence models (e.g., RNNs)?

In standard machine learning settings with cross-sectional data, it's common to assume that data points are independently and identically distributed (i.i.d.) from some fixed data-generating process (...
3 votes
0 answers
92 views

How is the input gate in the LSTM learned?

How is the input gate trained to decide what to remember by propagating the error from predicting the next word in the language model? How does it help the network learn whether it remembered the right ...
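For context, the input gate has its own weight matrices and is trained by backpropagation through time like every other part of the LSTM. In the standard notation (a sketch using common symbols, none of which appear in the excerpt itself):

```latex
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i), \qquad
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
```

Because the loss (e.g. next-word cross-entropy) depends on the cell state $c_t$ through $i_t$, the gradient $\partial L / \partial W_i$ is nonzero, so the gate's weights are pushed toward admitting inputs that reduce prediction error.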
3 votes
1 answer
127 views

Preprocessing a time sequence

I have a long list of events (400 unique events, sequence ~10M long). I want to train an RNN to predict the next event. The preprocessing steps I took are: (1) turning it into a one-hot encoding using pandas: ...
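A minimal sketch of the one-hot step described above, using a hypothetical toy event sequence (the real one has ~400 unique events):

```python
import pandas as pd

# Hypothetical toy event sequence standing in for the ~10M-long real one.
events = pd.Series(["login", "click", "click", "buy", "login"])

# pd.get_dummies produces one column per unique event (one-hot encoding);
# each timestep becomes a vector of length n_unique_events.
onehot = pd.get_dummies(events)

print(onehot.shape)  # (5, 3): 5 timesteps, 3 unique events
```

For a 10M-step sequence with 400 event types, materializing the full one-hot matrix is expensive; mapping events to integer indices and feeding them to an embedding layer is the usual lighter-weight alternative.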
3 votes
1 answer
1k views

Should I shift a dataset to use it for time series regression with an RNN/LSTM?

I'm following this tutorial to learn how to use an LSTM to predict time series data, and I noticed that the author shifted the target/labels so that the features are all at time t but the target is at t+1, so my ...
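The shift the question describes can be sketched with pandas (a toy series standing in for the real signal):

```python
import pandas as pd

# Hypothetical toy series; in practice this is the measured signal.
s = pd.Series([10, 20, 30, 40, 50])

# shift(-1) moves the series one step earlier, so each row pairs the
# value at time t (feature) with the value at t+1 (target).
df = pd.DataFrame({"x_t": s, "y_t_plus_1": s.shift(-1)})
df = df.dropna()  # the last timestep has no "next value" to predict

print(df)
```

Without this shift the model would be trained to reproduce its own input rather than to predict one step ahead.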
3 votes
1 answer
116 views

Contextual Spell Correction

I want to create a spell checker that corrects the spelling mistakes contextually. For example, Erroneous sentence: I want to apply for credit cart Corrected sentence: I want to apply for credit ...
3 votes
0 answers
345 views

How do I implement masking in TensorFlow eager execution?

I am training a stateful RNN on variable-length sequences (optional: see my previous question for more details). I padded the sequences to a fixed length with the value -1. Then, when batches are ...
3 votes
1 answer
3k views

One hot encoding as input to recurrent neural networks

I'm trying to predict the next label in a pattern based on previous labels using a recurrent neural network. In total I have 100 labels. Example of input pattern: ...
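One-hot encoding a sequence of integer labels for an RNN can be sketched in a couple of NumPy lines (the label values here are hypothetical):

```python
import numpy as np

n_labels = 100  # as in the question
labels = np.array([3, 17, 3, 42])  # hypothetical pattern of previous labels

# Indexing the rows of an identity matrix yields one-hot vectors:
# each timestep becomes a 100-dimensional vector with a single 1.
onehot = np.eye(n_labels, dtype=np.float32)[labels]

print(onehot.shape)  # (4, 100): sequence length 4, one vector per step
```

Reshaped to (batch, timesteps, 100), this is the usual input shape for a recurrent layer; with many labels, an integer input plus an embedding layer is a common alternative.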
3 votes
1 answer
855 views

Keras functional API Layer name not captured with TimeDistributed wrapper

...
3 votes
1 answer
5k views

Predicting next number in a sequence - data analysis

I am a machine learning newbie and I am working on a project where I'm given a sequence of integers all of which are in the range 0 to 70. My goal is to predict the next integer in the sequence given ...
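The usual way to turn this into a supervised learning problem is a sliding window over the sequence; a minimal sketch with hypothetical data:

```python
import numpy as np

def make_windows(seq, window):
    """Split a sequence into (previous `window` values, next value) pairs —
    the standard supervised framing for next-step prediction."""
    X, y = [], []
    for i in range(len(seq) - window):
        X.append(seq[i:i + window])
        y.append(seq[i + window])
    return np.array(X), np.array(y)

# Hypothetical sequence of integers in [0, 70].
seq = [5, 12, 19, 26, 33, 40, 47]
X, y = make_windows(seq, window=3)

print(X.shape, y.shape)  # (4, 3) (4,)
```

Since the values are bounded integers (0 to 70), the task can be framed as 71-class classification rather than regression, which often works better for discrete targets.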
3 votes
0 answers
122 views

Encoder-Decoder Sequence-to-Sequence Model for Translations in Both Directions

Is it possible to use a pre-trained sequence to sequence encoder-decoder model which translates an input text in source language to an output in target language to do an inverse translation? That is, ...
3 votes
0 answers
457 views

For stateful LSTM, does sequence length matter?

With stateful LSTM the entire state is retained between both the sequences in the batch that is submitted, and even between separate batches until ...
3 votes
0 answers
489 views

Neural Network Prediction regression task, output is a multiple factor of input with same peaks

If I missed any details, please point them out. I made a simple sequential LSTM model for regression. The model loss is 3.2145e-06. The data is scaled between 0 and 1. I tried different variations ...
