
I fit a model using a Keras Sequential network with LSTM layers. An LSTM carries certain aspects of its state forward from one period t to the next, in addition to the input sequence of values for that period.

I fit the model on data ending Dec 31, 2025. Now I want to make a forecast starting today January 21, 2026. On the first period of the forecast, where does .predict() get the value of the hidden state? Is this initialized at a fixed value? Or is it based on the model fit last actual (Dec 31 as if the forecast were starting Jan 1st)?

While this question is conceptual, I include an example program in case it matters.

from keras.models import Sequential
from keras.layers import LSTM, Dense

# ts, do, ep, and verb are defined elsewhere
model1 = Sequential([
    LSTM(128, input_shape=(ts, X1.shape[2]), dropout=0, recurrent_dropout=0, return_sequences=True),  # Default activations
    LSTM(64, dropout=do, recurrent_dropout=0),  # Default activations
    Dense(1)  # Output layer
])

model1.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_absolute_error'])

history = model1.fit(
    X1, y1, 
    epochs=ep, 
    batch_size=32, 
    verbose=verb)
With your current model, .predict() does not carry over the hidden state from Dec 31. The LSTM state is reset to its default initial state (zeros) at the start of prediction. – commented Jan 21 at 16:58

2 Answers


In your model, you did not set stateful=True, so this is a stateless LSTM, which is the default in Keras.

LSTM(128, input_shape=(ts, X1.shape[2]), dropout=0, recurrent_dropout=0, return_sequences=True),
LSTM(64, dropout=0, recurrent_dropout=0),

Because the LSTM is stateless, the hidden state does not persist across batches or epochs: each call starts with h_0 = 0 and c_0 = 0.

So when you call the model's predict function, the LSTM does not remember Dec 31, 2025, even though it was the last training date.

model1.predict(X_future)

If you want to forecast Jan 21 → Jan 30 recursively:

  1. Predict Jan 21 using last real window
  2. Append prediction to window
  3. Predict Jan 22
  4. Repeat

This is a matter of recursive forecasting (as opposed to teacher forcing), not of hidden-state persistence.
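The four steps above can be sketched as a loop. This is an illustrative sketch, not the asker's code; the window shape and a single-feature target are assumptions:

```python
import numpy as np

def recursive_forecast(model, last_window, n_steps):
    """Roll a one-step model forward n_steps periods, feeding each
    prediction back in as the newest observation.

    last_window: shape (ts, n_features) -- the last real window.
    Assumes the target is also the model's only input feature.
    """
    window = last_window.copy()
    preds = []
    for _ in range(n_steps):
        # Keras predict expects a batch axis: (1, ts, n_features)
        yhat = float(model.predict(window[np.newaxis, ...], verbose=0)[0, 0])
        preds.append(yhat)
        # Slide the window: drop the oldest step, append the prediction
        window = np.vstack([window[1:], [[yhat]]])
    return np.array(preds)
```

Note that every predict call inside the loop again starts from zero hidden state; only the sliding input window carries information forward.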


1 Comment

Thanks. So I gather from this that if you call model.predict() in a loop, the state is not retained from iteration to iteration by default: for i=1 you start with hidden_state=0, and for i=2 hidden_state=0 again. However, if you specify stateful=True on the LSTM layer, then the state will be retained until you call model.reset_states()?
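That is the stateful=True contract in a nutshell. A toy numpy illustration of it (not Keras code; the recurrence here is a stand-in for a real LSTM step, and the class name is made up for the example):

```python
import numpy as np

class StatefulRunner:
    """Toy model of the stateful=True contract: hidden state survives
    across predict() calls until reset_states() is invoked."""

    def __init__(self, n_units):
        self.n_units = n_units
        self.reset_states()

    def reset_states(self):
        # Same zero initialization a fresh (stateless) call would use
        self.h = np.zeros(self.n_units)

    def predict(self, x):
        # Stand-in recurrence; a real model would apply the LSTM gates here
        self.h = np.tanh(self.h + x)
        return self.h
```

With the asker's stateless model, every predict() would instead behave as if reset_states() had just been called.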

On the first period of the forecast, where does .predict() get the value of the hidden state? Is this initialized at a fixed value? Or is it based on the model fit last actual (Dec 31 as if the forecast were starting Jan 1st)?

Initialized to a fixed value, specifically zero.

You can see this by looking at the code that initializes the hidden and cell state in LSTMCell.get_initial_state():

    def get_initial_state(self, batch_size=None):
        return [
            ops.zeros((batch_size, d), dtype=self.compute_dtype)
            for d in self.state_size
        ]

https://github.com/keras-team/keras/blob/ad27d9ccefa97ec0496ec3ffaa1df5d731acd636/keras/src/layers/rnn/lstm.py#L324-328
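For intuition, here is a hedged numpy re-implementation of the forward pass of a single LSTM layer (gate ordering i, f, g, o, as Keras splits its kernel). Starting from the zero state returned above means the output of a call is fully determined by the inputs and weights, so repeated calls cannot differ:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. Gate order i, f, g (candidate), o.
    W: (n_in, 4n), U: (n_units, 4n), b: (4n,)."""
    n = h.shape[-1]
    z = x @ W + h @ U + b
    i = sigmoid(z[:, :n])
    f = sigmoid(z[:, n:2 * n])
    g = np.tanh(z[:, 2 * n:3 * n])
    o = sigmoid(z[:, 3 * n:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def run_sequence(xs, W, U, b, n_units):
    # Like a stateless Keras layer: every call starts from zeros,
    # exactly as LSTMCell.get_initial_state() returns.
    h = np.zeros((xs.shape[0], n_units))
    c = np.zeros((xs.shape[0], n_units))
    for t in range(xs.shape[1]):
        h, c = lstm_step(xs[:, t, :], h, c, W, U, b)
    return h
```

Running run_sequence twice on the same inputs returns identical hidden states; nothing from the first call leaks into the second, which is exactly the stateless behavior described in the answers.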
