
I fit a model using a Keras Sequential network with LSTM layers. An LSTM carries certain aspects of its state forward from one period t to the next, in addition to the input sequence of values for that period.

I fit the model on data ending Dec 31, 2025. Now I want to make a forecast starting today January 21, 2026. On the first period of the forecast, where does .predict() get the value of the hidden state? Is this initialized at a fixed value? Or is it based on the model fit last actual (Dec 31 as if the forecast were starting Jan 1st)?

While this question is conceptual, I include an example program in case it matters.

from keras.models import Sequential
from keras.layers import LSTM, Dense

# ts, do, ep, and verb are defined elsewhere
model1 = Sequential([
    LSTM(128, input_shape=(ts, X1.shape[2]), dropout=0, recurrent_dropout=0, return_sequences=True),  # Default activations
    LSTM(64, dropout=do, recurrent_dropout=0),  # Default activations
    Dense(1)  # Output layer
])

model1.compile(optimizer='adam', loss='mean_squared_error', metrics=['mean_absolute_error'])

history = model1.fit(
    X1, y1, 
    epochs=ep, 
    batch_size=32, 
    verbose=verb)
With your current model, .predict() does not carry over the hidden state from Dec 31. The LSTM state is reset to its default initial state (zeros) at the start of prediction. – commented Jan 21 at 16:58

2 Answers


In your model, you did not set stateful=True, so this is a stateless LSTM, which is the default in Keras.

LSTM(128, input_shape=(ts, X1.shape[2]), dropout=0, recurrent_dropout=0, return_sequences=True),
LSTM(64, dropout=0, recurrent_dropout=0),

Because the LSTM is stateless, the hidden state does not persist across batches or epochs: each call starts with h_0 = 0 and c_0 = 0.

So when you call the model's predict function, the LSTM does not remember Dec 31, 2025, even though it was the last training date.

model1.predict(X_future)

If you want to forecast Jan 21 → Jan 30 recursively:

  1. Predict Jan 21 using last real window
  2. Append prediction to window
  3. Predict Jan 22
  4. Repeat

This is a matter of recursive forecasting (as opposed to teacher forcing), not of hidden-state persistence.
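The four steps above can be sketched as a loop. This is an illustrative sketch, not the asker's code; the window shape and a single-feature target are assumptions:

```python
import numpy as np

def recursive_forecast(model, last_window, n_steps):
    """Roll a one-step model forward n_steps periods, feeding each
    prediction back in as the newest observation.

    last_window: shape (ts, n_features) -- the last real window.
    Assumes the target is also the model's only input feature.
    """
    window = last_window.copy()
    preds = []
    for _ in range(n_steps):
        # Keras predict expects a batch axis: (1, ts, n_features)
        yhat = float(model.predict(window[np.newaxis, ...], verbose=0)[0, 0])
        preds.append(yhat)
        # Slide the window: drop the oldest step, append the prediction
        window = np.vstack([window[1:], [[yhat]]])
    return np.array(preds)
```

Note that every predict call inside the loop again starts from zero hidden state; only the sliding input window carries information forward.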


1 Comment

Thanks. So I gather from this that if you call model.predict() in a loop, the state is not retained from iteration to iteration by default: for i=1 you start with hidden_state=0, and for i=2 hidden_state=0 again. However, if you specify stateful=True on the LSTM layer, then the state will be retained until you call model.reset_states()?
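That is the stateful=True contract in a nutshell. A toy numpy illustration of it (not Keras code; the recurrence here is a stand-in for a real LSTM step, and the class name is made up for the example):

```python
import numpy as np

class StatefulRunner:
    """Toy model of the stateful=True contract: hidden state survives
    across predict() calls until reset_states() is invoked."""

    def __init__(self, n_units):
        self.n_units = n_units
        self.reset_states()

    def reset_states(self):
        # Same zero initialization a fresh (stateless) call would use
        self.h = np.zeros(self.n_units)

    def predict(self, x):
        # Stand-in recurrence; a real model would apply the LSTM gates here
        self.h = np.tanh(self.h + x)
        return self.h
```

With the asker's stateless model, every predict() would instead behave as if reset_states() had just been called.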

On the first period of the forecast, where does .predict() get the value of the hidden state? Is this initialized at a fixed value? Or is it based on the model fit last actual (Dec 31 as if the forecast were starting Jan 1st)?

Initialized to a fixed value, specifically zero.

You can see this by looking at the code that initializes the hidden and cell state in LSTMCell.get_initial_state():

    def get_initial_state(self, batch_size=None):
        return [
            ops.zeros((batch_size, d), dtype=self.compute_dtype)
            for d in self.state_size
        ]

https://github.com/keras-team/keras/blob/ad27d9ccefa97ec0496ec3ffaa1df5d731acd636/keras/src/layers/rnn/lstm.py#L324-328
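For intuition, here is a hedged numpy re-implementation of the forward pass of a single LSTM layer (gate ordering i, f, g, o, as Keras splits its kernel). Starting from the zero state returned above means the output of a call is fully determined by the inputs and weights, so repeated calls cannot differ:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. Gate order i, f, g (candidate), o.
    W: (n_in, 4n), U: (n_units, 4n), b: (4n,)."""
    n = h.shape[-1]
    z = x @ W + h @ U + b
    i = sigmoid(z[:, :n])
    f = sigmoid(z[:, n:2 * n])
    g = np.tanh(z[:, 2 * n:3 * n])
    o = sigmoid(z[:, 3 * n:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def run_sequence(xs, W, U, b, n_units):
    # Like a stateless Keras layer: every call starts from zeros,
    # exactly as LSTMCell.get_initial_state() returns.
    h = np.zeros((xs.shape[0], n_units))
    c = np.zeros((xs.shape[0], n_units))
    for t in range(xs.shape[1]):
        h, c = lstm_step(xs[:, t, :], h, c, W, U, b)
    return h
```

Running run_sequence twice on the same inputs returns identical hidden states; nothing from the first call leaks into the second, which is exactly the stateless behavior described in the answers.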
