I'm working on a small project that uses Modern Portfolio Theory (MPT) for portfolio optimization; the final required output is a recommendation of which assets to buy and how much of each.
To achieve that goal, I use MPT with the Black–Litterman model, which are non-AI and use deterministic logic. However, the Black–Litterman model takes an input called 'views': as the name suggests, these are opinions or predictions about the assets' actual returns. I generate those views with an LSTM model trained on data of shape (1518, 484) (i.e., 1518 samples and 484 different equities).
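For completeness, this is roughly how the predicted returns become Black–Litterman views. A minimal sketch, assuming PyPortfolioOpt is installed; `pred` and `df` come from the code further down, and the variable names here are mine:

from pypfopt import risk_models
from pypfopt.black_litterman import BlackLittermanModel

S = risk_models.sample_cov(df, returns_data=True)  # asset covariance from the return history
viewdict = dict(zip(df.columns, pred[-1]))         # one absolute view per equity from the LSTM's last prediction
bl = BlackLittermanModel(S, absolute_views=viewdict)
posterior_returns = bl.bl_returns()                # posterior expected returns that feed the MPT optimizer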
Here is reproducible code that predicts the returns of some assets:
# Necessary libraries.
import yfinance as yf
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from tensorflow import keras
import matplotlib.pyplot as plt
from sklearn import preprocessing
from numpy.typing import NDArray
from typing import Any
# Downloading the data and storing it locally.
equities=['PNC','MDLZ','FAST','FCX','ESI','RGLD','TSLA','PCAR','C','IBM','PPC','CVS','AZO','MTCH','PFGC','PR','VNOM','MAA'] # can add other equities; up to 484.
x = yf.download(equities, start="2020-04-09", auto_adjust=True, threads=True, interval="1d", group_by="ticker")
df = x.xs("Close", level=1, axis=1).pct_change().dropna()
df.to_parquet("latest_close_returns.parquet")
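# Sanity check (my addition, not part of the original pipeline): yfinance returns
# all-NaN columns for tickers it fails to fetch, and the dropna() above would then
# drop every row; fail loudly instead of silently training on an empty frame.
assert len(df) > 0, "empty return matrix - did a ticker fail to download?"
assert df.shape[1] == len(equities), "some tickers are missing from the download"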
def create_sequences(data: pd.DataFrame, window: int) -> tuple[NDArray[Any], NDArray[Any]]:
    """
    Construct labeled sequences from unlabeled time series data.

    Args:
        data (pd.DataFrame): Unlabeled time series data.
        window (int): Number of past time steps (days) used as input to predict
            the next time step. For example, a window of 50 means the model uses
            the previous 50 days of data to forecast day 51.

    Returns:
        X: Three-dimensional array, e.g. (1468, 50, 484): 1468 samples, a window
            of 50, and 484 features.
        y: Two-dimensional labels, e.g. (1468, 484): one label vector per sample.
    """
    X, y = [], []
    for i in range(len(data) - window):
        X.append(data.iloc[i:i + window])
        y.append(data.iloc[i + window, :])
    return np.array(X), np.array(y)
# Constructing labels for the unlabeled time series data.
X, y = create_sequences(df, 50) # play with window
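# Quick shape check (my addition): with a window of 50, X should be
# (samples, 50, n_equities) and y should be (samples, n_equities).
print(X.shape, y.shape)
assert X.shape[0] == y.shape[0] == len(df) - 50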
# Splitting the data before preprocessing to prevent data leakage.
split = int(0.85 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
# Preprocessing here is just scaling, as there is no missing data at all.
# Scaling needs 2D data, not 3D. After scaling, reshape back to 3D, as training needs 3D data.
X_train_2D = X_train.reshape(-1, X_train.shape[2])
X_test_2D = X_test.reshape(-1, X_test.shape[2])
scaler = preprocessing.StandardScaler()
# Fit the scaler only on the training data.
X_train_scaled = scaler.fit_transform(X_train_2D).reshape(X_train.shape)
y_train_scaled = scaler.transform(y_train)
X_test_scaled = scaler.transform(X_test_2D).reshape(X_test.shape)
y_test_scaled = scaler.transform(y_test)
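# Leakage sanity check (my addition): the scaler's statistics come only from the
# training windows, and inverse_transform must round-trip the labels exactly.
assert np.allclose(scaler.inverse_transform(y_train_scaled), y_train)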
# Training the model.
callback = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    min_delta=0,
    patience=10,
    verbose=0,
    mode="min",
    baseline=None,
    restore_best_weights=True,
    start_from_epoch=0,
)
model = keras.Sequential()
model.add(keras.layers.LSTM(64, input_shape=(X_train_scaled.shape[1], X_train_scaled.shape[2])))
model.add(keras.layers.Dropout(0.3))
model.add(keras.layers.Dense(len(df.columns)))
model.compile(loss="mse", optimizer="adam", metrics=["mae"])
model.summary()
# EarlyStopping monitors "val_loss", so a validation split is required; without it the callback never fires.
model.fit(X_train_scaled, y_train_scaled, validation_split=0.1, epochs=150, callbacks=[callback])
# Predicting; predictions must be unscaled back to return space.
pred_scaled = model.predict(X_test_scaled)
pred = scaler.inverse_transform(pred_scaled)
y_test = scaler.inverse_transform(y_test_scaled)
# Performance metrics
mse = mean_squared_error(y_test, pred)
mae = mean_absolute_error(y_test, pred)
r2 = r2_score(y_test, pred)
print(f"mse is :{mse}\n")
print(f"mae is {mae}\n")
print(f"r2 is {r2}")
The overall results across all equities are quite bad:
mse is :0.0005406439183876653
mae is 0.01578705713631846
r2 is -0.20972200159793528
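For context, R² = 0 corresponds to predicting each asset's mean test return, and negative R² means doing worse than that. A quick comparison against a trivial zero-return baseline (my addition, reusing `y_test` from above) makes the gap concrete:

baseline = np.zeros_like(y_test)  # predict zero return for every asset and day
print("baseline mse:", mean_squared_error(y_test, baseline))
print("baseline r2:", r2_score(y_test, baseline))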
I uploaded a visualization of the prediction results for the first feature.
Questions:

1. Why is R² negative?
2. Is my scaling logically correct?
3. Is my LSTM setup valid for multivariate returns?