
All Questions

2 votes
3 answers
126 views

How does Hydra `_partial_` interact with seeding

In the configuration management library Hydra, it is possible to only partially instantiate classes defined in configuration using the _partial_ keyword. The library explains that this results in a ...
Felix Benning
-2 votes
1 answer
238 views

ML Model training - System Memory usage increasing over epoch

I am implementing a simple MLP on SageMaker, specifically on an AWS ml.g4.xlarge machine, and I noticed that RAM usage keeps increasing over epochs. I am using pytorch.lightning for ML model ...
Manuel Ravasqueira
0 votes
0 answers
54 views

Pruning error cannot assign 'torch.cuda.FloatTensor' to parameter

I am struggling to work out why I'm getting this error when trying to apply the lottery ticket hypothesis to my model. Clearly this is happening during the pruning callback, and it seems like it's ...
richbai90
  • 5,204
0 votes
1 answer
37 views

Difference between accuracy during training and accuracy during testing

In the model below the accuracy reported at the end of the final validation stage is 0.46, but when reported during the manual testing the value is 0.53. What can account for this discrepancy? import ...
richbai90
  • 5,204
0 votes
0 answers
24 views

Self-supervised model converging to a constant

I was trying to train a Barlow twins model for image classification. Nonetheless, I encountered a problem after finishing my model training. It seems that the model has become a constant always ...
youssef hafidi
1 vote
0 answers
233 views

LSTM : predict_step in PyTorch Lightning

I've developed code for an LSTM model, but I'm uncertain about how to utilize it for predictions in a production environment. Could you please assist? In the provided predict.py script, I aim to ...
Ankush
  • 25
0 votes
0 answers
446 views

UNet pytorch implementation is too slow

I have a simple TensorFlow-based UNet model used for an optimization program. The program is something like an “untrained” neural net in the sense that once I load input data (experimental low res images)...
Ankur Singh
0 votes
1 answer
1k views

PyTorch Dataloader - list indices must be integers or slices, not list

I have implemented a COCO dataset as follows: from torch.utils.data import Dataset from detr.datasets.coco import CocoDetection class MyCoco(CocoDetection): def __init__(self, ...
Dr. Prof. Patrick
0 votes
1 answer
88 views

ValueError: Expected input batch_size (31) to match target batch_size (128)

The error occurred in the code for a Spiking Neural Network. I am using the neuromorphic-MNIST dataset, which is a spiking version of the original MNIST dataset. As libraries I am using tonic (for the ...
mrm.grwl
2 votes
0 answers
691 views

Is MLflow PyTorch Lightning autologging supposed to automatically log metrics?

I am attempting to use MLflow to log a PyTorch Lightning model to experiments in Databricks. According to the documentation, it seems like metrics like training and validation loss are supposed to be ...
Mikel
  • 21
2 votes
0 answers
268 views

Pytorch Lightning - Display per class metrics (precision, recall, f1) in Train.Test(model, datamodule)

How would I go about adding per-class metrics inside the test_step method of a PyTorch Lightning module? I used self.f1_each = F1Score(task="multiclass", num_classes=self.num_classes,...
Eduard6421
2 votes
1 answer
761 views

Unable to import the TextDataset class from the aitextgen library

I am trying to build a machine learning model that generates job descriptions by fine-tuning the gpt-neo model with my dataset of job descriptions. I tried importing the TextDataset class from aitextgen and ...
nkdtech
  • 63
3 votes
0 answers
866 views

Pytorch Lightning manual GPU control

I have a training pipeline with two very big models, such that a single GPU can only fit one of them. I have access to two GPUs, and I decided to put each of the models on its own device. I have ...
Theo Lamort
1 vote
2 answers
5k views

Why doesn't PyTorch Lightning module save logged val loss? ModelCheckpoint error

I'm running an LSTM-based model training on Kaggle. I use Pytorch Lightning and wandb logger for that. That's my model's class: class Model(pl.LightningModule): def __init__( self, ...
Karol
  • 639
0 votes
1 answer
193 views

How to solve my problem of max_step parameter in pytorch?

I'm trying to train source code. class mymodel(pl.LightningModule): def __init__(self, config , learning_rate = 1e-4, max_steps = 100000//2): super(mymodel, self).__init__() self.config ...
diamond
  • 11
