Questions tagged [deep-learning]

An area of machine learning research concerned with learning hierarchical representations of data, mainly with deep neural networks (i.e. networks with two or more hidden layers), but also with some kinds of probabilistic graphical models.

4,782 questions

3 votes

0 answers

30 views

Model's forecasts are not anchored correctly to the history

I'm implementing this paper and trying to train it on generated data, returning the full ground truths and a single forecast, but the forecasts my model produces are not anchored to the past series ...

cdt123

asked Feb 24 at 23:08

2 votes

1 answer

32 views

Why did my Mamba model fail on IEEG time series data?

I tried to implement Mamba on IEEG time series data where brain waves at different regions are identified with 800 data points for 4 seconds. I have 102 similar samples of a person with data from 5 ...

Sriram M

asked Feb 24 at 11:12

5 votes

0 answers

28 views

What is the difference between LLMs using a 'think mode' and using a 'non-think mode' for short text generation or short text classification tasks?

Is there a significant performance difference between LLMs using a 'think mode' (like Chain-of-Thought) and those using a 'non-think mode' (direct prompting) for short text generation or short text ...

Coder

1,149

asked Feb 9 at 7:53

4 votes

0 answers

26 views

Unable to predict values for test data

I have built and trained an NMT model using an RNN in Google Colab, and now when I try to predict on my test data my Google Colab session keeps crashing. The shape of my test data is 47838×55 ...

swar_codes

asked Jan 31 at 12:20

-1 votes

0 answers

9 views

Google Colab session

I have built and trained an NMT model using an RNN in Google Colab and downloaded the file in .keras format. Now when I try to load the model to test my data, my Google Colab session keeps on ...

swar_codes

asked Jan 30 at 13:36

4 votes

1 answer

48 views

Is the fundamental difference between SFT and RL for training LLMs that RL utilizes negative data, whereas SFT only uses positive data?

Do generative tasks really need negative samples? Is removing low-quality SFT data equivalent to adding RL negative samples?

Coder

1,149

asked Jan 21 at 2:35

0 votes

0 answers

35 views

Guide me with my major project titled Satellite-Based Agricultural Vulnerability Monitoring

I am working on a major project titled Utilizing Satellite Data and Deep Learning to Monitor Agricultural Vulnerabilities to Climate Change. My goal is to develop a system to monitor agricultural ...

Shivani Toorpu

asked Dec 13, 2025 at 15:02

2 votes

0 answers

46 views

Feature selection for unsupervised learning with a One-Class SVM

I am trying to build a solution to detect a particular sound against all possible other sounds occurring in nature. My approach is to train a One-Class SVM only on my class of interest, hoping it will ...

Antoine101

asked Dec 12, 2025 at 15:34

0 votes

0 answers

25 views

UNETR paper — Does brain tumor segmentation use 5-fold CV or 80:15:5 split?

I'm reading the UNETR paper ("UNETR: Transformers for 3D Medical Image Segmentation") and I'm confused about the training and evaluation methodology for the brain tumor segmentation task on ...

AAA_11

asked Dec 7, 2025 at 2:44

0 votes

0 answers

25 views

Document Classification Task for Review Paper References

I'm doing research for the first time on how well LLMs and DL models can structure scattered data through NER and RE. We are using a review paper on a domain that has no ontologies or ...

Daniel Farinha Ribeiro

asked Nov 27, 2025 at 23:27

2 votes

0 answers

108 views

Kaggle TPU v5e-8 is 7 times slower than v3-8

I am trying to learn Kaggle TPU and I am migrating a Flower Classification notebook from an older TPU v3-8 environment to the new TPU v5e-8 (TPU VM) environment on Kaggle. I was trying to migrate this ...

Player Mathinson

asked Nov 20, 2025 at 21:06

5 votes

1 answer

96 views

Possible to Improve Reconstruction Quality and Accuracy with VAE?

I am training a VAE architecture on microscopy images. Dataset of 1000 training images, 253 testing images. Images are resized to 128x128 input or 256x256 input from original resolution which is ...

MT0820

asked Nov 13, 2025 at 18:49

0 votes

0 answers

20 views

Sequence generation model produces incorrect, but coherent outputs

My model takes in an image of a handwritten equation and converts it into its LaTeX representation. In order to do this, it uses a ResNet50 pre-trained model for feature extraction and a transformer ...

alt_zancudo

asked Nov 12, 2025 at 17:27

0 votes

0 answers

13 views

How to calculate weights for two parallel transformer outputs

I have a model where I incorporate additional input alongside language sequence data. I feed these two inputs into two different transformers, then combine their outputs using addition. Simply, I combine them with a ...

cuneyttyler

asked Oct 25, 2025 at 12:23

0 votes

0 answers

65 views

Why is the bias value critical to successful learning?

Given the basic elements of a neuron (as below) with a bias value: I learnt that a bias value allows you to shift the activation function (say, the sigmoid function) to the left or right, which may be critical ...

overexchange

asked Oct 19, 2025 at 20:32
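
As a rough illustration of the shifting effect described in the excerpt above, here is a minimal sketch for a single sigmoid neuron computing `sigmoid(w * x + b)`. The helper names `neuron` and `midpoint` are made up for this example and do not come from the question. The point it shows is that the output crosses 0.5 where `w * x + b = 0`, i.e. at `x = -b / w`, so changing `b` slides the curve along the x-axis without changing its steepness.

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x: float, w: float, b: float) -> float:
    """Output of a single neuron with weight w and bias b (hypothetical helper)."""
    return sigmoid(w * x + b)


def midpoint(w: float, b: float) -> float:
    """Input at which the neuron outputs 0.5, i.e. where w * x + b = 0."""
    return -b / w


if __name__ == "__main__":
    w = 1.0
    for b in (-2.0, 0.0, 2.0):
        # A positive bias moves the midpoint to the left, a negative one to the right.
        print(f"b={b:+.1f}  midpoint x={midpoint(w, b):+.1f}  output at x=0: {neuron(0.0, w, b):.3f}")
```

Without a bias (`b = 0`) the midpoint is pinned at `x = 0` whatever the weight is, which is one way to see why a network with no bias terms may be unable to fit data whose decision threshold lies elsewhere.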
