Questions tagged [deep-learning]
A new area of machine learning research concerned with techniques for learning hierarchical representations of data, mainly using deep neural networks (i.e. networks with two or more hidden layers), but also some kinds of probabilistic graphical models.
4,782 questions
3
votes
0
answers
30
views
Model's forecasts are not anchored correctly to the history
I'm implementing this paper
and trying to train it on generated data, returning the full ground truths and a single forecast, but the forecasts my model produces are not anchored to the past series ...
2
votes
1
answer
32
views
Why did my Mamba model fail on IEEG time series data?
I tried to apply Mamba to IEEG time series data, in which brain waves at different regions are captured as 800 data points over 4 seconds.
I have 102 similar samples of a person with data from 5 ...
5
votes
0
answers
28
views
What is the difference between LLMs using a 'think mode' and a 'non-think mode' for short text generation or short text classification tasks?
Is there a significant performance difference between LLMs using a 'think mode' (like Chain-of-Thought) and those using a 'non-think mode' (direct prompting) for short text generation or short text ...
4
votes
0
answers
26
views
Unable to predict values for test data
I have built and trained an NMT model using an RNN in Google Colab. Now, when I try to predict on my test data, my Google Colab session keeps crashing. The shape of my test data is 47838×55
...
-1
votes
0
answers
9
views
Google Colab session
I have built and trained an NMT model using an RNN in Google Colab and downloaded the file in .keras format. Now, when I try to load the model to test my data, my Google Colab session keeps on ...
4
votes
1
answer
48
views
Is the fundamental difference between SFT and RL for training LLMs that RL utilizes negative data, whereas SFT only uses positive data?
Do generative tasks really need negative samples?
Is removing low-quality SFT data equivalent to adding RL negative samples?
0
votes
0
answers
35
views
Guide me with my major project titled Satellite-Based Agricultural Vulnerability Monitoring
I am working on a major project titled Utilizing Satellite Data and Deep Learning to Monitor Agricultural Vulnerabilities to Climate Change. My goal is to develop a system to monitor agricultural ...
2
votes
0
answers
46
views
Feature selection for unsupervised learning with a One-Class SVM
I am trying to build a solution to detect a particular sound against all possible other sounds occurring in nature.
My approach is to train a One-Class SVM only on my class of interest, hoping it will ...
0
votes
0
answers
25
views
UNETR paper — Does brain tumor segmentation use 5-fold CV or 80:15:5 split?
I'm reading the UNETR paper ("UNETR: Transformers for 3D Medical Image Segmentation") and I'm confused about the training and evaluation methodology for the brain tumor segmentation task on ...
0
votes
0
answers
25
views
Document Classification Task for Review Paper References
everyone!
I'm doing research for the first time on how well LLMs and DL models can structure scattered data through NER and RE. We are using a review paper on a domain that has no ontologies or ...
2
votes
0
answers
108
views
Kaggle TPU v5e-8 7 times slower than v3-8
I am trying to learn Kaggle TPU and I am migrating a Flower Classification notebook from an older TPU v3-8 environment to the new TPU v5e-8 (TPU VM) environment on Kaggle. I was trying to migrate this ...
5
votes
1
answer
96
views
Possible to Improve Reconstruction Quality and Accuracy with VAE?
I am training a VAE architecture on microscopy images. Dataset of 1000 training images, 253 testing images. Images are resized to 128x128 input or 256x256 input from original resolution which is ...
0
votes
0
answers
20
views
Sequence generation model produces incorrect, but coherent outputs
My model takes in an image of a handwritten equation and converts it into its LaTeX representation. In order to do this, it uses a ResNet50 pre-trained model for feature extraction and a transformer ...
0
votes
0
answers
13
views
How to calculate weights for two parallel transformer outputs
I have a model where I incorporate additional input alongside language sequence data. I feed these two inputs into two different transformers, then combine their outputs by addition. Simply, I combine them with a ...
0
votes
0
answers
65
views
Why is the bias value critical to successful learning?
Given the basic elements of a neuron (as below) with a bias value:
I learnt that a bias value allows you to shift the activation function (say, the sigmoid function) to the left or right, which may be critical ...
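The shifting effect mentioned in this excerpt can be illustrated with a minimal sketch. It assumes a single-input neuron computing sigmoid(w * x + b); the function names `neuron` and `midpoint` are hypothetical and not taken from the question. Under that assumption, the output crosses 0.5 at x = -b / w, so changing b moves the curve along the x-axis, while with b = 0 the crossing stays at x = 0 whatever the weight.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b):
    """Single-input neuron (illustrative assumption): sigmoid(w * x + b)."""
    return sigmoid(w * x + b)


def midpoint(w, b):
    """Input x at which the neuron outputs exactly 0.5, i.e. w * x + b = 0."""
    return -b / w


# Without a bias, the curve is pinned at x = 0 regardless of the weight.
no_bias_outputs = [neuron(0.0, w, 0.0) for w in (0.5, 1.0, 5.0)]  # all 0.5

# With w = 2 and b = -4, the 0.5 crossing moves right to x = 2.
shifted_midpoint = midpoint(2.0, -4.0)  # 2.0
```

Changing w alone only makes the curve steeper or flatter around x = 0; the bias is the parameter that relocates the threshold in this sketch.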