All Questions
118 questions
0
votes
1
answer
59
views
Getting CUDA out of memory when importing microsoft/Orca-2-13b from Hugging Face
I am using Ubuntu 24.04.1 on an AWS EC2 g5.8xlarge instance.
I am receiving the following error message:
OutOfMemoryError: Allocation on device
Code:
import os
os.environ["...
-1
votes
0
answers
56
views
Getting the text and tokens using LayoutLMv3
I trained a LayoutLMv3 model, creating a labeled dataset with Label Studio. I was able to test the output of the model using the following code:
encoding = processor(image, words, boxes=boxes, ...
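If the goal is to map predictions back to text, the encoding itself already carries the token-to-word alignment. A minimal sketch, assuming a fast LayoutLMv3 tokenizer and that words is the word list passed to the processor:

# tokens for each input id, and the word each token belongs to
tokens = processor.tokenizer.convert_ids_to_tokens(encoding["input_ids"][0])
word_ids = encoding.word_ids(batch_index=0)  # None for special tokens

for token, word_id in zip(tokens, word_ids):
    if word_id is not None:
        print(token, "->", words[word_id])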
1
vote
0
answers
67
views
Runtime error while trying to train RTDetrV2 with Transformers
I am trying to train RTDetrV2 for detection of water meter digits. I use an ipynb file from here: https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/train-rt-detr-on-...
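Without the traceback the exact cause is a guess, but label shape or dtype mismatches are the usual source of runtime errors in these detection notebooks. A minimal sketch of a single forward pass with DETR-style labels (the checkpoint id is an assumption):

import torch
from transformers import AutoImageProcessor, AutoModelForObjectDetection

checkpoint = "PekingU/rtdetr_v2_r50vd"  # assumed checkpoint id
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = AutoModelForObjectDetection.from_pretrained(checkpoint)

pixel_values = torch.randn(1, 3, 640, 640)
labels = [{
    "class_labels": torch.tensor([0]),              # one box of class 0
    "boxes": torch.tensor([[0.5, 0.5, 0.2, 0.2]]),  # normalized cx, cy, w, h
}]
outputs = model(pixel_values=pixel_values, labels=labels)
print(outputs.loss)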
2
votes
1
answer
93
views
Error in getting Captum text explanations for text classification
I have the following code, which I am using to identify the most influential words for correctly predicting the text in the test dataset:
import pandas as pd
import torch
from torch.utils.data import ...
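One approach that fits this setup is LayerIntegratedGradients over the embedding layer. A minimal sketch, assuming a BERT-style sequence classifier already loaded as model with its tokenizer:

import torch
from captum.attr import LayerIntegratedGradients

def forward_func(input_ids, attention_mask):
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

lig = LayerIntegratedGradients(forward_func, model.get_input_embeddings())

enc = tokenizer("an example sentence", return_tensors="pt")
baseline = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)

attributions = lig.attribute(
    inputs=enc["input_ids"],
    baselines=baseline,
    additional_forward_args=(enc["attention_mask"],),
    target=1,  # class index to explain
)
token_scores = attributions.sum(dim=-1).squeeze(0)  # one score per token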
0
votes
0
answers
64
views
Memory increasing after Hugging Face generate method
I wanted to run inference with the CodeGemma model from Hugging Face, but when I use the model.generate(**inputs) method, GPU memory cost increases from 39 GB to 49 GB in peak usage; with torch profiler no ...
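The peak during generate is usually the KV cache plus the logits rather than a leak, so the memory should be reclaimable once the outputs are released. A minimal sketch for checking that, assuming model and inputs are already on the GPU:

import torch

torch.cuda.reset_peak_memory_stats()
with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=64)
print(f"peak: {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")

del out
torch.cuda.empty_cache()
print(f"after free: {torch.cuda.memory_allocated() / 1e9:.1f} GB")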
1
vote
1
answer
159
views
How to Compute Teacher-Forced Accuracy (TFA) for Hugging Face Models While Handling EOS Tokens?
I am trying to compute Teacher-Forced Accuracy (TFA) for Hugging Face models, ensuring the following:
EOS Token Handling: The model should be rewarded for predicting the first EOS token.
Ignoring ...
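A minimal sketch of one way to compute TFA under these rules, assuming a causal LM whose logits at position t predict token t+1: the first EOS counts as a scored position and everything after it is masked out.

import torch

def teacher_forced_accuracy(logits, input_ids, eos_token_id):
    preds = logits[:, :-1, :].argmax(dim=-1)   # prediction for position t+1
    targets = input_ids[:, 1:]

    is_eos = (targets == eos_token_id).long()
    seen = is_eos.cumsum(dim=1)
    # score positions before the first EOS, plus the first EOS itself
    mask = (seen == 0) | ((seen == 1) & (is_eos == 1))

    correct = (preds == targets) & mask
    return correct.sum().float() / mask.sum().float()

# usage sketch: acc = teacher_forced_accuracy(model(input_ids).logits, input_ids, tokenizer.eos_token_id)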
0
votes
0
answers
628
views
PyTorch model running on CPU despite MPS (Apple Silicon) being available and detected
I'm trying to run a HuggingFace Transformers model on my Apple Silicon Mac using MPS (Metal Performance Shaders), but despite MPS being available and detected, the model keeps running on CPU, causing ...
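MPS availability alone moves nothing; both the model and every input tensor have to be sent to the device explicitly. A minimal sketch, assuming model and a dict of inputs are already built:

import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = model.to(device)
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)
print(next(model.parameters()).device)  # should print mps:0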
0
votes
0
answers
70
views
ValueError: If no `decoder_input_ids` or `decoder_inputs_embeds` are passed, `input_ids` cannot be `None`
I am trying to get the decoder hidden state of the Florence-2 model. I was following this https://huggingface.co/microsoft/Florence-2-large/blob/main/modeling_florence2.py to understand the parameters ...
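The error means the decoder received no starting sequence. One way around it that also exposes the decoder states is to let generate build them; a sketch assuming the model and processor are loaded with trust_remote_code, and that Florence-2's remote code follows the standard encoder-decoder generate output (an assumption):

outputs = model.generate(
    input_ids=input_ids,
    pixel_values=pixel_values,
    max_new_tokens=64,
    return_dict_in_generate=True,
    output_hidden_states=True,
)
# decoder_hidden_states: one tuple per generated step, one tensor per decoder layer
step0_last_layer = outputs.decoder_hidden_states[0][-1]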
1
vote
0
answers
507
views
How to run Qwen2-VL models on multiple GPUs?
I have 4 GPUs on which I want to run Qwen2-VL models, but I get a "device-side assert triggered. Compile with TORCH_USE_CUDA_DSA to enable device-side assertions" error.
model_name="Qwen/Qwen2-...
0
votes
0
answers
278
views
What does the "AttributeError: 'NoneType' object has no attribute 'cget_managed_ptr'" mean?
I'm trying to train a model with very standard HF code I've used before:
import os
from transformers import Trainer, TrainingArguments, AutoModelForCausalLM, AutoTokenizer
from datasets import ...
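cget_managed_ptr is a symbol from bitsandbytes' paged-optimizer path, and the NoneType suggests its native library failed to load; that is an inference, since the traceback is truncated. A quick test is to move the Trainer off the paged optimizer:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="adamw_torch",  # instead of a bitsandbytes optimizer such as "paged_adamw_32bit"
)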
1
vote
1
answer
2k
views
Can't suppress warning from transformers/src/transformers/modeling_utils.py
My use of the AutoModel and AutoTokenizer classes is fairly simple:
from transformers import AutoModel, AutoTokenizer
import numpy as np
from rank_bm25 import BM25Okapi
from sklearn.neighbors ...
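transformers routes these messages through its own logger, so the stdlib warnings filter never sees them. A minimal sketch using the library's verbosity helper:

from transformers.utils import logging

logging.set_verbosity_error()  # keep errors, silence warnings/info from transformers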
1
vote
0
answers
289
views
CUDA Out of Memory Error Despite Having Multiple GPUs
I'm encountering a CUDA out-of-memory error while trying to run a PyTorch model, even though my system has multiple NVIDIA GPUs.
# Load the tokenizer and model
tokenizer = AutoTokenizer....
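A plain .to("cuda") places the whole model on GPU 0 regardless of how many GPUs exist; using their combined memory requires sharding the checkpoint with an explicit device map. A minimal sketch, with a placeholder model id since the original is truncated:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-hf"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)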
0
votes
0
answers
101
views
Timeseries Transformer for Custom Dataset
I am trying something with a transformer from Hugging Face, specifically the Time Series Transformer. I can't seem to figure out how to initialize it and run a single forward pass. My ...
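A minimal sketch of initializing the model from a config and running one forward pass on random data; the key detail is that the past window must be context_length + max(config.lags_sequence) steps long (the lags default to 1..7):

import torch
from transformers import TimeSeriesTransformerConfig, TimeSeriesTransformerForPrediction

config = TimeSeriesTransformerConfig(
    prediction_length=24,
    context_length=48,
    num_time_features=1,
)
model = TimeSeriesTransformerForPrediction(config)

past_len = config.context_length + max(config.lags_sequence)  # 48 + 7
batch = {
    "past_values": torch.randn(2, past_len),
    "past_time_features": torch.randn(2, past_len, config.num_time_features),
    "past_observed_mask": torch.ones(2, past_len),
    "future_values": torch.randn(2, config.prediction_length),
    "future_time_features": torch.randn(2, config.prediction_length, config.num_time_features),
}
outputs = model(**batch)
print(outputs.loss)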
0
votes
0
answers
86
views
Hugging Face autograd
I am trying to fine-tune (LoRA fine-tune) a pretrained language model. I've found that the gradients are not being backpropagated. At first, I thought it was because I was using Hugging Face's ...
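Two usual suspects, sketched below under the assumption that PEFT is in play: the LoRA adapters were never marked trainable, or gradient checkpointing detached the graph at the frozen input embeddings. base_model is a placeholder for the loaded model:

from peft import LoraConfig, get_peft_model

peft_config = LoraConfig(task_type="CAUSAL_LM", r=8, lora_alpha=16)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()  # should report a small nonzero trainable %

# With gradient checkpointing, frozen input embeddings otherwise break the graph:
model.enable_input_require_grads()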
0
votes
1
answer
348
views
How can I avoid unbalanced memory usage when performing multi-gpu training using Huggingface Trainer?
I am attempting to fine-tune Google's flan-t5-large model (only 783M parameters, so it can easily fit on a much smaller single GPU than any of the ones I'm using) on multiple GPUs using the ...
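With several visible GPUs and a plain python train.py launch, Trainer falls back to DataParallel, which gathers activations on GPU 0 and unbalances memory. Launching the same script under DDP gives each GPU its own replica; a minimal sketch, with model and train_dataset assumed defined:

# launch with: torchrun --nproc_per_node=4 train.py
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,  # per GPU under DDP
)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()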