
All Questions

0 votes · 0 answers · 27 views

TrainingArguments: Do "packing" and "group_by_length" counteract each other?

In Hugging Face's TrainingArguments and SFTConfig (which inherits from TrainingArguments), there are two arguments for initializing SFTConfig(): group_by_length: Whether or not to group together ...
JoyfulPanda · 1,057
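A minimal sketch of how the two options meet in trl's SFTConfig (the exact flags are version-dependent, so treat the snippet as an assumption about a recent trl release):

```python
from trl import SFTConfig

# packing=True concatenates training examples into fixed-length blocks,
# so every block already has identical length; group_by_length=True sorts
# variable-length examples into similar-length batches to reduce padding.
# With packing enabled there is no padding left to save, so enabling both
# is at best redundant.
config = SFTConfig(
    output_dir="out",
    packing=True,           # defined on SFTConfig
    group_by_length=False,  # inherited from TrainingArguments
)
```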
0 votes · 1 answer · 59 views

Getting CUDA out of memory when importing microsoft/Orca-2-13b from Hugging Face

I am using Ubuntu 24.04.1 on an AWS EC2 instance g5.8xlarge. I am receiving the following error message: OutOfMemoryError: Allocation on device Code: import os os.environ["...
Wolfy · 470
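A g5.8xlarge has a single 24 GB A10G GPU, while a 13B model in fp16 needs roughly 26 GB for the weights alone, so an unquantized load cannot fit. A minimal sketch of one common workaround, 4-bit loading via bitsandbytes (assumes the bitsandbytes package is installed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Orca-2-13b"
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
# 4-bit weights cut the footprint to roughly 7-8 GB, which fits on an A10G
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb,
    device_map="auto",
)
```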
-1 votes · 1 answer · 52 views

Deconstructing the Stable Diffusion 3.5 pipeline

I am trying to deconstruct the SD3.5 (specifically 3.5 medium) pipeline in order to have a controlled process over the denoising steps. I can't do callbacks because I need to modify the latent ...
Curious Scientist
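A rough sketch of a manual denoising loop for SD3.5, which exposes the latents between steps (argument names mirror the diffusers source for StableDiffusion3Pipeline at time of writing; treat them as assumptions, not a verified implementation):

```python
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium", torch_dtype=torch.bfloat16
).to("cuda")

# encode the prompt once, keeping the CFG (unconditional) embeddings
prompt_embeds, neg_embeds, pooled, neg_pooled = pipe.encode_prompt(
    prompt="a photo of a cat", prompt_2=None, prompt_3=None,
    do_classifier_free_guidance=True,
)
prompt_embeds = torch.cat([neg_embeds, prompt_embeds])
pooled = torch.cat([neg_pooled, pooled])

num_steps, guidance_scale = 28, 7.0
pipe.scheduler.set_timesteps(num_steps, device="cuda")
# 16 latent channels, 128x128 latents for a 1024x1024 image
latents = torch.randn(1, 16, 128, 128, device="cuda", dtype=torch.bfloat16)

for t in pipe.scheduler.timesteps:
    latent_in = torch.cat([latents] * 2)  # CFG: uncond + cond
    noise_pred = pipe.transformer(
        hidden_states=latent_in,
        timestep=t.expand(latent_in.shape[0]),
        encoder_hidden_states=prompt_embeds,
        pooled_projections=pooled,
        return_dict=False,
    )[0]
    uncond, cond = noise_pred.chunk(2)
    noise_pred = uncond + guidance_scale * (cond - uncond)
    # <- the latents can be modified here, which callbacks don't allow
    latents = pipe.scheduler.step(noise_pred, t, latents, return_dict=False)[0]

latents = latents / pipe.vae.config.scaling_factor + pipe.vae.config.shift_factor
image = pipe.vae.decode(latents, return_dict=False)[0]
```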
1 vote · 1 answer · 96 views

Why does my Llama 3.1 model act differently between AutoModelForCausalLM and LlamaForCausalLM?

I have one set of weights, one tokenizer, the same prompt, and identical generation parameters. Yet somehow, when I load the model using AutoModelForCausalLM, I get one output, and when I construct it ...
han mo · 23
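For a Llama checkpoint, AutoModelForCausalLM should dispatch to LlamaForCausalLM, so a first diagnostic is to confirm the two loaders really produce the same class and generation config; the checkpoint path below is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, LlamaForCausalLM

path = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder checkpoint
m_auto = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16)
m_llama = LlamaForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16)

# same class means same forward pass; differences then come from config
print(type(m_auto), type(m_llama))
print(m_auto.generation_config)
print(m_llama.generation_config)
# also compare with do_sample=False: sampling makes outputs differ even
# when the models are bit-identical.
```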
6 votes · 2 answers · 2k views

Why does Hugging Face-provided DeepSeek code result in an 'Unknown quantization type' error?

I am using code pasted directly from the Hugging Face website's page on DeepSeek, which is supposed to be plug-and-play: from transformers import pipeline ...
Akshit Gulyan
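'Unknown quantization type' is raised when the quant_method declared in the checkpoint's config.json (fp8, for the recent DeepSeek releases) is newer than what the installed transformers recognizes; the usual fix is an upgrade. A sketch, with the model id as a placeholder since the excerpt truncates it:

```python
import transformers
print(transformers.__version__)  # fp8 checkpoints need a recent release

# after: pip install -U transformers accelerate
from transformers import pipeline
pipe = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1",  # placeholder model id
    trust_remote_code=True,
)
```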
0 votes · 1 answer · 74 views

Image segmentation ONNX model from Hugging Face produces very different results when used in ML.NET

I have been trying to get an image segmentation model from Hugging Face (RMBG-2.0) to work for inference using ML.NET. After a lot of trial and error, I finally got the code to compile and produce an ...
alepee · 1
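A useful isolation step is to run the exported ONNX file in Python's onnxruntime first: if that matches the Hugging Face output, the ML.NET discrepancy is almost certainly preprocessing (resize, channel order, normalization) rather than the model. A sketch; the file path and normalization constants are assumptions to check against the model card:

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

sess = ort.InferenceSession("rmbg-2.0.onnx")   # path assumed
inp = sess.get_inputs()[0]
print(inp.name, inp.shape)                     # confirm layout (e.g. NCHW)

img = Image.open("test.jpg").convert("RGB").resize((1024, 1024))
x = np.asarray(img, dtype=np.float32) / 255.0
x = (x - 0.5) / 0.5                            # normalization assumed
x = x.transpose(2, 0, 1)[None]                 # HWC -> NCHW
mask = sess.run(None, {inp.name: x})[0]
print(mask.shape, mask.min(), mask.max())      # reference to compare ML.NET against
```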
0 votes · 1 answer · 393 views

How to Log Training Loss at Step Zero in Hugging Face Trainer or SFT Trainer?

I'm using the Hugging Face Trainer (or SFTTrainer) for fine-tuning, and I want to log the training loss at step 0 (before any training steps are executed). I know there's an eval_on_start option for ...
Charlie Parker
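One way to get a "step 0" number without touching Trainer internals is to evaluate a single training batch manually before calling trainer.train(); a sketch assuming a trainer and model are already built, and that the batch contains only tensors:

```python
import torch

batch = next(iter(trainer.get_train_dataloader()))   # public Trainer helper
batch = {k: v.to(model.device) for k, v in batch.items()}

model.eval()
with torch.no_grad():
    loss0 = model(**batch).loss   # requires "labels" in the batch
print(f"training loss at step 0: {loss0.item():.4f}")

model.train()
trainer.train()
```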
1 vote · 1 answer · 159 views

How to Compute Teacher-Forced Accuracy (TFA) for Hugging Face Models While Handling EOS Tokens?

I am trying to compute Teacher-Forced Accuracy (TFA) for Hugging Face models, ensuring the following: EOS Token Handling: The model should be rewarded for predicting the first EOS token. Ignoring ...
Charlie Parker
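A sketch of one possible TFA definition matching the stated constraints: positions strictly after the first EOS are ignored, while the first EOS itself is scored. This is one reasonable reading, not necessarily the asker's exact metric:

```python
import torch

def teacher_forced_accuracy(model, input_ids, attention_mask, eos_id):
    with torch.no_grad():
        logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
    preds = logits[:, :-1].argmax(dim=-1)    # prediction for token t+1
    targets = input_ids[:, 1:]
    valid = attention_mask[:, 1:].bool()

    # keep positions up to and including the first EOS; cumsum - e is 0
    # before the first EOS and at the first EOS itself, positive after
    e = (targets == eos_id).long()
    keep = (e.cumsum(dim=1) - e) == 0
    mask = valid & keep

    correct = (preds == targets) & mask
    return correct.sum().float() / mask.sum().float()
```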
0 votes · 0 answers · 173 views

Jupyter Notebook crashes when I run Hugging Face models

I am using Jupyter Notebook to run some ML models from Hugging Face, on a Mac (M2 chip, 32 GB memory). This is my code: import torch from transformers import AutoTokenizer, ...
taga · 3,913
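A silently dying kernel on a Mac is usually the process being killed for exhausting memory; loading in half precision (and onto the MPS backend where available) roughly halves the footprint. A sketch with a placeholder model id, since the excerpt truncates the actual one:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "mps" if torch.backends.mps.is_available() else "cpu"
name = "some/model"  # placeholder

tok = AutoTokenizer.from_pretrained(name)
# fp16 halves the default fp32 memory footprint of the weights
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16
).to(device)
```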
1 vote · 1 answer · 235 views

How to add EOS when training T5?

I'm a little puzzled about where (and whether) EOS tokens are added when using Hugging Face's trainer classes to train a T5 (LongT5, actually) model. The data set contains pairs of text like this: from to ...
gphilip · 706
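This is easy to verify directly: T5-family tokenizers append the </s> EOS token themselves when add_special_tokens is left at its default, so no manual insertion is needed before training. A quick check (checkpoint name assumed for a LongT5 setup):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
ids = tok("translate this").input_ids
print(tok.convert_ids_to_tokens(ids))   # last token should be '</s>'
print(ids[-1] == tok.eos_token_id)      # True
```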
0 votes · 1 answer · 77 views

An error occurs during the execution of UNet when the batch size is not equal to 1

I'm trying to run a Stable Diffusion model using the code provided in the DDIM Inversion tutorial. However, when the input's batch size is set to a value greater than 1 (e.g., 32), I encounter the ...
young · 11
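A common cause in DDIM-inversion code is text conditioning that was prepared for batch size 1 and never expanded; a hypothetical fix, with variable names assumed from the tutorial's loop:

```python
# make the text conditioning match the latent batch before the UNet call
bsz = latents.shape[0]
if prompt_embeds.shape[0] != bsz:
    prompt_embeds = prompt_embeds.repeat(bsz, 1, 1)   # (1, L, D) -> (bsz, L, D)

noise_pred = unet(latents, t, encoder_hidden_states=prompt_embeds).sample
```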
0 votes · 0 answers · 70 views

ValueError: If no `decoder_input_ids` or `decoder_inputs_embeds` are passed, `input_ids` cannot be `None`

I am trying to get the decoder hidden state of the florence 2 model. I was following this https://huggingface.co/microsoft/Florence-2-large/blob/main/modeling_florence2.py to understand the parameters ...
user10418143
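That ValueError comes from the encoder-decoder forward having nothing to feed the decoder; seeding decoder_input_ids with the decoder start token lets the pass run and exposes the hidden states. A hypothetical sketch: the config attribute path and argument names are assumptions against modeling_florence2.py, not verified:

```python
import torch

start_id = model.config.text_config.decoder_start_token_id  # location assumed
decoder_input_ids = torch.tensor([[start_id]], device=model.device)

outputs = model(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    decoder_input_ids=decoder_input_ids,
    output_hidden_states=True,
    return_dict=True,
)
decoder_hidden = outputs.decoder_hidden_states  # tuple, one tensor per layer
```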
0 votes · 1 answer · 74 views

AutoModelForSequenceClassification loss does not decrease

from datasets import load_dataset from torch.utils.data import DataLoader from transformers import AutoTokenizer, AutoModelForSequenceClassification import torch from tqdm import tqdm def ...
naivebird
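Since the excerpt cuts off before the training loop, here is a minimal reference loop to compare against; flat losses in this setup usually trace back to a missing optimizer.zero_grad(), a learning rate far from the fine-tuning range, or labels never reaching the model so out.loss is meaningless:

```python
import torch

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for batch in train_loader:                    # names assumed from the excerpt
    batch = {k: v.to(model.device) for k, v in batch.items()}
    out = model(**batch)                      # batch must include "labels"
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()                     # easy to forget; gradients accumulate otherwise
```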
0 votes · 0 answers · 142 views

Hugging Face accelerate device error when running evaluation

I am running some experiments on a multi-GPU cluster, and I'm using accelerate. I'm trying to calculate some metrics after every batch iteration in the training dataloader. While the training code ...
M. Koopmans
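With accelerate, per-batch metrics have to be gathered across processes (and onto consistent devices) before being computed; accelerate's gather_for_metrics exists for exactly this. A sketch assuming an accelerator, model outputs, and a labeled batch from the training loop:

```python
# inside the batch loop, after the forward pass
preds = outputs.logits.argmax(dim=-1)

# gathers from all processes and trims duplicated samples from the last batch
all_preds, all_labels = accelerator.gather_for_metrics((preds, batch["labels"]))
accuracy = (all_preds == all_labels).float().mean()
```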
0 votes · 1 answer · 132 views

How to reinitialize GPT-2 XL from scratch in Hugging Face?

I'm trying to confirm that my GPT-2 model is being trained from scratch, rather than using any pre-existing pre-trained weights. Here's my approach: Load the pre-trained GPT-2 XL model: I load a pre-...
Charlie Parker
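The standard pattern is to build the model from the config alone rather than from_pretrained, which guarantees random initialization; only the small config file is fetched, never the weights:

```python
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config.from_pretrained("gpt2-xl")  # architecture hyperparameters only
model = GPT2LMHeadModel(config)                 # randomly initialized weights

# sanity check: a from-scratch model should give near log(vocab_size) loss,
# far above what the pretrained checkpoint produces on the same batch.
```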
