All Questions
88 questions
2 votes · 1 answer · 93 views
Error in getting Captum text explanations for text classification
I have the following code, which I am using to identify the most influential words behind correct predictions on the test dataset:
import pandas as pd
import torch
from torch.utils.data import ...
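A minimal sketch of embedding-level attribution with Captum's LayerIntegratedGradients; the checkpoint name, the forward wrapper, and the target class index below are illustrative assumptions, not the asker's actual setup:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from captum.attr import LayerIntegratedGradients

# Placeholder checkpoint; substitute the fine-tuned classifier from the question.
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def forward_func(input_ids, attention_mask):
    # Return class logits so Captum can attribute a chosen target class.
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

enc = tokenizer("This movie was surprisingly good", return_tensors="pt")
baseline_ids = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)

# Attribute against the embedding layer rather than the integer token ids.
lig = LayerIntegratedGradients(forward_func, model.get_input_embeddings())
attributions = lig.attribute(
    enc["input_ids"],
    baselines=baseline_ids,
    additional_forward_args=(enc["attention_mask"],),
    target=1,  # class index to explain
)
# Sum over the embedding dimension to get one influence score per token.
token_scores = attributions.sum(dim=-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
print(list(zip(tokens, token_scores.tolist())))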
0 votes · 0 answers · 64 views
Memory increasing after hugging face generate method
I wanted to run inference with the codegemma model from Hugging Face, but when I use the model.generate(**inputs) method, peak GPU memory usage increases from 39 GB to 49 GB. With the torch profiler no ...
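The jump in peak memory during generation usually comes from the growing key/value cache plus activation buffers rather than a leak. A minimal defensive sketch, assuming model, tokenizer, and inputs are already set up as in the question:
import gc
import torch

model.eval()
with torch.inference_mode():            # generation needs no autograd buffers
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,             # bounds the KV cache, which grows with output length
        use_cache=True,
    )
text_out = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Release cached allocator blocks between calls if peak usage is the concern.
del output_ids
gc.collect()
torch.cuda.empty_cache()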
0 votes · 0 answers · 30 views
NaN loss when training LSTM-Attention
During model training, the loss value suddenly became NaN. Even though I changed the parameters a lot, it still fails.
I checked for errors during training, and it prints the error in the output, not the ...
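Sudden NaNs in recurrent models are most often exploding gradients or a non-finite input; a minimal sketch of the usual defenses, assuming generic model, optimizer, criterion, and loader objects:
import torch

torch.autograd.set_detect_anomaly(True)   # reports the first op that produces NaN/Inf

for inputs, targets in loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    if not torch.isfinite(loss):
        print("non-finite loss, skipping batch")
        continue
    loss.backward()
    # Clip exploding gradients, a common cause of sudden NaNs in LSTMs.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()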
0 votes · 1 answer · 84 views
SBERT fine-tuning always stops before finishing all epochs
I'm working on a project using SBERT pre-trained models (specifically MiniLM) for text classification with 995 classes. I am following the steps laid out here for the most part ...
0 votes · 0 answers · 69 views
Transformer Model Repeating Same Codon During Inference Despite High Training Accuracy
I'm working on a transformer-based model to translate amino acids to codons. During training and validation, my model achieves 95-98% accuracy. However, during inference, I encounter an issue where ...
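Repetitive output at inference time, despite high teacher-forced accuracy, typically points at the decoding loop rather than the weights. A greedy-decoding sketch for a generic encoder-decoder whose forward is model(src, tgt) -> logits of shape (batch, tgt_len, vocab); that signature and the BOS/EOS ids are assumptions about the asker's model:
import torch

def greedy_decode(model, src, bos_id, eos_id, max_len=200):
    model.eval()
    ys = torch.full((src.size(0), 1), bos_id, dtype=torch.long, device=src.device)
    with torch.no_grad():
        for _ in range(max_len):
            logits = model(src, ys)                       # re-run on the tokens decoded so far
            next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
            ys = torch.cat([ys, next_token], dim=1)
            if (next_token == eos_id).all():
                break
    return ys
If the model still repeats one codon with a loop like this, the usual suspects are a missing causal mask on the decoder during training or feeding the full ground-truth target, rather than the decoded prefix, at inference.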
0 votes · 1 answer · 534 views
ImportError and TypeError Issues in Nougat OCR with BARTDecoder and cached_property
I'm facing issues while running an OCR process using Nougat with two different errors for two different users. The errors are related to importing cached_property and an unexpected keyword argument ...
0 votes · 0 answers · 55 views
Text to Openpose and Weird RNN bugs
I want to create an AI that generates an OpenPose output from a textual description. For example, if the input is "a man running", the output would be like the image I provided. Is there any model architecture recommended ...
1 vote · 2 answers · 882 views
Why do I get different embeddings when I perform batch encoding in huggingface MT5 model?
I am trying to encode some text using Hugging Face's mt5-base model. I am using the model as shown below:
from transformers import MT5EncoderModel, AutoTokenizer
model = MT5EncoderModel.from_pretrained(...
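One common cause is padding: in a batch, shorter texts are padded, and pooling over the padded positions changes the vectors compared with encoding each text alone. A mask-aware mean-pooling sketch, assuming google/mt5-base as in the question:
import torch
from transformers import MT5EncoderModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = MT5EncoderModel.from_pretrained("google/mt5-base")
model.eval()

texts = ["a short sentence", "a noticeably longer sentence than the first one"]
batch = tokenizer(texts, padding=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state          # (batch, seq_len, dim)

# Exclude padding positions when pooling, otherwise the padded batch
# yields different sentence vectors than encoding each text on its own.
mask = batch["attention_mask"].unsqueeze(-1).float()
sentence_emb = (hidden * mask).sum(dim=1) / mask.sum(dim=1)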
0 votes · 1 answer · 127 views
Why is the token embedding different from the embedding produced by the BartForConditionalGeneration model?
Why are the two embeddings different even when I generate them using the same BartForConditionalGeneration model?
The first embedding is generated by combining the token embedding and positional embedding from
...
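For reference, a small sketch of the two quantities being compared, assuming facebook/bart-base; the encoder output is not just token plus positional embeddings, since BART also applies an embedding layer norm and the encoder's self-attention layers on top:
import torch
from transformers import BartForConditionalGeneration, AutoTokenizer

model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model.eval()

ids = tokenizer("a test sentence", return_tensors="pt").input_ids
with torch.no_grad():
    # Raw token embeddings: just the lookup table, no positions, no layer norm.
    token_emb = model.get_input_embeddings()(ids)
    # Encoder output: token + positional embeddings, embedding layer norm,
    # and every encoder layer applied on top.
    enc_out = model.model.encoder(input_ids=ids).last_hidden_state

print(torch.allclose(token_emb, enc_out))  # False, as expected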
0 votes · 1 answer · 359 views
How to convert Spacy Model .pkl file into .pt/.pth pytorch supported format
I have a spaCy model in .pkl format that I am using for inference. The type of the unpickled .pkl object is <class 'spacy.lang.en.English'>. I want to make the inference script run on GPU. I tried using ...
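A spaCy pipeline is not a plain PyTorch module, so there is no direct .pkl to .pt/.pth conversion; the usual route to GPU inference is to activate spaCy's GPU ops before the pipeline is created. A sketch, with the pickle path as a placeholder and the assumption that the pickled object is a full spacy.lang.en.English pipeline:
import pickle
import spacy

# Requires cupy; call this before loading/unpickling the pipeline so that
# thinc allocates the model's tensors on the GPU.
spacy.prefer_gpu()          # or spacy.require_gpu() to fail loudly without a GPU

with open("model.pkl", "rb") as f:   # placeholder path
    nlp = pickle.load(f)

doc = nlp("Sample text for inference")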
0 votes · 2 answers · 297 views
Create a multilingual chatbot
I created a chatbot using PyTorch and I want to make it support the French language. Note that I want to train the chatbot so that it can respond to technical questions.
One of the things that came to ...
0 votes · 1 answer · 1k views
OutOfMemoryError: CUDA out of memory in LLM
I have a list of texts and I need to send each text to a large language model (llama2-7b). However, I am getting a CUDA out-of-memory error. I am running on an A100 on Google Colab. Here is my attempt:
path = "...
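A defensive sketch for this setup: load the weights in half precision, generate one text at a time under no_grad, truncate long inputs, and release cached blocks between texts. The checkpoint name and generation limits are illustrative, not taken from the question:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"   # stands in for the truncated path above
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,    # ~14 GB of weights instead of ~28 GB in fp32
    device_map="auto",
)
model.eval()

texts = ["first document ...", "second document ..."]
results = []
for text in texts:
    inputs = tokenizer(text, return_tensors="pt",
                       truncation=True, max_length=2048).to(model.device)
    with torch.no_grad():                       # no activations kept for backprop
        out = model.generate(**inputs, max_new_tokens=128)
    results.append(tokenizer.decode(out[0], skip_special_tokens=True))
    torch.cuda.empty_cache()                    # free allocator blocks between texts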
2 votes · 1 answer · 2k views
How does one reinitialize the weights of a Hugging Face LLaMA v2 model the official way, as in the original model?
I want to reinitialize the weights of a LLaMA v2 model I'm using/downloading. I went through all the documentation and the source code from their Hugging Face code:
https://github.com/huggingface/...
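For what it's worth, the initialization path the library itself uses is the model's _init_weights, which runs when a model is built from its config instead of from pretrained weights. A sketch, with the checkpoint name as an assumption:
from transformers import AutoConfig, AutoModelForCausalLM

name = "meta-llama/Llama-2-7b-hf"         # placeholder checkpoint
config = AutoConfig.from_pretrained(name)

# from_config builds the architecture and runs the library's own weight
# initialization; no pretrained weights are loaded at all.
model = AutoModelForCausalLM.from_config(config)

# To re-randomize an already loaded model, the commonly used equivalent is
# to apply the same hook to every submodule:
# model.apply(model._init_weights)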
2 votes · 2 answers · 2k views
How to get perplexity per token rather than average perplexity?
I can get the perplexity of a whole sentence from here:
device = "cuda"
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
device = "cuda"
model_id = "gpt2"
...
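A sketch of per-token perplexity: keep the per-position cross-entropy terms with reduction="none" instead of letting the loss average over the sequence, then exponentiate each one. The model and sample text below are illustrative:
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = "cuda"
model_id = "gpt2"
model = GPT2LMHeadModel.from_pretrained(model_id).to(device)
tokenizer = GPT2TokenizerFast.from_pretrained(model_id)

input_ids = tokenizer("The quick brown fox jumps over the lazy dog",
                      return_tensors="pt").input_ids.to(device)
with torch.no_grad():
    logits = model(input_ids).logits                 # (1, seq_len, vocab)

# Shift so that position i predicts token i + 1, then keep one NLL per token.
shift_logits = logits[:, :-1, :]
shift_labels = input_ids[:, 1:]
nll = torch.nn.functional.cross_entropy(
    shift_logits.transpose(1, 2), shift_labels, reduction="none"
).squeeze(0)

per_token_ppl = nll.exp()                            # one perplexity per predicted token
tokens = tokenizer.convert_ids_to_tokens(input_ids[0])[1:]
for tok, ppl in zip(tokens, per_token_ppl.tolist()):
    print(f"{tok!r}: {ppl:.2f}")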
0 votes · 1 answer · 180 views
How does an instance of pytorch's `nn.Linear()` process a tuple of tensors?
In the annotated transformer's implementation of multi-head attention, three tensors (query, key, value) are all passed to an nn.Linear(d_model, d_model):
# some class definition ...
self.linears = ...
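In that implementation the linear layers never see a tuple: zip pairs one nn.Linear with one tensor, and each projection acts on the last dimension of its own input. A small self-contained sketch of the same pattern:
import torch
import torch.nn as nn

d_model, batch, seq_len = 512, 2, 10
linears = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(4)])

query = key = value = torch.randn(batch, seq_len, d_model)

# Each Linear receives a single (batch, seq_len, d_model) tensor and projects
# its last dimension; the tuple only exists in the Python list comprehension.
q, k, v = [lin(x) for lin, x in zip(linears, (query, key, value))]
print(q.shape)   # torch.Size([2, 10, 512])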