68 questions
1
vote
0
answers
64
views
Version of transformers to use for including support for the Gemma model/tokenizer
fastapi==0.109.0
uvicorn[standard]==0.27.0
pydantic==2.6.0
torch==2.2.0
transformers==4.36.2
huggingface-hub==0.19.4
numpy==1.26.3
onnxruntime==1.17.0
optimum[onnxruntime]==1.16.0
requests==2.31.0
...
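A quick way to reason about a question like this is to compare the pinned `transformers` version with the release that first shipped the model. The sketch below is stdlib-only and works on the pins shown in the excerpt. The minimum version `4.38.0` is an assumption (to my understanding, Gemma support first appeared in that release; check the transformers release notes before relying on it). The helper names `parse_pins`, `version_tuple`, and `meets_minimum` are hypothetical.

```python
# Pins copied from the excerpt above (the excerpt is truncated, so this list is partial).
REQUIREMENTS = """
fastapi==0.109.0
uvicorn[standard]==0.27.0
pydantic==2.6.0
torch==2.2.0
transformers==4.36.2
huggingface-hub==0.19.4
"""

# ASSUMPTION: first transformers release with Gemma support; verify in the release notes.
ASSUMED_GEMMA_MINIMUM = "4.38.0"


def parse_pins(requirements_text):
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Drop extras such as "[standard]" so "uvicorn[standard]" becomes "uvicorn".
        name = name.split("[", 1)[0].strip().lower()
        pins[name] = version.strip()
    return pins


def version_tuple(version):
    """Turn '4.36.2' into (4, 36, 2); only handles plain numeric versions."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(pinned, minimum):
    """True when the pinned version is at least the minimum version."""
    return version_tuple(pinned) >= version_tuple(minimum)


pins = parse_pins(REQUIREMENTS)
print(pins["transformers"], meets_minimum(pins["transformers"], ASSUMED_GEMMA_MINIMUM))
```

Under that assumption, the pinned `4.36.2` falls below the minimum, so the pin would need to be raised. Other pins such as `huggingface-hub` may then need adjusting as well.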
0
votes
1
answer
60
views
Generating partial string as output after fine-tuning T5 model
I'm using a fine-tuned T5 model to perform spell checks on my dataset of reviews. However, I'm facing an issue where the model, when performing spell checks, does not give the entire string ...
6
votes
3
answers
6k
views
How to fix the transformer installation issue?
I am trying to train a T5 model for spellchecking by providing it with a sample CSV file in Google Colab. Initially I ran the code on my personal device (Python installed locally) and it executed ...
0
votes
1
answer
36
views
Why does my NLP model reload many times when processing a question?
After receiving a question, my program calls the run_predict function, then finds the paragraph that best matches the question.
After that, my model is constantly reloaded and I cannot tell why.
from ...
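The excerpt is truncated, so the actual cause is unknown. One common source of repeated loading, though, is constructing the model inside the function that handles each question. The following is a minimal sketch of a load-once pattern, assuming that is the situation. `DummyModel` and `get_model` are hypothetical stand-ins; only the name `run_predict` comes from the question.

```python
from functools import lru_cache


class DummyModel:
    """Hypothetical stand-in for a model that is expensive to load."""

    load_count = 0  # counts how many times the model has been constructed

    def __init__(self):
        DummyModel.load_count += 1

    def predict(self, question):
        return f"answer to: {question}"


@lru_cache(maxsize=1)
def get_model():
    """Construct the model on first use and reuse that instance afterwards."""
    return DummyModel()


def run_predict(question):
    # The handler asks for the cached model instead of building a new one.
    return get_model().predict(question)


for q in ["first question", "second question", "third question"]:
    run_predict(q)

print(DummyModel.load_count)  # prints 1
```

If the reload happens because a web server or multiprocessing setup spawns several worker processes, each worker would still load its own copy, and this pattern alone would not prevent that.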
1
vote
0
answers
84
views
Giving the same answer for every question for a given context
I am using Simple Transformers Question & Answer BERT model. I have created a custom dataset for question and answer model and I have trained the model using the custom dataset. It is asked to ...
0
votes
1
answer
141
views
Simple transformers ConvAI Model completely freezes and crashes PC. Two different problems on two different machines
I am trying to load and train a ConvAI model and am fairly new to the whole concept. I keep running into two main problems: one on my personal computer, one on a corporate machine.
On my personal machine, it ...
0
votes
1
answer
681
views
Custom Multihead Attention class leaks data for causal attention
I have been working on implementing a custom multi-head attention class in PyTorch for a Transformer model for learning purposes. My implementation lacks any functionality; I just want to make it work ...
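For context on what "leaking" means here: in causal attention, position i may only attend to positions j <= i, so every attention weight above the diagonal should be exactly zero. The sketch below illustrates that property in plain Python rather than PyTorch, so it does not reproduce the asker's class. The names `causal_mask` and `masked_softmax` are hypothetical.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when query position i is allowed to see key position j."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax that assigns zero weight to masked-out positions."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        kept = [s for s, keep in zip(row_scores, row_mask) if keep]
        peak = max(kept)  # subtract the row maximum for numerical stability
        exps = [
            math.exp(s - peak) if keep else 0.0
            for s, keep in zip(row_scores, row_mask)
        ]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Arbitrary raw attention scores for a sequence of length 3.
scores = [
    [0.5, 2.0, -1.0],
    [1.0, 0.0, 3.0],
    [0.2, 0.1, 0.3],
]
weights = masked_softmax(scores, causal_mask(3))

# A leak would show up as a non-zero weight above the diagonal.
leaks = [(i, j) for i in range(3) for j in range(3) if j > i and weights[i][j] != 0.0]
print(leaks)  # prints []
```

The same check applied to the attention weights of a real implementation (any non-zero entry above the diagonal) is a simple way to confirm whether the mask is being applied before the softmax.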
0
votes
1
answer
68
views
Use of Params in PySpark
In this example, I am trying to use overrides as a Params object and I want it to be used as a list of strings.
But I am not able to assign its value using the code below.
class _AB(Params):
...
1
vote
0
answers
136
views
Issue with adding evaluation dataset with T5Model Training
I am training a T5 model from simpletransformers. I am getting an error in the following line -
model.train_model(train, eval_data = eval_data)
The ERROR is as follows -
*/usr/local/lib/python3.10/...
-1
votes
2
answers
368
views
simpletransformers use_cuda=True not working
I wanted to try CUDA (I have an RTX 3070 Ti) on my Windows setup, using this code:
import pandas as pd
from simpletransformers.classification import ClassificationModel
from sklearn.model_selection ...
1
vote
0
answers
1k
views
AttributeError: tensorflow has no attribute io
I am trying to train a seq2seq model using the simpletransformers library. While using the Seq2Seq model, I constantly get this error:
import tensorflow as tf
from simpletransformers.seq2seq import ...
1
vote
0
answers
448
views
How do you load a model from a checkpoint using simple transformers?
I am using the simple transformers library. I have just finished training a model and now I want to load it to try making some predictions. However, I must be doing something wrong because it keeps ...
1
vote
1
answer
645
views
BERT NER detect multiple words as one entity
I'm using BERT to train a custom NER model with the simpletransformers package. I have 2 custom entities: place and other.
In the dataset, the word column contains multiple words for a particular label in a single row, e.g.
...
3
votes
0
answers
987
views
Using .generate function for beam search over predictions in custom model extending TFPreTrainedModel class
I want to use Hugging Face's .generate() functionality in my model's predictions.
My model is a custom model inheriting from the "TFPreTrainedModel" class and has a custom transformer ...
0
votes
1
answer
2k
views
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! when using transformers architecture
I am having a multi-GPU problem while practicing transformers in PyTorch. All the training I previously did in PyTorch worked just by wrapping the model object in nn.DataParallel. However, ...