All Questions
9,651 questions
0
votes
1
answer
16
views
Cannot interpret 'dtype('int64')' in NLP Python Code from adashofdata
I am trying to run the NLP project shared at https://github.com/adashofdata/nlp-in-python-tutorial, but I am encountering an issue with the code in 4-Topic-Modeling.ipynb. I am running the code on ...
1
vote
1
answer
30
views
import gensim binary incompatibility
import gensim
import numpy
import scipy
print("gensim version:", gensim.__version__)
print("numpy version:", numpy.__version__)
print("scipy version:", scipy.__version__)
...
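A binary-incompatibility error on import often means the installed packages were built against different versions of one another, in which case the `import` lines above can fail before any version is printed. A minimal sketch, assuming only the standard library, that reads the installed versions from package metadata without importing the packages themselves; the helper name `installed_versions` is hypothetical:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package: version string or None} without importing the packages."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["gensim", "numpy", "scipy"]).items():
        print(f"{name} version:", ver or "not installed")
```

Comparing the reported versions against the ranges the gensim release declares is then a manual step; the sketch only collects the numbers.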
3
votes
0
answers
32
views
Can older spaCy models be ported to future spaCy versions?
The latest spaCy versions have better performance and compatibility for GPU acceleration on Apple devices, but I have an existing project that depends on spaCy 3.1.4 and some of the specific behavior ...
0
votes
0
answers
55
views
Sentencepiece not generating models after preprocessing (SOLVED)
So this is the log that I see on the terminal:
sentencepiece_trainer.cc(78) LOG(INFO) Starts training with :
trainer_spec {
input: C:\Users\xxxx\OneDrive\Documents\Projects\py\xxxxx\data\...
0
votes
1
answer
107
views
Why does Presidio with the spaCy NLP engine not recognize organizations and PESEL while spaCy does?
I'm using spaCy with the pl_core_news_lg model to extract named entities from Polish text. It correctly detects both organizations (ORG) and people's names (PER):
import spacy
nlp = spacy.load("...
0
votes
1
answer
72
views
SFTTrainer error: prepare_model_for_kbit_training() got an unexpected keyword argument 'gradient_checkpointing_kwargs'
I'm trying to fine-tune a model using SFTTrainer from trl.
This is what my SFTConfig arguments look like:
from trl import SFTConfig
training_arguments = SFTConfig(
output_dir=output_dir,
...
0
votes
0
answers
32
views
How to modify a step or a prompt of an existing LangChain chain (customize SelfQueryRetriever)?
I need to customize a SelfQueryRetriever (the reason: the target queries generated for OpenSearch are incorrect, so we need to tune the prompts, and we need to add some custom behavior ...
0
votes
0
answers
36
views
Converting data into spaCy format with "convert_to_spacy_format" for a named entity recognition model
Can somebody help me convert the data into spaCy format for an NER model?
The dataset format is shown in the screenshot here (https://www.kaggle.com/datasets/naseralqaydeh/...
1
vote
0
answers
42
views
How can I deploy and run a Flask web application using heavy NLP libraries (pandas, numpy, sklearn) on a SiteGround shared hosting plan?
I have a Flask-based web application that performs NLP tasks using libraries like pandas, numpy, sklearn, and nltk. I've tried deploying it to my current hosting (SiteGround shared hosting plan), but ...
2
votes
0
answers
40
views
How to normalize ingredient names in a recipe dataset and handle NOUN + NOUN cases using spaCy in Python?
I'm working on normalizing ingredient names from a recipe dataset using Python and spaCy. My goal is to extract only the relevant ingredients and ignore measurement units, fractions, and other ...
0
votes
0
answers
47
views
Calculate the gradient with respect to the attention layers and also the FFN layers of a pre-trained LLM
I would like to return the gradient with respect to specific layers and the FFN layer in the Transformer architecture of a pre-trained LLM from Hugging Face. Is that even possible?
I am ...
0
votes
1
answer
40
views
Store images instead of showing them on a server
I am running the code found on this site on my server, and I would like to store images instead of showing them, since I am connected to the server remotely via SSH....
0
votes
1
answer
74
views
Upserting in Pinecone takes too long
I'm trying to upsert reviews that I've scraped into Pinecone. For the embedding model I'm using jina-embedding-v3. For 204 reviews this takes around 2.5 hours in Colab! I tried using a GPU but the ...
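The excerpt does not show the upsert loop, so the cause of the slowdown is unknown. One thing worth checking is whether the reviews are embedded and upserted one at a time. A minimal sketch of batching both steps, where `embed_batch` and `upsert_batch` are hypothetical callables standing in for the embedding model and the index client (they are not real Pinecone or Jina APIs):

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def chunks(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def upsert_in_batches(
    ids: Sequence[str],
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    upsert_batch: Callable[[List[Tuple[str, List[float]]]], None],
    batch_size: int = 32,
) -> int:
    """Embed and upsert in batches; return the number of upsert calls made."""
    if len(ids) != len(texts):
        raise ValueError("ids and texts must have the same length")
    calls = 0
    for id_chunk, text_chunk in zip(chunks(ids, batch_size), chunks(texts, batch_size)):
        vectors = embed_batch(text_chunk)  # one model call per batch
        upsert_batch(list(zip(id_chunk, vectors)))  # one request per batch
        calls += 1
    return calls
```

With 204 reviews and `batch_size=32`, this makes 7 embedding calls and 7 upsert requests instead of 204 of each. Whether that removes the bottleneck depends on where the time is actually spent, which is worth measuring first.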
0
votes
0
answers
52
views
How to pass additional query filters to a SelfQueryRetriever?
We are implementing a SelfQueryRetriever using OpenSearch as the vector store. In general it works fine, generating the metadata filters from the user query, but we need some way to append other filters to ...
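At the level of the OpenSearch query DSL, appending fixed filters to a generated one amounts to wrapping all clauses in a `bool` query's `filter` list. A minimal sketch of that combination step only; the helper `combine_filters` and the field names are hypothetical, and where the combined filter gets handed to the retriever depends on the LangChain version in use, which the excerpt does not show:

```python
from typing import Any, Dict, Iterable, Optional

Clause = Dict[str, Any]


def combine_filters(generated: Optional[Clause], extra: Iterable[Clause]) -> Optional[Clause]:
    """AND a generated filter clause with extra clauses using bool/filter."""
    clauses = [c for c in (generated, *extra) if c]  # skip None and empty dicts
    if not clauses:
        return None
    if len(clauses) == 1:
        return clauses[0]
    return {"bool": {"filter": clauses}}


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    from_llm = {"term": {"metadata.genre.keyword": "comedy"}}
    always = [{"term": {"metadata.tenant_id.keyword": "acme"}}]
    print(combine_filters(from_llm, always))
```

Clauses under `filter` must all match and do not affect scoring, which is usually what a mandatory access or tenant restriction needs.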
0
votes
0
answers
31
views
spaCy rules matching entities before text
I'm trying to write a spaCy parser to extract the names and terms of a contract.
To do that, I've written a rule to extract the sellers and buyers, except it's extracting multiple times over a simple ...