All Questions

0 votes
1 answer
16 views

Cannot interpret 'dtype('int64')' in NLP Python Code from adashofdata

I am trying to run the NLP project shared at https://github.com/adashofdata/nlp-in-python-tutorial, but I am encountering an issue with the code in 4-Topic-Modeling.ipynb. I am running the code on ...
mymiracl's user avatar
  • 579
1 vote
1 answer
30 views

import gensim binary incompatibility

import gensim import numpy import scipy print("gensim version:", gensim.__version__) print("numpy version:", numpy.__version__) print("scipy version:", scipy.__version__) ...
Nguyễn Anh Minh's user avatar
3 votes
0 answers
32 views

Can older spaCy models be ported to future spaCy versions?

The latest spaCy versions have better performance and compatibility for GPU acceleration on Apple devices, but I have an existing project that depends on spaCy 3.1.4 and some of the specific behavior ...
synchronizer's user avatar
  • 2,105
0 votes
0 answers
55 views

Sentencepiece not generating models after preprocessing (SOLVED)

So this is the log that I see on the terminal: sentencepiece_trainer.cc(78) LOG(INFO) Starts training with : trainer_spec { input: C:\Users\xxxx\OneDrive\Documents\Projects\py\xxxxx\data\...
Crazy Programmer's user avatar
0 votes
1 answer
107 views

Why does Presidio with spacy nlp engine not recognize organizations and PESEL while spaCy does?

I'm using spaCy with the pl_core_news_lg model to extract named entities from Polish text. It correctly detects both organizations (ORG) and people's names (PER): import spacy nlp = spacy.load("...
Maltion's user avatar
  • 79
0 votes
1 answer
72 views

SFTTrainer Error : prepare_model_for_kbit_training() got an unexpected keyword argument 'gradient_checkpointing_kwargs'

I'm trying to fine-tune a model using SFTTrainer from trl. This is what my SFTConfig arguments look like: from trl import SFTConfig training_arguments = SFTConfig( output_dir=output_dir, ...
sabira kabeer's user avatar
0 votes
0 answers
32 views

how to modify a step or a prompt of an existing langchain chain (customize SelfQueryRetriever)?

I need to customize a SelfQueryRetriever (the reason is: the target queries in OpenSearch are being generated incorrectly, so we need to tune prompts + we need to add some custom behavior ...
Luis Leal's user avatar
  • 3,534
0 votes
0 answers
36 views

Converting data into spaCy format ("convert_to_spacy_format") in a named entity recognition model

Can somebody help me convert the data into spaCy format for the NER model? The dataset format is shown in the screenshot here (https://www.kaggle.com/datasets/naseralqaydeh/...
Rohit Gupta's user avatar
1 vote
0 answers
42 views

How can I deploy and run a Flask web application using heavy NLP libraries (pandas, numpy, sklearn) on a SiteGround shared hosting plan?

I have a Flask-based web application that performs NLP tasks using libraries like pandas, numpy, sklearn, and nltk. I've tried deploying it to my current hosting (SiteGround shared hosting plan), but ...
bsraskr's user avatar
  • 695
2 votes
0 answers
40 views

How to normalize ingredient names in a recipe dataset and handle NOUN + NOUN cases using spaCy in python?

I'm working on normalizing ingredient names from a recipe dataset using Python and spaCy. My goal is to extract only the relevant ingredients and ignore measurement units, fractions, and other ...
Островська Катя's user avatar
0 votes
0 answers
47 views

Calculate the gradient with respect to attention but also the FFN layers for pre-trained LLMs

I would like to return the gradient with respect to specific layers and the FFN layer in the Transformer architecture of pre-trained LLMs from Hugging Face. Is that even possible? I am ...
Jose Ramon's user avatar
  • 5,410
0 votes
1 answer
40 views

Store images instead of showing in a server

I am running the code found on this site on my server, and I would like to store images instead of showing them, since I am connected to the server remotely via SSH....
Jose Ramon's user avatar
  • 5,410
0 votes
1 answer
74 views

Upserting in Pinecone takes too long

I'm trying to upsert reviews that I've scraped into Pinecone. For the embedding model I'm using jina-embedding-v3. For 204 reviews this takes around 2.5 hours in Colab! Tried using GPU but the ...
Daaku-C5's user avatar
0 votes
0 answers
52 views

how to pass additional query filters to a SelfQueryRetriever?

We are implementing a SelfQueryRetriever using OpenSearch as the vector store. In general it works fine generating the metadata filters from the user query, but we need some way to append other filters to ...
Luis Leal's user avatar
  • 3,534
0 votes
0 answers
31 views

Spacy rules matching entities before text

I'm trying to write a spacy parser to extract the names and terms of a contract. To do that, I've written a rule to extract the sellers and buyers, except it's extracting multiple times over a simple ...
kernel density's user avatar
