Newest 'nlp+python+pandas' Questions

0 votes

1 answer

59 views

Catalog sentences into 5 words that represent them

I have a dataframe with 1000 text rows in df['text']. I also have 5 words, and for each one I want to know how much it represents the text (a score between 0 and 1). Every score will be in df["word1&...

rafine

471

asked Dec 19, 2024 at 10:16

0 votes

1 answer

84 views

Counting the Frequency of Some Words within some other Key Words in Text

I have two sets of word lists - first one I called search words and the second one I called key words. My goal is to calculate the frequency of search words within 10 words of key words. For example, ...

Sharif

391

asked Dec 5, 2024 at 3:05

-1 votes

2 answers

106 views

With spaCy, how can I get all lemmas from a string?

I have a pandas data frame with a column of text values (documents). I want to apply lemmatization on these values with the spaCy library using the pandas apply function. I've defined my to_lemma ...

Patrick

2,347

asked Oct 12, 2024 at 21:03

3 votes

1 answer

49 views

Text summarization with deep learning

I'm fine-tuning the mT5 model on the Arabic part of the XL-Sum dataset for ten epochs, and the resulting model was stored in the Hugging Face library. There were good results on the training ...

Noor

31

asked Aug 1, 2024 at 23:25

1 vote

2 answers

70 views

Identify starting row of actual data in Pandas DataFrame with merged header cells

My original df looks like this: df. Note that in the data frame the headers run through row 3, and the values for those headers start from row 4 onwards. The number of rows & columns ...

Debojit Roy

11

asked Jul 20, 2024 at 10:55

0 votes

0 answers

79 views

Named entity recognition (NER) task on a large dataset from a data frame column using chunking, and append the results to the original data frame

I want to perform a NER task on a column of a dataframe. The shape of the dataframe is: import pandas df.shape() (1312, 12) Now the column I wanted to use is called the TEXT column for the ...

ARJ

2,080

asked May 2, 2024 at 8:50

-1 votes

1 answer

181 views

removing paywall language from piece of text (pandas) [closed]

I'm trying to do some preprocessing on my dataset. Specifically, I'm trying to remove paywall language from the text (in bold below) but I keep getting an empty string as my output. Here is the sample ...

Yves

47

asked Mar 15, 2024 at 6:30

0 votes

2 answers

94 views

How to optimize the function which uses looping on lists on pandas dataframe?

I am using a function on a pandas dataframe as : import spacy from collections import Counter # Load English language model nlp = spacy.load("en_core_web_sm") # Function to filter out only ...

Atom Store

1,016

asked Mar 14, 2024 at 2:56

0 votes

1 answer

61 views

Matching strings containing 'and' in different languages and ampersands

Suppose that in 2 different data frames df1, df2 I have 2 columns df1['film'] = pd.Series(['Beavis & Butthead', 'Bonnie e Clyde', 'Adam & Eve']) df2['film'] = pd.Series(['Beavis und Butthead', ...

Azamat Bagatov

149

asked Feb 29, 2024 at 21:58

0 votes

0 answers

57 views

Is there a faster method to process pandas list of string values

There are approximately 13,000 values in a given column. The function below takes a list of strings as input and does NER tagging for each word in the list. On average there ...

srinivas muralidharan

39

asked Feb 14, 2024 at 10:38

1 vote

1 answer

49 views

Error in unit testing on pre-processing raw data

import pandas as pd import spacy from spacy.lang.en.stop_words import STOP_WORDS import nltk nlp = spacy.load("en_core_web_md") class fileread: def readfile(self): file_path = '...

vinamrata

11

asked Feb 14, 2024 at 6:56

0 votes

1 answer

102 views

Keras ValueError: cannot reshape array of size

I'm facing an error which I can't understand using Keras for a prediction task. Here is my code: import numpy as np import pandas as pd from sklearn.preprocessing import MinMaxScaler from keras.models ...

sobhan soleimani

1

asked Jan 21, 2024 at 19:32

0 votes

1 answer

296 views

NLP preprocessing text in Data Frame, what is the correct order?

I’m trying to preprocess a data frame with two columns, called "title" and "body". Each cell contains a string. Based on this article I tried to reproduce the preprocessing. ...

Louis

341

asked Nov 26, 2023 at 14:53

0 votes

1 answer

131 views

NLP pre-processing on two columns in data frame gives error

I have the following data frame: gmeDateDf.head(2) title score id url comms_num body timestamp It's not about the money, it's about sending a... 55.0 l6ulcx https://v.redd.it/6j75regs72e61 6.0 NaN ...

Louis

341

asked Nov 22, 2023 at 22:54

1 vote

1 answer

50 views

Using a Word Counter in Python is understating results

As a preface, I am a beginner and still learning. Here's the sample schema of my products review table. Record_ID Product_ID Review Comment 1234 89847457 I love this product it was shipped ...

user14452102

19

asked Oct 30, 2023 at 22:25
