Newest 'nlp+python+nltk' Questions

0 votes

0 answers

44 views

Count of Combination of bigrams

I have created a dataset as follows using bigrams: index | product_action -------------------------------------------------------| ('customer', 'called') | action ('customer', '...

Jane

1

asked Feb 28 at 10:47

1 vote

1 answer

69 views

How do I remove escape characters from output of nltk.word_tokenize?

How do I get rid of non-printing (escaped) characters from the output of the nltk.word_tokenize method? I am working through the book 'Natural Language Processing with Python' and am following the ...

green_ruby

51

asked Feb 18 at 20:10

1 vote

0 answers

28 views

I am getting error while running this line of code gnb.fit(df_train, y_train)

Title: ValueError: could not convert string to float when training GaussianNB for SMS Spam Detection Body: I'm building an SMS spam detection tool and encountering an error while predicting with a ...

Aditya Kumar

11

asked Jan 25 at 8:09

0 votes

0 answers

77 views

Issues with nltk's ne_chunk

I have been trying to use nltk's entity chunker, and tried different approaches but I keep getting the error: LookupError Traceback (most recent call last) ... ...

Sarah Tomori

13

asked Jan 10 at 12:19

1 vote

1 answer

55 views

Getting all leaf words (reverse stemming) into one Python List

On the same lines as the solution provided in this link, I am trying to get all leaf words of one stem word. I am using the community-contributed (@Divyanshu Srivastava) package get_word_forms Imagine ...

JodeCharger100

1,069

asked Dec 27, 2024 at 15:04

0 votes

2 answers

74 views

Determining most popular words in the English dictionary within a dictionary of words

Forgive me if my wording is awful, but I'm trying to figure out how to determine the most used words in the English language from a set of words in a dictionary I've made. I've done some research on ...

Harrison Sills

33

asked Dec 19, 2024 at 10:24

1 vote

1 answer

107 views

How to extract specific entities from unstructured text

Given a generic text sentence (in a specific context) how can I extract word/entities of interest belonging to a specific "category" using python and any NLP library? For example given a ...

Riccardo Raffini

396

asked Nov 26, 2024 at 15:46

1 vote

0 answers

47 views

How to parse multiple chunks in nltk?

Is it possible to parse multiple chunks in a single nltk.regexp parser? Can a grammar have multiple chunks defined like this? def parser(s): grammar = """ NP: {<DT>?<JJ>...

konto

11

asked Aug 18, 2024 at 10:30

7 votes

2 answers

11k views

Unable to use nltk functions

I was trying to run some nltk functions on the UCI spam message dataset but ran into this problem of word_tokenize not working even after downloading dependencies. import nltk nltk.download('punkt') ...

Utsav Jana

71

asked Aug 12, 2024 at 15:17

0 votes

0 answers

59 views

why isn't tf.keras.layers.TextVectorization accepting standardization=None?

I'm still trying to get this to work (and to learn!) so I am using a tiny corpus. I do some preprocessing on the text in order to get specific bi-gram collocations using nltk (not relevant here but I ...

DS14

131

asked Jun 27, 2024 at 10:39

1 vote

0 answers

1k views

How do I install the nltk library's "averaged_perceptron_tagger" on railway server?

Hi, I am building an API with Django REST Framework for generating a PowerPoint slide using the python-pptx package. I'm also using the NLTK (Natural Language Toolkit) library to process text by tokenizing and ...

Ini-ubong Isemin

11

asked May 23, 2024 at 6:45

0 votes

1 answer

47 views

How to optimize this function and improve running time?

I have a function aimed at creating a data-frame with three columns: bigram-phrase, count (of the bigram-phrase), and PMI score (for the bigram-phrase). Since I want to run this on a large dataset with ...

98fly

41

asked May 10, 2024 at 15:18

0 votes

2 answers

330 views

Extracting only technical keywords from a text using RAKE library in Python

I want to use RAKE to extract technical keywords from a job description that I've found on LinkedIn, which looks like this: input = "In-depth understanding of the Python software development ...

Fatemeh

3

asked Apr 28, 2024 at 5:52

1 vote

1 answer

43 views

How can i get the first content of a python synsets list?

I have a scraped text stored under the variable "message". I have removed the stop words and stored the result in the variable "without_stop_words". I ...

Abuchi

67

asked Mar 23, 2024 at 14:53

-1 votes

1 answer

181 views

removing paywall language from piece of text (pandas) [closed]

I'm trying to do some preprocessing on my dataset. Specifically, I'm trying to remove paywall language from the text (in bold below) but I keep getting an empty string as my output. Here is the sample ...

Yves

47

asked Mar 15, 2024 at 6:30
