All Questions

0 votes
0 answers
44 views

Count of Combination of bigrams

I have created a dataset as follows using bigrams: index | product_action -------------------------------------------------------| ('customer', 'called') | action ('customer', '...
Jane
  • 1
1 vote
1 answer
69 views

How do I remove escape characters from output of nltk.word_tokenize?

How do I get rid of non-printing (escaped) characters from the output of the nltk.word_tokenize method? I am working through the book 'Natural Language Processing with Python' and am following the ...
green_ruby
1 vote
0 answers
28 views

I am getting an error while running this line of code gnb.fit(df_train, y_train)

Title: ValueError: could not convert string to float when training GaussianNB for SMS Spam Detection Body: I'm building an SMS spam detection tool and encountering an error while predicting with a ...
Aditya Kumar
0 votes
0 answers
77 views

Issues with nltk's ne_chunk

I have been trying to use nltk's entity chunker, and tried different approaches but I keep getting the error: LookupError Traceback (most recent call last) ... ...
Sarah Tomori
1 vote
1 answer
55 views

Getting all leaf words (reverse stemming) into one Python List

Along the same lines as the solution provided in this link, I am trying to get all leaf words of one stem word. I am using the community-contributed (@Divyanshu Srivastava) package get_word_forms. Imagine ...
JodeCharger100
0 votes
2 answers
74 views

Determining most popular words in the English dictionary within a dictionary of words

Forgive me if my wording is awful, but I'm trying to figure out how to determine the most used words in the English language from a set of words in a dictionary I've made. I've done some research on ...
Harrison Sills
1 vote
1 answer
107 views

How to extract specific entities from unstructured text

Given a generic text sentence (in a specific context), how can I extract words/entities of interest belonging to a specific "category" using Python and any NLP library? For example, given a ...
Riccardo Raffini
1 vote
0 answers
47 views

How to parse multiple chunks in nltk?

Is it possible to parse multiple chunks in a single nltk.regexp parser? Can a grammar have multiple chunks defined like this? def parser(s): grammar = """ NP: {<DT>?<JJ>...
konto
  • 11
7 votes
2 answers
11k views

Unable to use nltk functions

I was trying to run some nltk functions on the UCI spam message dataset but ran into this problem of word_tokenize not working even after downloading dependencies. import nltk nltk.download('punkt') ...
Utsav Jana
0 votes
0 answers
59 views

why isn't tf.keras.layers.TextVectorization accepting standardization=None?

I'm still trying to get this to work (and to learn!) so I am using a tiny corpus. I do some preprocessing on the text in order to get specific bi-gram collocations using nltk (not relevant here but I ...
DS14
  • 131
1 vote
0 answers
1k views

How do I install the nltk library's "averaged_perceptron_tagger" on railway server?

Hi, I am building an API with Django REST Framework for generating a PowerPoint slide using the python-pptx package. I'm also using the NLTK (Natural Language Toolkit) library to process text by tokenizing and ...
Ini-ubong Isemin
0 votes
1 answer
47 views

How to optimize this function and improve running time?

I have a function aimed at creating a DataFrame with three columns: bigram phrase, count (of the bigram phrase), and PMI score (for the bigram phrase). Since I want to run this on a large dataset with ...
98fly
  • 41
0 votes
2 answers
330 views

Extracting only technical keywords from a text using RAKE library in Python

I want to use RAKE to extract technical keywords from a job description that I've found on LinkedIn, which looks like this: input = "In-depth understanding of the Python software development ...
Fatemeh
1 vote
1 answer
43 views

How can I get the first content of a Python synsets list?

I have scraped text stored in the variable "message". I have removed the stop words and stored the result in the variable "without_stop_words". I ...
Abuchi
  • 67
-1 votes
1 answer
181 views

removing paywall language from piece of text (pandas) [closed]

I'm trying to do some preprocessing on my dataset. Specifically, I'm trying to remove paywall language from the text (in bold below) but I keep getting an empty string as my output. Here is the sample ...
Yves
  • 47
