Newest 'nlp+stanford-nlp+python' Questions

0 votes

1 answer

220 views

Stanford Stanza sometimes splits a sentence into two sentences

I am using Stanza 1.6.1. I have been experimenting with Stanza's constituency parser. In certain cases it splits a sentence into 2 Sentence objects. For example, take this sentence: Pull up Field ...

zaki41

81

asked Jan 18, 2024 at 15:41

0 votes

1 answer

435 views

How to make the Stanza lemmatizer return just the lemma instead of a dictionary?

I'm implementing Stanza's lemmatizer because it works well with Spanish texts, but the lemmatizer returns a whole dictionary with ID and other characteristics I don't care about for the time being. I ...

trashparticle

13

asked Dec 5, 2023 at 23:30
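
One possible approach to the lemmatizer question above is to pull only the `lemma` field out of the per-word dictionaries. The sketch below is a minimal illustration that assumes the dictionary output has the shape of Stanza's `Document.to_dict()` (a list of sentences, each a list of per-token dicts). The helper name `extract_lemmas` and the sample data are hypothetical and hand-written, not real pipeline output.

```python
def extract_lemmas(doc_dicts):
    """Return one list of lemmas per sentence, skipping entries without a lemma."""
    return [
        [tok["lemma"] for tok in sentence if tok.get("lemma") is not None]
        for sentence in doc_dicts
    ]

# Hand-written sample in the assumed shape (not real pipeline output).
sample = [
    [
        {"id": 1, "text": "Los", "lemma": "el"},
        {"id": 2, "text": "gatos", "lemma": "gato"},
        {"id": 3, "text": "duermen", "lemma": "dormir"},
    ]
]
print(extract_lemmas(sample))
```

With a live pipeline, the same idea is usually written directly against the document objects, for example `[word.lemma for sent in doc.sentences for word in sent.words]`.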

0 votes

0 answers

257 views

How to speed up Stanza lemmatizer by excluding redundant words

Given: I have a small sample document with a limited number of words as follows: d =''' I go to school by the school bus everyday with all of my best friends. There are several students who also take ...

doplano

1,601

asked Jul 14, 2023 at 11:31
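
A common way to avoid repeated work on redundant words is to lemmatize each distinct surface form once and reuse the result. The sketch below is only an illustration of that caching idea: `lemmatize_unique`, `toy_lemmatize`, and `TOY_LEMMAS` are hypothetical names, and the toy function stands in for a real lemmatizer. Note the assumption that a word's lemma does not depend on its context, which is an approximation for a context-aware lemmatizer.

```python
def lemmatize_unique(words, lemmatize_fn):
    """Lemmatize each distinct surface form once and reuse the cached result."""
    cache = {}
    lemmas = []
    for word in words:
        if word not in cache:
            cache[word] = lemmatize_fn(word)  # the expensive call happens only here
        lemmas.append(cache[word])
    return lemmas

# Hypothetical stand-in for a real lemmatizer; it records how often it is called.
calls = []
TOY_LEMMAS = {"students": "student", "friends": "friend"}

def toy_lemmatize(word):
    calls.append(word)
    return TOY_LEMMAS.get(word, word)

words = ["school", "bus", "school", "students", "friends", "students"]
lemmas = lemmatize_unique(words, toy_lemmatize)
print(lemmas)
print(len(calls), "lemmatizer calls for", len(words), "words")
```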

0 votes

1 answer

441 views

Calculating similarity score in contexto.me clone

I am currently trying to clone the popular browser game contexto.me and I am having trouble working out how to calculate the similarity score between two words (the target word and the user inputted ...

FarajSiddique

3

asked Jul 3, 2023 at 0:44
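
One common way to score how close two words are is cosine similarity between their embedding vectors, then ranking a guess against the rest of the vocabulary. The sketch below illustrates that idea only; it is not known to be how contexto.me computes its score. The function names and the two-dimensional `toy_vectors` are hypothetical; a real clone would load pretrained word embeddings instead.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_of_guess(target, guess, vectors):
    """1-based rank of `guess` among all non-target words, most similar first."""
    scores = {
        word: cosine_similarity(vectors[target], vec)
        for word, vec in vectors.items()
        if word != target
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(guess) + 1

# Toy 2-D vectors for illustration only (not real embeddings).
toy_vectors = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "dog": [0.7, 0.7],
    "car": [0.0, 1.0],
}
print(rank_of_guess("cat", "dog", toy_vectors))
```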

-1 votes

1 answer

226 views

Best libraries to classify misclassified categories?

I have a dataset of over 50k rows and around 40% of the categories are misclassified, and I want to use natural language processing to re-classify them using variables that are mostly binary ...

wageeh

84

asked Jun 7, 2023 at 13:38

0 votes

2 answers

266 views

What is Stanford CoreNLP's recipe for tokenization?

Whether you're using the Stanza or CoreNLP (now deprecated) Python wrappers, or the original Java implementation, the tokenization rules that Stanford CoreNLP follows are super hard for me to figure out ...

lrthistlethwaite

546

asked Apr 11, 2023 at 20:16

1 vote

1 answer

355 views

How to get original token position in string from Stanza constituency parse tree?

I am using Stanza to extract noun phrases from texts. I am using this code to extract the NPs and store them according to their depth. nlp = stanza.Pipeline('en', tokenize_pretokenized=True) ...

kachap

11

asked Jan 30, 2023 at 16:05

1 vote

1 answer

167 views

NLP task of arranging words in the correct order?

Is there any state-of-the-art deep learning model that can accomplish the task of arranging a bunch of words in the correct order? For example, Input: boy that killed have must they Expected output: ...

LoUso DeBasura

23

asked Jan 3, 2023 at 12:23

0 votes

1 answer

256 views

Stanford's Stanza NLP: find all words ids for a given span

I am using a Stanza pipeline that extracts both words and named entities. The sentence.entities gives me a list of recognized named entities with their start and end characters. Here is an example: { ...

Robert Alexander

1,211

asked Dec 18, 2022 at 17:37
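
One way to map an entity span back to word ids is to compare character offsets. The sketch below is a minimal illustration assuming each word carries `id`, `start_char`, and `end_char` fields, as in Stanza's dictionary output; words without offsets are skipped, which is an assumption about how multi-word tokens should be handled. The helper name `word_ids_in_span` and the sample data are hypothetical and hand-written.

```python
def word_ids_in_span(words, start_char, end_char):
    """Return ids of words whose character range lies fully inside [start_char, end_char]."""
    return [
        w["id"]
        for w in words
        if w.get("start_char") is not None
        and w.get("end_char") is not None
        and w["start_char"] >= start_char
        and w["end_char"] <= end_char
    ]

# Hand-written offsets for the text "Mario Rossi vive a Roma" (not real pipeline output).
words = [
    {"id": 1, "text": "Mario", "start_char": 0, "end_char": 5},
    {"id": 2, "text": "Rossi", "start_char": 6, "end_char": 11},
    {"id": 3, "text": "vive", "start_char": 12, "end_char": 16},
    {"id": 4, "text": "a", "start_char": 17, "end_char": 18},
    {"id": 5, "text": "Roma", "start_char": 19, "end_char": 23},
]
entity = {"text": "Mario Rossi", "start_char": 0, "end_char": 11}
print(word_ids_in_span(words, entity["start_char"], entity["end_char"]))
```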

2 votes

0 answers

225 views

Stanford Stanza NLP to networkx: superimpose NER entities onto graph of words

Here is a sample program which takes a text (the example is in Italian but Stanza supports many languages) and builds and displays a graph of the words (only certain Parts of Speech) and their ...

Robert Alexander

1,211

asked Dec 16, 2022 at 16:49

1 vote

1 answer

493 views

Obtaining data from both token and word objects in a Stanza Document / Sentence

I am using a Stanford Stanza pipeline on some (Italian) text. The problem I'm grappling with is that I need data from BOTH the Token and Word objects. While I'm able to access one or the other separately ...

Robert Alexander

1,211

asked Dec 3, 2022 at 15:39

1 vote

1 answer

72 views

Retain original document element index of argument passed through sklearn's CountVectorizer() in order to access corresponding part of speech tag

I have a data frame with sentences and the respective part of speech tag for each word. Below is an extract of the data I'm working with (data taken from the SNLI corpus). For each sentence in my ...

OLGJ

452

asked Nov 29, 2022 at 8:39

1 vote

0 answers

1k views

Break Complex/Compound Sentences into Simple Sentences using NLP

I want to break up sentences, i.e. complex/compound sentences that are larger in size and consist of more than two sentences. For example: I like to eat apples but I hate apple juice. Here the ...

DevPy

497

asked Nov 4, 2022 at 12:18

1 vote

0 answers

90 views

NLP / ML Python: variation of topic modeling + summarization? Can someone point me in the right direction?

New to NLP and machine learning. Wondering if someone can point me in the right direction: I'm looking to create a function that takes 2 inputs. - an array of strings (English sentences of varying ...

dv151

115

asked Sep 29, 2022 at 17:16

1 vote

1 answer

2k views

Extracting country name from an address

I have a large dataset with an address column. I would like to extract the countries from the address. In many cases, the address column contains states, cities, and zip codes, but not the country names. ...

kaloon

177

asked Jun 26, 2022 at 19:09
