All Questions
Tagged with nlp stanford-nlp
1,412 questions
2
votes
0
answers
39
views
How to apply semantic tokenization to a sentence in Java using NLP?
Can an NLP model be used to tokenize a sentence based on its semantic meaning?
For example,
for the sentence: If the driver's age is more than 20,
the tokens would be:
Token1: if
Token2: driver age
...
0
votes
0
answers
26
views
Is there a method to load caseless models to Stanford's NLP sentiment analysis?
In the Stanford documentation, the authors mention using caseless models to process case-insensitive text. Namely the ability to load the GATE Twitter POS annotator. It is a POS annotator, but it ...
0
votes
1
answer
95
views
GloVe embedding for empty string
It looks like the embedding for the empty string in the glove.twitter.27B.200d.txt file that's part of this zip file:
https://nlp.stanford.edu/data/glove.twitter.27B.zip
is provided on line 38523, but ...
0
votes
1
answer
219
views
Stanford Stanza sometimes splits a sentence into two sentences
I am using stanza 1.6.1. I have been experimenting with Stanza's constituency parser.
In certain cases it splits a sentence into 2 Sentence objects. For example, take this sentence: Pull up Field ...
0
votes
1
answer
434
views
How to make the Stanza lemmatizer return just the lemma instead of a dictionary?
I'm implementing Stanza's lemmatizer because it works well with Spanish texts, but the lemmatizer returns a whole dictionary with ID and other characteristics I don't care about for the time being. I ...
0
votes
1
answer
326
views
Word2Vec - to be trained on train data or whole data
I wish to create a word2vec model and want to train it on my local data. So, the question is: should I train the word2vec model on my whole data, or should I split the data into train and test and then ...
-2
votes
1
answer
107
views
Is there a method to extract quotes and their related speakers in the French language?
Is there a method to extract quotes and their related speakers, with handling of coreference?
I want the output to be a dict like [{"speaker" : , "quotes": }] and if we don’t find ...
0
votes
1
answer
85
views
How to get Enhanced++ dependency labels with a java command line in the terminal?
I don't really know Java, but I was trying to follow the Stanford NLP parser documentation to get the Enhanced++ dependency labels. This is the line I ran:
java -cp "*" -Xmx2g edu....
1
vote
0
answers
344
views
Is there a way to load Word2Vec embeddings to ChromaDB?
I want to query for similar words using ChromaDB. For example, 'great' should return all the words that are similar to 'great'; in most cases, these would be synonyms. For this, I would like to upload ...
0
votes
0
answers
257
views
How to speed up the Stanza lemmatizer by excluding redundant words
Given:
I have a small sample document with a limited number of words as follows:
d ='''
I go to school by the school bus everyday with all of my best friends.
There are several students who also take ...
1
vote
0
answers
234
views
Stanford CoreNLP Help -- Cannot import edu.stanford.nlp.pipeline
I am trying to build an application in the Eclipse IDE for my resume (my first) and have run into a problem in my main file, where I am trying to import edu.stanford.nlp.pipeline.*; and have been playing ...
0
votes
1
answer
441
views
Calculating similarity score in contexto.me clone
I am currently trying to clone the popular browser game contexto.me, and I am having trouble working out how to calculate the similarity score between two words (the target word and the user inputted ...
-1
votes
1
answer
224
views
Best libraries to classify misclassified categories?
I have a dataset of over 50k rows, and around 40% of the categories are misclassified. I want to use natural language processing to re-classify them using variables that are mostly binary ...
1
vote
1
answer
657
views
My gpt2 code generates a few correct words and then goes into a loop of generating the same sequence again and again
The following gpt2 code for sentence completion generates a few good sentences and then ends in a loop of repetitive sentences.
from transformers import GPT2LMHeadModel, GPT2Tokenizer ...
3
votes
1
answer
713
views
How can I find the cosine similarity between two song lyrics represented as strings?
My friends and I are doing an NLP project on song recommendation.
Context: We originally planned on giving the model a recommended song playlist that has the most similar lyrics based on the random ...
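For reference, one common baseline for comparing two lyric strings is cosine similarity over bag-of-words counts. The snippet below is only a minimal sketch under that assumption, not the asker's approach: the `tokenize` and `cosine_similarity` helpers are hypothetical names introduced here, and the regex tokenizer is a deliberately crude stand-in for a real NLP tokenizer.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into rough word tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two strings."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))

    # Dot product only needs the words that appear in both texts.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)

    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty text has no direction, so treat its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    lyrics_a = "la la love you, love you more"
    lyrics_b = "love me more, la la"
    print(round(cosine_similarity(lyrics_a, lyrics_b), 3))
```

Raw counts treat every word equally, so frequent filler words can dominate the score; TF-IDF weighting or sentence embeddings are the usual next steps if this baseline proves too coarse.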