All Questions
33 questions
0
votes
1
answer
95
views
GloVe embedding for empty string
It looks like the embedding for the empty string in the glove.twitter.27B.200d.txt file that's part of this zip file:
https://nlp.stanford.edu/data/glove.twitter.27B.zip
is provided on line 38523, but ...
2
votes
1
answer
640
views
Cannot download GloVe embeddings. Have they been moved or is downloads.cs.stanford.edu down temporarily?
I am attempting to download glove.840B.300d.zip. I used the link at https://nlp.stanford.edu/projects/glove/ and also ran wget https://nlp.stanford.edu/data/glove.840B.300d.zip. The output from wget ...
0
votes
1
answer
54
views
Can we deduce the relationship between a dimension of a word vector and the linguistic characteristic it represents?
Suppose we generate a 200-dimensional word vector for a word ('hello') using any pre-trained model, as shown in the image below.
Is there any way to tell which linguistic feature is ...
1
vote
0
answers
134
views
Invalid argument: Input to reshape is a tensor with 14155776 values, but the requested shape has 262144
I am trying to use ELMo embeddings to train my LSTM network, but I have a problem with the shape of the tensor.
y-train has shape (67689, 5) and is one-hot encoded (the output has 5 classes)
x-...
0
votes
0
answers
178
views
GloVe model takes a long time to return the n most similar words
I have a list of tokens and I am trying to find the top 10 similar words for each token, but the GloVe model is taking a long time to return them. My code is:
class GloveModel:
def __init__(self)...
0
votes
2
answers
108
views
Pretrained Word Embeddings For Each Year
I am running a task where it would be nice to have different versions of word embeddings across different time periods e.g. embeddings for 2013, 2014, 2015, 2016 ... 2020. This is because I don't want ...
2
votes
1
answer
3k
views
Dealing with out-of-vocabulary words with Gensim pretrained GloVe
I am working on an NLP assignment and loaded the GloVe vectors provided by Gensim:
import gensim.downloader
glove_vectors = gensim.downloader.load('glove-twitter-25')
I am trying to get the word ...
0
votes
1
answer
681
views
Word2Vec: does the word embedding change?
I just wanted to know, given these two sentences:
The bank remains closed on public holidays
Don't go near the river bank
Will the word 'bank' have different word embeddings or the same one? If we use word2vec ...
1
vote
0
answers
407
views
How to update GloVe models?
In my work, I used my own corpus to train a Word2Vec model using gensim. Then I used several small corpora to "update" that model (producing different sets of vectors). This process well ...
0
votes
1
answer
835
views
Word vocabularies generated by Word2Vec and GloVe models are different for the same corpus
I'm using the CoNLL-2003 dataset to generate word embeddings using Word2Vec and GloVe.
The number of words returned by word2vecmodel.wv.vocab is different from (much smaller than) the number in glove.dictionary.
Here is the ...
2
votes
1
answer
425
views
How to get feature names for GloVe vectors
CountVectorizer has feature names, like this:
vectorizer = CountVectorizer(min_df=10,ngram_range=(1,4), max_features=15000)
vectorizer.fit(X_train['essay'].values) # fit has to happen only on train ...
0
votes
1
answer
210
views
How to concatenate GloVe 100d embeddings and a 1d array that contains an additional signal?
I am new to NLP and trying out some text classification algorithms. I have 100d GloVe vectors representing each entry as a list of embeddings. Also, I have an NER feature of shape (2234,) which shows if ...
1
vote
0
answers
2k
views
How to use GloVe word embedding for non-English text
I am trying to run a GloVe word embedding on a Bengali news dataset. Now the original GloVe source doesn't have any supported language other than English but I found this which has word vectors ...
1
vote
1
answer
1k
views
How to compare cosine similarities across three pretrained models?
I have two corpora: one with speeches by women leaders and the other with speeches by men leaders. I would like to test the hypothesis that cosine similarity between two words in one corpus is ...
0
votes
1
answer
1k
views
Understanding usage of GloVe vectors
I used the following code to use GloVe vectors for word embeddings:
from gensim.scripts.glove2word2vec import glove2word2vec #line1
glove_input_file = 'glove.840B.300d.txt' #line2
...