All Questions

0 votes
1 answer
95 views

GloVe embedding for empty string

It looks like the embedding for the empty string in the glove.twitter.27B.200d.txt file that's part of this zip file: https://nlp.stanford.edu/data/glove.twitter.27B.zip is provided on line 38523, but ...
Michael Szczepaniak's user avatar
2 votes
1 answer
640 views

Cannot download GloVe embeddings. Have they been moved or is downloads.cs.stanford.edu down temporarily?

I am attempting to download glove.840B.300d.zip. I used the link at https://nlp.stanford.edu/projects/glove/ and also ran wget https://nlp.stanford.edu/data/glove.840B.300d.zip. The output from wget ...
anna-martin's user avatar
0 votes
1 answer
54 views

Can we deduce the relationship b/w a dimension of a word vector with the linguistic characteristic it represents?

Let's imagine we generated a 200-dimensional word vector for the word 'hello' using any pre-trained model, as shown in the image below. Is there any way to tell which linguistic feature is ...
Rajesh's user avatar
  • 13
1 vote
0 answers
134 views

Invalid argument: Input to reshape is a tensor with 14155776 values, but the requested shape has 262144

I am trying to use ELMo embeddings to train my network with an LSTM, but I have a problem with the shape of the tensor y-train, which has shape (67689, 5) and is one-hot encoded (the output has 5 classes) x-...
rawaa Alatrash's user avatar
0 votes
0 answers
178 views

Glove Model Taking lots of time to give n similar words

I have a list of tokens and I am trying to find the top 10 similar words for each token, but the GloVe model is taking a lot of time to return similar words. My code is: class GloveModel: def __init__(self)...
Hustler's user avatar
  • 15
0 votes
2 answers
108 views

Pretrained Word Embeddings For Each Year

I am running a task where it would be nice to have different versions of word embeddings across different time periods e.g. embeddings for 2013, 2014, 2015, 2016 ... 2020. This is because I don't want ...
jim travis's user avatar
2 votes
1 answer
3k views

Deal with Out of vocabulary word with Gensim pretrained GloVe

I am working on an NLP assignment and loaded the GloVe vectors provided by Gensim: import gensim.downloader glove_vectors = gensim.downloader.load('glove-twitter-25') I am trying to get the word ...
nico_so's user avatar
  • 167
0 votes
1 answer
681 views

Word2Vec- does the word embedding change?

Just wanted to know: given two sentences, "The bank remains closed on public holidays" and "Don't go near the river bank", will the word 'bank' have different word embeddings or the same one? If we use word2vec ...
1 vote
0 answers
407 views

How to update GloVe models?

In my work, I used my own corpus to train a Word2Vec model using gensim. Then I used several small corpora to "update" that model (producing different sets of vectors). This process well ...
Imrul Huda's user avatar
0 votes
1 answer
835 views

Word vocabulary generated by Word2vec and Glove models are different for the same corpus

I'm using the CONLL2003 dataset to generate word embeddings using Word2vec and GloVe. The number of words returned by word2vecmodel.wv.vocab is different from (much smaller than) that of glove.dictionary. Here is the ...
vid1505's user avatar
  • 43
2 votes
1 answer
425 views

How to get feature names for a glove vectors

CountVectorizer has feature names, like this. vectorizer = CountVectorizer(min_df=10,ngram_range=(1,4), max_features=15000) vectorizer.fit(X_train['essay'].values) # fit has to happen only on train ...
Shaurya Sheth's user avatar
0 votes
1 answer
210 views

How to concatenate Glove 100d embedding and 1d array which contains additional signal?

I am new to NLP and trying out some text classification algorithms. I have 100d GloVe vectors representing each entry as a list of embeddings. Also, I have an NER feature of shape (2234,) which shows if ...
sumixam's user avatar
  • 77
1 vote
0 answers
2k views

How to use GloVe word embedding for non-English text

I am trying to run a GloVe word embedding on a Bengali news dataset. Now the original GloVe source doesn't have any supported language other than English but I found this which has word vectors ...
afsara_ben's user avatar
1 vote
1 answer
1k views

How to compare cosine similarities across three pretrained models?

I have two corpora - one with all women leader speeches and the other with men leader speeches. I would like to test the hypothesis that cosine similarity between two words in the one corpus is ...
SanMelkote's user avatar
0 votes
1 answer
1k views

Understanding usage of glove vectors

I used the following code to use GloVe vectors for word embeddings: from gensim.scripts.glove2word2vec import glove2word2vec #line1 glove_input_file = 'glove.840B.300d.txt' #line2 ...
Karanam Krishna's user avatar