All Questions
345 questions
0
votes
1
answer
66
views
Similarity from word to sentence after doing word embedding
I have a dataframe with 1000 text rows.
I trained word2vec on it.
Now I want to create a new field which gives me the distance from each sentence to a word that I choose, let's say the word "king".
I ...
0
votes
0
answers
68
views
Why Is My Skip-Gram Implementation Producing Incorrect Results?
I'm implementing a Skip-Gram model for Word2Vec using Python. However, my model doesn't seem to be working correctly, as indicated by the resulting embeddings and their visualization. Here is an ...
0
votes
2
answers
1k
views
Is it possible to fine-tune a pretrained word embedding model like word2vec?
I'm working on semantic matching in my search engine system. I saw that word embedding can be used for this task. However, my dataset is very limited and small, so I don't think that training a word ...
0
votes
2
answers
75
views
Output of Cosine Similarity is not as expected
I am trying to generate the Cosine similarity between two words in a sentence. The sentence is "The black cat sat on the couch and the brown dog slept on the rug".
My Python code is below:
...
0
votes
0
answers
82
views
How do I split words effectively through TextVectorization function?
Here is the custom_standardization function I'm using for my task.
def custom_standardization(input_data):
    # Lowercase the text and remove punctuation
    stop_words = set(stopwords.words('...
0
votes
1
answer
59
views
Word2Vec to calculate similarity of movies to high-performing movies
I have a dataset with user ratings for movies and movie descriptions like this:
import pandas as pd
df = pd.DataFrame({
    'description': [
        'Two imprisoned men bond over a number of years',
...
0
votes
1
answer
588
views
Generating Vector Embeddings for Organization Names
I have seen a couple of Word2Vec models that can generate embeddings for company names and perform well when different formats of the same company name are given.
But what I want to do is a bit ...
0
votes
1
answer
64
views
Shape of my dataframe (#rows) and that of the final embeddings array don't match
I generated the word embeddings for my corpus (a 2-D list), then tried to generate the average Word2Vec embeddings for each individual word list (that is, for each comment, which has been converted ...
1
vote
0
answers
182
views
Output of cosine_similarity() not as expected (all values equal to 1.)
I have been trying to find the cosine similarity between the vector representation of the tag of a movie (computed using average word2vec) and all the other movies' vector representations (also using avg ...
-1
votes
1
answer
105
views
Trouble getting my gradient descent algorithm to converge (word2vec)
Just for the sake of practising, learning and experimenting, I made a word2vec model from scratch, using the formulas and algorithm of Jurafsky & Martin. Here is the code:
class Word2Vec:
    def ...
0
votes
1
answer
104
views
Word Vector Features preprocessing for ML
I'm training a classifier, where the input is a 300-dimensional word vector.
Usually in machine learning problems, I would scale my inputs to 0-mean and unit variance.
However, scaling the vector ...
0
votes
1
answer
79
views
ValueError when setting array element with a sequence in Python
I'm working on an NLP project and trying to train an LSTM model for sentiment analysis using pre-trained word embeddings. However, I'm facing a ValueError when trying to assign word embeddings to a ...
0
votes
1
answer
63
views
Why is accuracy 0%?
https://github.com/Saranja-Navaneethakumar/WSD_Skipgram/blob/main/skipgram.py
I'm doing word sense disambiguation with a word2vec skip-gram model and train & test datasets - 2017 SemEval benchmark ...
0
votes
0
answers
170
views
Removing words from FastText Model / Converting a .vec file to a .bin file (vec2bin)
I am working with FastText on a language (Tamil) and a task where I don't expect to encounter, and simply don't care about, characters/words from other languages. I have both the text (.vec) and binary (....
0
votes
1
answer
193
views
How to interpret word2vec train output?
Running the code snippet below reports an output of (3, 60). What exactly is it reporting?
The code is reproducible; just copy it into a notebook cell and run it.
from gensim.models import Word2Vec
...
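
Several questions in this list (sentence-to-word similarity, averaged word2vec vectors, unexpected cosine similarity values) come down to the same computation: average the word vectors of a sentence, then take the cosine similarity with a target word's vector. The following is only a minimal sketch of that idea, not the approach of any particular question or answer. It assumes tiny 3-dimensional toy vectors invented for illustration instead of a trained model, and the names `toy_vectors`, `average_vector`, `cosine_similarity` and `sentence_to_word_similarity` are hypothetical.

```python
import math

# Toy embeddings invented for illustration (assumption); a trained
# word2vec model would supply real vectors of much higher dimension.
toy_vectors = {
    "king": [0.9, 0.1, 0.0],
    "queen": [0.8, 0.2, 0.1],
    "cat": [0.0, 0.9, 0.3],
    "sat": [0.1, 0.5, 0.8],
}


def average_vector(tokens, vectors):
    """Return the mean vector of the known tokens, or None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sentence_to_word_similarity(sentence, word, vectors):
    """Similarity between a sentence's averaged vector and one word's vector."""
    sentence_vector = average_vector(sentence.lower().split(), vectors)
    if sentence_vector is None or word not in vectors:
        return None  # nothing to compare: no known tokens or unknown word
    return cosine_similarity(sentence_vector, vectors[word])


if __name__ == "__main__":
    for text in ["The queen", "The cat sat", "Unknown words only"]:
        print(text, "->", sentence_to_word_similarity(text, "king", toy_vectors))
```

Returning `None` for sentences without any known token keeps one result per input row, which is one way to avoid a mismatch between the number of rows and the number of sentence vectors. Out-of-vocabulary tokens such as "the" are simply skipped in this sketch.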