All Questions
24 questions
0 votes · 1 answer · 326 views
How do I freeze only some embedding indices with tied embeddings?
I found a nice way to freeze only some indices of an embedding layer in "Is it possible to freeze only certain embedding weights in the embedding layer in pytorch?".
However, while including it in a ...
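The core trick from the linked question can be sketched as a gradient hook; assuming a hypothetical frozen_ids tensor naming the rows to keep fixed, something like this zeroes their updates, and because tied embeddings share a single Parameter the same hook covers both the input and output use:

import torch
import torch.nn as nn

vocab_size, dim = 100, 16
emb = nn.Embedding(vocab_size, dim)
frozen_ids = torch.tensor([0, 1, 2])  # hypothetical: rows whose vectors stay fixed

def zero_frozen_grads(grad):
    grad = grad.clone()
    grad[frozen_ids] = 0.0  # no update for the frozen rows
    return grad

emb.weight.register_hook(zero_frozen_grads)
# With tied embeddings (an output layer sharing emb.weight), gradients from
# both uses accumulate into this one tensor, so the hook sees the total.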
1 vote · 0 answers · 637 views
Is there a way to use CodeBERT to embed source code without natural language in input?
On CodeBERT's GitHub they provide an example of using an NL-PL pair with the pretrained base model to create an embedding. I am looking to create an embedding using just source code, which does not have ...
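A minimal sketch, assuming the microsoft/codebert-base checkpoint on the Hugging Face Hub: encode the code alone, so the input is just <s> code </s> with no natural-language segment, and take the <s> token vector as the embedding:

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

code = "def add(a, b): return a + b"
inputs = tokenizer(code, return_tensors="pt")  # single sequence, no NL part
with torch.no_grad():
    outputs = model(**inputs)
embedding = outputs.last_hidden_state[:, 0]  # the <s> token vector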
0 votes · 1 answer · 560 views
How to get the attentions part from the output of a BERT model?
I am using a BERT model for query expansion and I am trying to extract keywords from a document I have:
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel....
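A minimal sketch of how attentions are exposed: pass output_attentions=True and transformers returns one tensor per layer, shaped (batch, heads, seq_len, seq_len):

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("query expansion with BERT", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
attentions = outputs.attentions  # tuple of 12 tensors for bert-base
print(attentions[-1].shape)      # torch.Size([1, 12, seq_len, seq_len])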
2 votes · 0 answers · 160 views
Use BERT word-to-vector embeddings on single words, not sentences
How can I use BERT to embed single words rather than sentences?
I have a list of nouns and I need vector versions of these words using BERT. I researched a lot on how to do it, but I could only ...
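A minimal sketch of one common approach: feed the word alone and average its subword vectors, skipping [CLS] and [SEP]. Note this still runs the contextual model on a one-word "sentence"; it is not a true static vector:

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

def word_vector(word):
    inputs = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    return hidden[1:-1].mean(dim=0)                    # drop [CLS]/[SEP]

nouns = ["apple", "infrastructure"]
vectors = {w: word_vector(w) for w in nouns}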
1 vote · 1 answer · 957 views
Calculate cosine similarity between 2 words using BERT
I am trying to calculate the cosine similarity between two given words using BERT, but I am getting an error which says:
IndexError: Dimension out of range (expected to be in range of [-1, 0], but ...
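This particular IndexError typically comes from calling cosine_similarity on 1-D tensors with the default dim=1; a 1-D tensor only has dimension 0. A minimal sketch of the fix, with random stand-ins for the two word vectors:

import torch
import torch.nn.functional as F

v1 = torch.randn(768)  # stand-in for the first word vector
v2 = torch.randn(768)  # stand-in for the second word vector
sim = F.cosine_similarity(v1, v2, dim=0)  # dim=0 for single vectors
print(sim.item())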
0 votes · 1 answer · 809 views
BERT word embeddings
I'm trying to use BERT as a source of static word embeddings, to compare it with Word2Vec, show the differences, and demonstrate that BERT is not really meant to be used in a contextless manner.
This is how (based ...
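For a truly context-free comparison with Word2Vec, a minimal sketch is to read BERT's input (wordpiece) embedding matrix directly, bypassing the transformer layers entirely:

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

emb_matrix = model.get_input_embeddings().weight  # (30522, 768)
token_id = tokenizer.convert_tokens_to_ids("king")
static_vec = emb_matrix[token_id].detach()        # context-free vector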
0 votes · 1 answer · 654 views
Multi-label Token Classification Using Contextual Embeddings For Each Word
I am trying to design a model for an argument mining task on a token-level basis. I have extracted contextual BERT embeddings for each token and stored the embeddings in a DataFrame which looks like ...
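A minimal sketch, assuming each token row holds a 768-d embedding and a multi-hot label vector: a per-token linear head with BCEWithLogitsLoss handles the multi-label case (n_labels here is a placeholder for the argument label count):

import torch
import torch.nn as nn

n_labels = 5                   # hypothetical number of argument labels
head = nn.Linear(768, n_labels)
criterion = nn.BCEWithLogitsLoss()

embeddings = torch.randn(32, 768)                      # batch of token embeddings
labels = torch.randint(0, 2, (32, n_labels)).float()   # multi-hot targets

logits = head(embeddings)
loss = criterion(logits, labels)
loss.backward()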
1 vote · 0 answers · 547 views
BERT embeddings + LSTM for NER
I am working with the CoNLL-2003 dataset for Named Entity Recognition. What I want to do is use the BERT embeddings as input to a simple LSTM. Here's the code:
class Model(nn.Module):
def ...
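A minimal sketch of this architecture: a frozen BERT produces token vectors, and a bidirectional LSTM plus a linear layer tags them (n_tags=9 matches the CoNLL-2003 BIO label set):

import torch
import torch.nn as nn
from transformers import BertModel

class BertLSTMTagger(nn.Module):
    def __init__(self, n_tags=9, hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        for p in self.bert.parameters():  # freeze BERT, train only the LSTM
            p.requires_grad = False
        self.lstm = nn.LSTM(768, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, n_tags)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():
            hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        out, _ = self.lstm(hidden)
        return self.classifier(out)  # (batch, seq_len, n_tags)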
6 votes · 1 answer · 16k views
How to get cosine similarity of word embedding from BERT model
I was interested in how to get the similarity of word embeddings in different sentences from a BERT model (that is, the same word has different meanings in different contexts).
For example:
sent1 =...
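A minimal sketch, assuming the target word tokenizes to a single wordpiece: run each sentence through BERT, pull the hidden state at the word's position, and compare the two contextual vectors:

import torch
import torch.nn.functional as F
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

def word_in_context(sentence, word):
    enc = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    idx = tokens.index(word)  # position of the target word
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    return hidden[idx]

v1 = word_in_context("he sat on the river bank", "bank")
v2 = word_in_context("she deposited cash at the bank", "bank")
print(F.cosine_similarity(v1, v2, dim=0).item())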
0 votes · 1 answer · 746 views
Search for Nearest Neighbours in Bert Embeddings
I am using the bert embeddings to generate similar words using this approach: https://gist.github.com/avidale/c6b19687d333655da483421880441950
It works well for a small dataset, but I am having problems ...
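A minimal sketch of scaling exact nearest-neighbour search: L2-normalise the embedding matrix once, then a single matrix multiply gives all cosine similarities at once. For millions of vectors, an approximate-nearest-neighbour library such as faiss is the usual next step:

import torch
import torch.nn.functional as F

emb = torch.randn(50000, 768)  # stand-in for the embedding table
emb = F.normalize(emb, dim=1)  # after this, dot product == cosine

query = F.normalize(torch.randn(768), dim=0)
scores = emb @ query           # (50000,) similarities in one shot
top = torch.topk(scores, k=10)
print(top.indices, top.values)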
2 votes · 1 answer · 4k views
Compare cosine similarity of word with BERT model
Hi, I am looking to generate similar words for a word using a BERT model, the same way we use gensim's most_similar; I found this approach:
from transformers import BertTokenizer,...
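A minimal sketch of a gensim-style most_similar over BERT's wordpiece vocabulary: compare one word's input embedding against the whole embedding matrix and take the top cosine scores:

import torch
import torch.nn.functional as F
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

emb = F.normalize(model.get_input_embeddings().weight.detach(), dim=1)  # (30522, 768)
word_id = tokenizer.convert_tokens_to_ids("happy")
scores = emb @ emb[word_id]
best = torch.topk(scores, k=6).indices.tolist()
print(tokenizer.convert_ids_to_tokens(best))  # the word itself plus neighbours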
0 votes · 0 answers · 134 views
Word embeddings from BERT to a machine learning model: accuracy is not good
I am trying to solve a product matching task by extracting word embeddings from BERT and feeding them to machine learning models.
However, the accuracy is not good.
I tried different ...
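One common fix worth sketching, under the assumption that raw word vectors were being fed to the classifier: mean-pool the hidden states over the attention mask instead, then feed the pooled sentence vector to a classifier (scikit-learn's LogisticRegression here as a stand-in):

import torch
from transformers import BertTokenizer, BertModel
from sklearn.linear_model import LogisticRegression

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

def pooled(texts):
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state    # (batch, seq, 768)
    mask = enc["attention_mask"].unsqueeze(-1)     # ignore pad tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

X = pooled(["red cotton shirt", "blue denim jeans"])  # toy product titles
y = [0, 1]
clf = LogisticRegression().fit(X, y)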
3 votes · 1 answer · 4k views
How to combine embedding vectors of BERT with other features?
I am working on a classification task with 3 labels (0, 1, 2 = neg, pos, neu). The data are sentences. To produce sentence vectors/embeddings, I use a BERT encoder to get embeddings for each sentence ...
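A minimal sketch of the standard approach: concatenate the BERT sentence vector with the extra features along the feature dimension, then classify the combined vector with a linear head (3 classes: neg/pos/neu):

import torch
import torch.nn as nn

bert_vec = torch.randn(8, 768)  # batch of BERT sentence embeddings
extra = torch.randn(8, 5)       # hypothetical extra features (length, lexicon scores, ...)

combined = torch.cat([bert_vec, extra], dim=1)  # (8, 773)
classifier = nn.Linear(768 + 5, 3)
logits = classifier(combined)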
0 votes · 1 answer · 2k views
Cosine similarity between columns of two different DataFrame
I wanted to compute the cosine similarity between columns of two DataFrames (of different sizes) and store the result in a new one. The similarity is calculated using BERT embeddings.
df1
title
Lorem ipsum ...
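A minimal sketch, assuming the sentence-transformers library for the embedding step (the question only says "BERT embeddings"): compute the full m-by-n cosine matrix between the title columns of the two DataFrames and store it in a new one:

import pandas as pd
from sentence_transformers import SentenceTransformer, util

df1 = pd.DataFrame({"title": ["Lorem ipsum dolor", "sit amet"]})
df2 = pd.DataFrame({"title": ["consectetur adipiscing", "elit sed", "do eiusmod"]})

model = SentenceTransformer("all-MiniLM-L6-v2")
e1 = model.encode(df1["title"].tolist(), convert_to_tensor=True)
e2 = model.encode(df2["title"].tolist(), convert_to_tensor=True)

sim = util.cos_sim(e1, e2)  # (len(df1), len(df2)) similarity matrix
result = pd.DataFrame(sim.numpy(), index=df1["title"], columns=df2["title"])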
1 vote · 0 answers · 316 views
BERT: how to batch vectorize efficiently?
I am trying to convert sentences into vectors using BERT.
def bert_embedding(text):
    # text: list of strings (sentences)
    vector = []
    for sentence in tqdm(text):
        e = bert_tokenizer....
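A minimal sketch of the batched alternative: tokenize a whole slice of the list at once with padding and run the model per batch instead of per sentence:

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

def bert_embedding(texts, batch_size=32):
    vectors = []
    for i in range(0, len(texts), batch_size):
        enc = tokenizer(texts[i:i + batch_size], padding=True,
                        truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state
        vectors.append(hidden[:, 0])  # [CLS] vector per sentence
    return torch.cat(vectors)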