Questions tagged [similarity]

3 votes
1 answer
45 views

I have a large set of document embeddings, and I would like to sample a subset where the median or average pairwise distance is maximized. The idea here is to get a more balanced sample set where long ...
Layman's user avatar
  • 291
2 votes
0 answers
43 views

If we want to implement RAG for a large dataset, which similarity measure works, and why? Also, how can we handle the size of the similarity matrix when using cosine similarity?
user10296606's user avatar
  • 1,906
1 vote
0 answers
71 views

I’m designing a recommendation feature for a student internship platform. Students will explicitly select their interests and skills during registration, and recruiters will post internship ...
Amira's user avatar
  • 11
4 votes
1 answer
209 views

I have two word lists, where each word is representative of each topic. A topic is created from a collection of documents (tweets in this case). Not all words would’ve appeared an equal number of ...
Adam_G's user avatar
  • 141
3 votes
1 answer
79 views

Consider a flight as represented by a dataframe with spatial (latitude, longitude, altitude) ...
Droid's user avatar
  • 131
0 votes
1 answer
123 views

I would like to know the difference between synchrony and similarity with respect to time series data. After some research, I found the explanation below. "Synchrony and similarity are two different ...
mohammed shoab's user avatar
0 votes
1 answer
309 views

I have a RAG pipeline where I want to extract a piece of information called "X". In a regular RAG pipeline, there is a query entered by the user. Then, ...
ahmedmoh123's user avatar
1 vote
0 answers
80 views

If I have two documents, D1 and D2, and a function f which computes the (normalized) document ...
CutePoison's user avatar
1 vote
0 answers
72 views

I trained a Bi-Encoder to get an Augmented SBERT model and obtained a final training result. How can I interpret the following output of the final training result? ...
Christian01's user avatar
4 votes
1 answer
470 views

I am looking for the correct model / approach for the task of checking whether two sentences have the same meaning. I know I can use embeddings to check similarity, but that is not what I am after. I ...
Rob Audenaerde's user avatar
0 votes
1 answer
76 views

Say you have a problem where you have a query and a set of result documents and you want to rank the result documents according to the query. Say also you have embeddings for the query and for the ...
user1893354's user avatar
0 votes
1 answer
158 views

We have different pre-trained models such as BERT, USE, ELMo, Word2Vec, FastText, etc., and documents of different sizes (large, medium, small). Now we want to compute document similarity. How can we ...
tovijayak's user avatar
6 votes
2 answers
2k views

I think this question is one that many beginners run into and I could not find a decent generic guide for it. My issue is the following. I want to evaluate similarity of vectors which have mixed data ...
Chapo's user avatar
  • 63
0 votes
0 answers
132 views

I am trying to cluster similar support tickets in a technical domain. The support tickets are very domain-specific and are written in various styles and lengths, using abbreviations, etc. I made a training-...
Roland's user avatar
  • 1
2 votes
1 answer
50 views

I have an "image" of NxN dimensions in m channels (for reference, m is less than 17) in my training set and validation set. I would like to compare images in the training set with those in ...
Shaz's user avatar
  • 135
