Questions tagged [tsne]
t-SNE (t-distributed stochastic neighbor embedding) is a technique for dimensionality reduction.
69 questions
2
votes
2
answers
192
views
What are the fastest dimensionality reduction techniques to use out of the box?
I am working on an ML project where we would like to visualize movements in a high-dimensional but sparse vector space (e.g. a 1x75 vector where most of the entries are either one-hot encoded binary ...
0
votes
1
answer
228
views
How to preprocess/encode categorical data, to use in dimensionality reduction and clustering algorithms?
I am working on a project with the goal of clustering participants in a survey according to their answers. The dataset is a set of 63 questions, some nominal and some ordinal. How should I encode ...
2
votes
0
answers
27
views
Finding parameters which reveal clustering in t-SNE
These data are from SAMHSA, Mental Health Client-Level Data. I am trying to find the right parameters to obtain clustering as in this paper. Code here.
For now, I'm dropping columns which aren't ...
0
votes
1
answer
218
views
TSNE plots of random data subsets are vastly different but labels are still clearly separated - what conclusions can we draw about the dataset?
I scraped a dataset of match data in a video game and labeled them according to their outcome (0 for loss, 1 for win). I wanted to see if there was actually any inherent relationship between the ...
1
vote
0
answers
170
views
How to do search/cluster over a million points?
I have a practical question in the areas of clustering and semantic search and would like to get some thoughts. Refer to the figure for more details on this hypothetical situation.
Imagine I have 2 query ...
0
votes
1
answer
90
views
How do I interpret low-dimensional embeddings of high-dimensional embeddings?
I am trying to understand what I am supposed to learn about a problem when using dimensionality reduction methods. In particular, I am referring to methods like t-SNE and UMAP.
For the most part I am ...
0
votes
1
answer
3k
views
Dimension reduction of Word Embeddings: PCA vs. TSNE
I am pretty new to DS. I have a general question regarding the limitations of visualizing word embeddings using PCA.
I've learned so far that when using PCA (e.g. with ...
1
vote
2
answers
2k
views
Issues with audio embedding using wav2vec
I am having issues with audio embedding using the wav2vec library while trying to classify emotions using audio signals from the EMODB dataset (Emotions dataset in German). I am using the following ...
1
vote
2
answers
667
views
How to interpret differences between 2D and 3D T-SNE visualization of similar words from Word2Vec embedding?
I have created a Word2Vec model based on the transcript of The Office. I am now trying to visualize the embedding space for the top similar words of an input word with t-SNE in 2D and 3D. I ...
0
votes
1
answer
637
views
Does t-SNE have to result in clear clusters / structures?
I have a data set which, no matter how I tune t-SNE, won't resolve into clearly separate clusters or even patterns and structures. Ultimately, it results in arbitrarily distributed data points all over the ...
2
votes
1
answer
630
views
Does it make sense to use t-SNE and then apply HDBSCAN to cluster?
I believe the title is self-explanatory. Does it make sense to use t-SNE and then apply HDBSCAN to cluster the data after dimensionality reduction?
2
votes
1
answer
4k
views
Can t-SNE be applied to visualize time series datasets?
I have multiple time-series datasets containing 9 IMU sensor features. Suppose I use the sliding window method to split all these data into samples with the sequence length of 100, i.e. the dimension ...
0
votes
0
answers
191
views
PCA vs t-SNE in asset pricing
So I am trying dimensionality reduction techniques on the S&P500 FY2020 data.
I understand the CAPM model and the fact that doing a PCA determines my market variability factor (the first PCA ...
1
vote
0
answers
28
views
t-SNE - how variance is set and how it affects dense vs sparse clusters in HD space
When learning about t-SNE, I found a resource saying "width of the normal curve (a gaussian centered at $x_i$) depends on the density of data near the point of interest". Which is why we do ...
2
votes
1
answer
1k
views
Visualizing outliers using T-SNE
I'm trying to visualize outliers in my data using T-SNE and it seems like the outliers appear as three different clusters. The original data has 7 different columns but I chose to plot the outliers on ...