
Questions tagged [clustering]

Cluster analysis, or clustering, is the task of grouping a set of objects so that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups. It is a main task of exploratory data mining and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, and information retrieval.

5 votes
1 answer
69 views

I am working on a binary classification modelling project and was wondering whether running clustering on the numeric features, to create groupings as an additional categorical feature, would be beneficial, under a ...
user54565's user avatar
  • 115
7 votes
1 answer
90 views

My dataset consists of board games data: each board game is rated with a categorical variable (low, medium, high). I've plotted the LDA projection to check whether classes are linearly separable. The ...
Giulio Lanza's user avatar
3 votes
1 answer
45 views

I have a large set of document embeddings, and I would like to sample a subset where the median or average pairwise distance is maximized. The idea here is to get a more balanced sample set where long ...
Layman's user avatar
  • 291
6 votes
1 answer
115 views

I am trying to automatically extract clusters by density from image embeddings for exploratory analysis. The idea is to find repeating patterns in my dataset, which can be very specific or more general; ...
Layman's user avatar
  • 291
1 vote
0 answers
38 views

I'd appreciate your thoughts on the following problem. I've created a heatmap plot (attached) showing the cluster membership ratio for each participant (in separate subplots) and condition (η). Now, I'...
maria mystakidou's user avatar
1 vote
0 answers
48 views

How can I visualise a hierarchical ontology of items in embedding space, combining text embeddings with the graphical structure? (Something similar to the example below) I have a hierarchical ...
baked goods's user avatar
7 votes
1 answer
304 views

I am following the example code in the linkage documentation: ...
user2153235's user avatar
7 votes
1 answer
133 views

I am educating myself on hierarchical clustering and the relevant SciPy methods. The 1st argument of the linkage method is a 1D condensed distance matrix $X$ of ...
user2153235's user avatar
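
For readers unfamiliar with the condensed form mentioned in the excerpt above, here is a minimal sketch of how it relates to the square distance matrix. The toy `points` array is a hypothetical example and not taken from the question; `pdist`, `squareform`, and `linkage` are the standard SciPy functions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

# Hypothetical toy data: 4 points in 2-D (not from the question).
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])

# pdist returns the condensed form: the upper triangle of the square
# distance matrix, read row by row, with length n * (n - 1) / 2.
condensed = pdist(points, metric="euclidean")

# squareform converts between the condensed vector and the square matrix.
square = squareform(condensed)

# linkage accepts the condensed vector as its first argument.
Z = linkage(condensed, method="single")
```

With n = 4 points the condensed vector has 6 entries, and the linkage matrix `Z` has n - 1 = 3 rows, one per merge.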
3 votes
1 answer
89 views

I am working on a cluster analysis. I have 4 clusters with about 35,000 datapoints. I got relatively strong clusters. I am in marketing and this is for segmentation. One of these clusters has a very ...
David Orndorf's user avatar
7 votes
1 answer
148 views

SciPy's fclusterdata requires the coordinates of M points in N dimensional space (or M observations of N dimensions each). My data is in the form of pairwise ...
user2153235's user avatar
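
One possible way to approach the situation in the excerpt above, sketched under the assumption that the pairwise distances are available as a full symmetric matrix: skip `fclusterdata` and call `linkage` and `fcluster` directly. The matrix `D`, the `average` method, and the threshold `t=3.0` are illustrative choices, not values from the question.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical symmetric pairwise-distance matrix for 4 items.
D = np.array([
    [0.0, 1.0, 6.0, 7.0],
    [1.0, 0.0, 5.0, 6.0],
    [6.0, 5.0, 0.0, 1.5],
    [7.0, 6.0, 1.5, 0.0],
])

# Convert the square matrix to the condensed form that linkage expects;
# checks=True validates symmetry and a zero diagonal.
condensed = squareform(D, checks=True)

# Build the hierarchy from the precomputed distances.
Z = linkage(condensed, method="average")

# Cut the tree into flat clusters at a cophenetic distance threshold.
labels = fcluster(Z, t=3.0, criterion="distance")
```

Here items 0 and 1 end up in one flat cluster and items 2 and 3 in another, because the two groups only merge at an average distance of 6.0, above the threshold.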
2 votes
0 answers
67 views

Problem description: I have a dataset which is a combination of multiple sources gathering the same kind of data. I have retrieved those data to fit them into several columns of a pandas dataframe. All ...
patacoing's user avatar
202 votes
13 answers
319k views

My data set contains a number of numeric attributes and one categorical. Say, NumericAttr1, NumericAttr2, ..., NumericAttrN, CategoricalAttr, where ...
IgorS's user avatar
  • 5,484
6 votes
2 answers
525 views

We have ~30 audio snippets, of which around 50% are from the same speaker, who is our target speaker, and the rest are from various different speakers. We want to extract all audio snippets from our ...
Yes's user avatar
  • 201
3 votes
0 answers
69 views

I have monthly sales data from a set of online merchants that sell on an online shop using a cloud-based software solution. The data look something like this: month merchant_id shop_id shop_country ...
Max's user avatar
  • 31
2 votes
1 answer
100 views

I already have a GLM model in place to predict claims frequency. I now have access to many new variables (a mix of categorical and continuous variables, some of which are likely correlated). I wish ...
InsurancePricer's user avatar
