Questions tagged [clustering]

Cluster analysis is the task of partitioning data into subsets of objects according to their mutual "similarity," without using preexisting knowledge such as class labels. [Clustered-standard-errors and/or cluster-samples should be tagged as such; do NOT use the "clustering" tag for them.]

0 votes
0 answers
23 views

I have 3 months of categorized bank transaction data and need to identify recurring cash inflows and outflows for lending risk modeling. Complications: 1. Income dates shift earlier when payday falls ...
Awande Ntombela
0 votes
0 answers
33 views

In a recent bioinformatics paper, the authors describe a statistical/machine learning approach to classify clusters of cells using kernel density estimation (KDE) and Z-scores. While the details of ...
Michiel.Tawdarous
1 vote
1 answer
50 views

Suppose I have two multi-dimensional population samples - $A$ and $B$. I hypothesise that $\mathbb{E}[A]$ and $\mathbb{E}[B]$ are orthogonal in this high-dimensional space. To test this hypothesis, I ...
sunnydk • 127
1 vote
0 answers
32 views

I have an interesting problem that I am trying to solve, and I cannot find any non-deep methods for it. Problem description (plain): the real-life problem this relates to is handwritten digits ...
Ryan Folks
2 votes
1 answer
46 views

I am trying to subset data based on a pattern of "strings" or clusters of food deliveries to young that I see in my data (see plots labeled 2, 4, 5, 6, and 8 in the figure below for the most ...
thegrayson
0 votes
0 answers
27 views

I'd appreciate your thoughts on the following problem. I've created a heatmap plot (attached) showing the cluster membership ratio for each participant (in separate subplots) and condition (η). Now, I'...
maria mystakidou
2 votes
1 answer
122 views

I am new to working with country-level effects in comparative OLS regression with individual-level data. Are there any good resources for this? Suppose my dependent variable is social integration (an ...
Olestan • 71
0 votes
0 answers
44 views

I am currently working on a project where I need to assign customers across N recipes before A/B testing such that the KPIs for each customer are balanced across recipes (to reduce pre-test bias). Dataset ...
Rishab • 1
0 votes
0 answers
57 views

I am currently working on clustering continuous variables (such as AOV, RPV, and conversions (conversion/visits)). The variables are heavily right-skewed with long tails, and one variable is dominated ...
Rishab • 1
3 votes
1 answer
129 views

I would like to perform clustering with a finite Gaussian mixture model; however, I have missing data (some features are missing at random). I am using variational inference to fit my Bayesian GMM. Is ...
Tom • 1,112
2 votes
0 answers
72 views

I am generating clustering data using the Bayesian mixture of Gaussian models described in Bishop's Pattern Recognition and Machine Learning textbook, with model parameters drawn from the following ...
PJB • 21
1 vote
1 answer
59 views

I have an ordinal survey data set with 5 variables and 3 category levels, e.g. 5 health variables ranked 1-3 (good-moderate-poor). I want to row-cluster different responses. But I also want to determine whether ...
EB3112 • 264
1 vote
0 answers
54 views

When applying k-means clustering, I understand that the goal is to partition the dataset by assigning each point to its nearest cluster center. However, I’ve come across statements that k-means can be ...
EngineerMathlover
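
As a minimal sketch of the nearest-center assignment this excerpt describes, the snippet below runs one assignment step and one center update on toy data. The function names `assign_to_nearest` and `update_centers`, the toy points, and the choice of Euclidean distance are illustrative assumptions, not taken from the question.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign_to_nearest(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: dist(p, centers[j]))
        for p in points
    ]


def update_centers(points, labels, centers):
    """Move each center to the mean of its assigned points.

    A center with no assigned points is left where it is (an assumption;
    other implementations re-seed empty clusters instead).
    """
    updated = []
    for j, center in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            updated.append(tuple(center))
    return updated


# Hypothetical toy data: two well-separated groups in 2-D.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
centers = [(0.0, 0.0), (5.0, 5.0)]

labels = assign_to_nearest(points, centers)
centers = update_centers(points, labels, centers)
print(labels, centers)
```

Repeating these two steps until the labels stop changing gives the usual Lloyd-style iteration.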
1 vote
0 answers
72 views

I've recently learnt unsupervised learning methods such as KMeans and DBSCAN. While working on this dataset, I applied KMeans clustering but faced the following issues: The Elbow Method showed no ...
ssmalik • 41
0 votes
1 answer
60 views

My project has the following steps: Use the elbow method to determine the features and the number of clusters for k-means. Run k-means on the data (with the determined features and n clusters), and gives the ...
Xin Niu • 103
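
The elbow step mentioned in this excerpt can be sketched roughly as follows: compute the within-cluster sum of squares (WCSS) for several values of k and look for the point where the decrease flattens. Everything here is an assumption for illustration: the helper `kmeans_wcss`, the deterministic farthest-point initialisation, and the toy data. It covers only the choice of the number of clusters, not feature selection.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans_wcss(points, k, max_iter=100):
    """Run a plain Lloyd-style k-means and return the final WCSS."""
    # Deterministic farthest-point initialisation (chosen for reproducibility).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged: no label changed
        labels = new_labels
        # Update step: each non-empty cluster's center moves to its mean.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))

    return sum(dist(p, centers[label]) ** 2 for p, label in zip(points, labels))


# Hypothetical toy data: three tight groups of three points each.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]

wcss = {k: kmeans_wcss(points, k) for k in range(1, 5)}
for k, value in wcss.items():
    print(k, round(value, 3))
```

On this toy data the WCSS drops sharply up to k = 3 and only marginally afterwards, which is the "elbow". On real data the bend is often much less clear.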