Questions tagged [k-means]
k-means is a family of cluster analysis methods in which you specify the number of clusters you expect. This is as opposed to hierarchical cluster analysis methods.
444 questions
0
votes
0
answers
11
views
Sales territory optimization
Hi. I am currently working on sales territory optimization. I am using k-means but cannot handle some constraints set by the business, such as planning territories to satisfy constraints such as
max ...
0
votes
0
answers
79
views
What is the best practice to impute missing data with patterns over time? (Potential of K-means clustering for imputation of missing values?)
Years ago, I read a paper that proposed a K-means-based approach to impute missing values in energy time data. At that time, since I did not have access to that data, I tried to ...
3
votes
1
answer
89
views
Calculating closest point on cluster boundary from point of interest
I am working on a cluster analysis. I have 4 clusters with about 35,000 datapoints. I got relatively strong clusters. I am in marketing and this is for segmentation. One of these clusters has a very ...
3
votes
1
answer
110
views
Would K-Means clustering work for cleaning up tabular data with lots of columns?
I just came across a k-means question here and it inspired me to think of k-means as a solution to my challenge.
The challenge: I deal with ecommerce data and no input file I receive is good enough to ...
0
votes
0
answers
42
views
Agglomerative clustering classifies 98% of my data in 1 cluster. Why?
I have a JSD distance matrix that I'm trying to cluster. When generating 24 clusters (roughly the number that shows up on the clustermap), it assigns the vast majority of the data to 1 cluster. Weirdly ...
2
votes
2
answers
140
views
How to Set an Appropriate Max Clusters Number for Elbow Method in Clustering with Large Distance Matrix?
I'm working with a large distance (Jensen-Shannon) matrix (6K x 6K) for clustering, and I'm using the elbow method to determine the optimal number of clusters. However, I'm noticing a significant ...
5
votes
1
answer
141
views
Why can't K-Means be used effectively on TF-IDF text data?
I've been working with text data and using TF-IDF for feature extraction. I want to cluster 1000 Amazon reviews into subcategories. I want to use unsupervised learning. Unfortunately, I read that K-...
1
vote
1
answer
112
views
K-means algorithm for multiple variables
I am new to ML and am currently reading about the K-Means algorithm and trying it out with the ORANGE tool. After going through several examples on YouTube and various other places, I am slightly confused on ...
2
votes
1
answer
61
views
Does performing k-NN on the centroids of clusters obtained from k-means make sense mathematically?
While playing around with some text embeddings, I used k-means clustering to get 4 clusters. I also have the labels for these embeddings, and I may simply use k-NN to classify new embeddings. However, ...
0
votes
1
answer
45
views
What is normalized winning frequency in kernel self-organizing map (SOM)?
In the k-means based kernel SOM, proposed by MacDonald and Fyfe (2000), the update of the mean is based on a soft learning algorithm
m_i(t + 1) = m_i(t) + Λ[φ(x) − m_i(t)]
where Λ is the normalized ...
-1
votes
1
answer
105
views
TypeError: unhashable type: 'slice' K-Means; Custom code for K-Means
Problem Statement
The goal is to have the custom K-Means code run for clusters without using scikit-learn libraries. This is a learning exercise. This K-means has the standard predict, fit, centroids, cluster ...
1
vote
1
answer
43
views
K means clustering of image with k=1 vs mean of all pixels
I have relatively uniformly colored images and I extracted colors using k-means. k-means with k=1 showed the best results for my modeling purposes, k=2 not so much, and with k=3 there ceased to ...
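For readers comparing the two approaches in the title, here is a minimal sketch, assuming pixels are plain RGB tuples and using hypothetical helper names (`mean_color`, `kmeans_k1`) that do not come from the question. It illustrates that with a single cluster the assignment step puts every pixel in that cluster, so the update step returns the overall mean regardless of where the centroid started.

```python
def mean_color(pixels):
    """Per-channel mean of a list of (R, G, B) tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def kmeans_k1(pixels, n_iter=3):
    """Lloyd-style k-means restricted to k=1 (illustrative only)."""
    # Arbitrary initial centroid: the first pixel.
    centroid = tuple(float(v) for v in pixels[0])
    for _ in range(n_iter):
        # Assignment step: with one cluster, every pixel is a member.
        members = list(pixels)
        # Update step: the centroid becomes the mean of its members.
        centroid = mean_color(members)
    return centroid


# Made-up sample pixels, not data from the question.
pixels = [(200, 30, 40), (210, 35, 38), (190, 25, 42), (205, 32, 44)]
print(kmeans_k1(pixels) == mean_color(pixels))
```

Under these assumptions the two results coincide after the first update step, so any difference seen in practice would have to come from preprocessing or the library's configuration rather than from k-means itself.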
0
votes
1
answer
56
views
Enhance clustering with evaluation function
My goal is to partition a dataset (X) in distinct clusters. I'm using k-means to be able to pick the center of each cluster assuming all other datapoints behave the ...
1
vote
2
answers
286
views
Kernel Kmeans formula
I'm trying to implement the Kernel Kmeans algorithm but I struggle with the following formula:
Let's say I have a case in one dimension with three points: 1, 5, 9. Let's say I want two clusters. ...
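The formula the question refers to is not reproduced in this excerpt, so the following is only a sketch under an assumption: that it is the usual kernel-trick expansion of the squared distance between a point and a cluster mean in feature space, K(x, x) − (2/|C|) Σ K(x, y) + (1/|C|²) ΣΣ K(y, z). The function names, the kernels, and the initial labels are hypothetical choices for illustration.

```python
import math


def linear(x, y):
    """Linear kernel: reduces kernel k-means to ordinary k-means."""
    return x * y


def rbf(x, y, gamma=0.1):
    """RBF kernel; gamma is an arbitrary illustrative value."""
    return math.exp(-gamma * (x - y) ** 2)


def feature_space_dist2(x, cluster, kernel):
    """Squared distance from phi(x) to the mean of phi over `cluster`."""
    n = len(cluster)
    cross = sum(kernel(x, y) for y in cluster) / n
    within = sum(kernel(y, z) for y in cluster for z in cluster) / n ** 2
    return kernel(x, x) - 2 * cross + within


def kernel_kmeans(points, labels, k, kernel, n_iter=10):
    """Reassign points to the nearest cluster mean in feature space."""
    for _ in range(n_iter):
        clusters = [[p for p, l in zip(points, labels) if l == c]
                    for c in range(k)]
        new_labels = []
        for p in points:
            # Empty clusters get an infinite distance so they are never chosen.
            dists = [feature_space_dist2(p, cl, kernel) if cl else float("inf")
                     for cl in clusters]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break
        labels = new_labels
    return labels


points = [1, 5, 9]
# With the linear kernel the distance is just (x - mean)^2: (9 - 3)^2.
print(feature_space_dist2(9, [1, 5], linear))
print(kernel_kmeans(points, [0, 0, 1], 2, rbf))
```

The linear kernel is a convenient sanity check, since the result can be compared by hand against the ordinary squared distance to the cluster mean. As with plain k-means, the final labels depend on the initial assignment.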
1
vote
1
answer
133
views
Kernel Kmeans implementation
I'm currently trying to implement Kernel Kmeans from scratch. At the time of writing this post, my implementation works perfectly on the nested circles dataset and even on the Iris dataset (see ...