
Questions tagged [clustering]

Cluster analysis, or clustering, is the task of grouping a set of objects so that objects in the same group (called a cluster) are more similar to each other, in some sense, than to those in other groups (clusters). It is a main task of exploratory data mining and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, and information retrieval.

3 votes
3 answers
196 views

I have a dataset which contains ~15 features. With the elbow method, I found out that the optimal number of clusters is probably four. Therefore, I applied the K-means algorithm with four clusters. ...
1 vote
1 answer
38 views

Assume we have a dendrogram (hierarchical clustering tree). Can we define a partitioning of the data into K clusters by cutting the branches of the tree at some levels below the root node?
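For illustration, here is a minimal pure-Python sketch of one way to get K clusters from a hierarchy. It assumes single linkage and Euclidean distance, neither of which is stated in the question, and the function name `cut_into_k_clusters` and the sample points are hypothetical. Stopping the bottom-up merging once K clusters remain gives the same partition as one horizontal cut of the single-linkage dendrogram; it does not cover cutting different branches at different levels.

```python
from itertools import combinations
from math import dist


def cut_into_k_clusters(points, k):
    """Merge the two closest clusters (single linkage) until k remain.

    Assumption: single linkage with Euclidean distance. Stopping at k
    clusters corresponds to one horizontal cut of that dendrogram.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Start with every point in its own cluster (the leaves of the tree).
    clusters = [[i] for i in range(len(points))]

    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single-linkage distance: closest pair across the two clusters.
            d = min(
                dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # a < b always holds, so deleting b does not shift index a.
        clusters[a].extend(clusters[b])
        del clusters[b]

    return [sorted(c) for c in clusters]


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.3, 4.9), (10.0, 0.0)]
partition = cut_into_k_clusters(points, 3)
print(partition)
```

The returned lists hold indices into `points`, so the partition can be mapped back to the original rows. This brute-force version is only meant for small inputs.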
0 votes
2 answers
155 views

I want to predict which device was used in which room. For this, I have device and sensor data. My idea was to create a feature vector like this: ...
2 votes
1 answer
270 views

I'm attempting to use minhash to generate clusters and similarities, and I am primarily using ideas from these resources. http://www2007.org/papers/paper570.pdf https://chrisjmccormick.wordpress.com/...
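As a rough sketch of the underlying idea, and not the method from the linked resources, MinHash estimates the Jaccard similarity of two sets by comparing their per-hash minimum values. The helper names, the use of salted `hashlib.blake2b` as the hash family, and the example sets are all assumptions made for illustration.

```python
import hashlib


def minhash_signature(items, num_hashes=64):
    """Return one minimum hash value per salted hash function.

    Assumption: `items` is a non-empty iterable of strings.
    """
    items = list(items)
    signature = []
    for seed in range(num_hashes):
        # A different salt per seed gives a family of hash functions.
        salt = seed.to_bytes(8, "little")
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(
                        item.encode("utf-8"), digest_size=8, salt=salt
                    ).digest(),
                    "big",
                )
                for item in items
            )
        )
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where the two signatures agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Hypothetical example sets (for instance, word shingles of documents).
doc_a = {"cat", "dog", "fish", "bird"}
doc_b = {"cat", "dog", "fish", "horse"}
doc_c = {"red", "green", "blue"}

sim_ab = estimate_jaccard(minhash_signature(doc_a), minhash_signature(doc_b))
sim_ac = estimate_jaccard(minhash_signature(doc_a), minhash_signature(doc_c))
print(sim_ab, sim_ac)
```

The estimate gets tighter as `num_hashes` grows. Items whose estimated similarity exceeds a chosen threshold can then be grouped into the same cluster.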
3 votes
2 answers
3k views

I'm looking for an implementation of k-modes in pyspark. I found this and this as implementations. First, I tried implementing k-modes using the first link and faced issues. So I went ahead and tried ...
2 votes
1 answer
112 views

I have N time-varying feature vectors obtained by recording different parameters over time. This results in an N*N similarity matrix which contains the pairwise correlation values between the features. We need ...
2 votes
1 answer
277 views

I am working with a mixed data set, corresponding to TV consumption data, with the aim of reducing the number of features to only those relevant to detect TV consumption patterns (or consumption ...
1 vote
1 answer
85 views

I have a large dataset with mixed (numerical, categorical, textual) data that I need to classify. The clusters are well-defined, but multidimensional (i.e. vector-valued) and have a varying structure ...
2 votes
1 answer
253 views

I have a social science background and I'm doing a text mining project. I'm looking for advice about the choice of the number of topics/clusters when analyzing textual data. In particular, I'm ...
0 votes
1 answer
109 views

I'm new to data analysis and I need to do a data analysis project using clustering methods for a course in R. I have no idea how to start and choose my data set. I'm looking for some resources. Is ...
2 votes
2 answers
101 views

I would like some suggestions on possible avenues that would make sense in the following context. 3 optimal clusters have been identified in a list of 5000 customers using Kmeans Data model ...
0 votes
1 answer
94 views

I am trying to think through my process before doing any real coding. However, I got confused quickly. Say I have 100 instruments and I know their price movements every day for a year. So I can ...
2 votes
1 answer
105 views

How can I perform conceptual clustering in sklearn? My use case is that I have English Wikipedia articles that I'm doing unsupervised learning on (tfidf -> truncated svd -> l2 normalize), and I'd like ...
2 votes
1 answer
728 views

I have a question regarding the grouping of similar words. For example, I have the list of words given below: artificialintelligence Artificial Intelligence AI Machine Learning ML Data Analytics Data & ...
1 vote
1 answer
42 views

Given data that contain n vectors (m×1 each), I want to cluster the data based on distance. Also, each vector in the data is labeled with some category. I used the kmeans algorithm (in Matlab) to cluster the ...
