Should K-modes or K-means be used?

Question

I have survey data and would like to group the respondents according to how they answered the questions. The possible answers are: strongly agree, agree, neutral, not satisfied, not at all satisfied.

Since this is categorical data, I have thought of encoding the answers numerically as follows:

strongly agree = 0.5
agree = 0.4
neutral = 0.3
not satisfied = 0.2
not at all satisfied = 0.1

Below is an example of the transformed dataset:

q1,q2,q3
0.1,0.2,0.3
0.4,0.3,0.2
0.1,0.1,0.1

Would it be possible now to use K-means and obtain consistent results or should I use some other method such as K-modes?

Anne · Accepted Answer · 2022-07-06 08:36:21Z


That depends on whether you think the clusters are best represented by the average (mean) of their members or by the most frequent value of their members.
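To make the difference concrete, here is a minimal sketch that computes both kinds of representative for a single group. It treats the three example rows from the question as one hypothetical cluster, which is an assumption made purely for illustration; `column_mode` is a helper name introduced here, not part of any library.

```python
import numpy as np
from collections import Counter

# Rows from the question's transformed dataset, treated here as one
# hypothetical cluster purely for illustration.
cluster = np.array([
    [0.1, 0.2, 0.3],
    [0.4, 0.3, 0.2],
    [0.1, 0.1, 0.1],
])

# K-means style representative: the per-question mean.
mean_center = cluster.mean(axis=0)

def column_mode(values):
    """Return the most frequent value; ties go to the first value seen."""
    return Counter(values.tolist()).most_common(1)[0][0]

# K-modes style representative: the per-question most frequent value.
mode_center = np.array(
    [column_mode(cluster[:, j]) for j in range(cluster.shape[1])]
)

print("mean center:", mean_center)
print("mode center:", mode_center)
```

The mode is always one of the answers that respondents actually gave, while the mean can land between two answer levels, so which one is the better summary depends on whether an "in-between" answer is meaningful for your survey.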

Unless you're using a custom distance function in your clustering setup, think of the problem in terms of spatial ordering in a scenario with only two dimensions. Since the default distance function is Euclidean, the behavior does not fundamentally change in higher-dimensional space.
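As a rough illustration of what the distance function does with this encoding, the sketch below compares Euclidean distance with a simple count of mismatching answers, the kind of dissimilarity commonly associated with K-modes. The rows come from the question's example; the function names are hypothetical helpers, not a library API.

```python
import numpy as np

# The three respondents from the question's example dataset.
a = np.array([0.1, 0.2, 0.3])
b = np.array([0.4, 0.3, 0.2])
c = np.array([0.1, 0.1, 0.1])

def euclidean(x, y):
    """Euclidean distance: sensitive to how far apart the encoded answers are."""
    return float(np.sqrt(np.sum((x - y) ** 2)))

def mismatches(x, y):
    """Simple matching dissimilarity: counts questions answered differently."""
    return int(np.sum(x != y))

for name, (x, y) in {"a-b": (a, b), "a-c": (a, c), "b-c": (b, c)}.items():
    print(name, "euclidean:", round(euclidean(x, y), 4),
          "mismatches:", mismatches(x, y))
```

Euclidean distance uses the ordering and spacing implied by the numeric encoding, so two pairs with the same number of differing answers can still end up at different distances. The mismatch count ignores how far apart the answers are and only asks whether they differ.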

