$\begingroup$

I have survey data and would like to group the respondents according to how they answered the questions. The possible answers are: strongly agree, agree, neutral, not satisfied, not at all satisfied.

Since this is categorical data, I thought of applying the following numeric encoding:

strongly agree = 0.5
agree = 0.4
neutral = 0.3
not satisfied = 0.2
not at all satisfied = 0.1

Below is an example of the transformed dataset:

q1,q2,q3
0.1,0.2,0.3
0.4,0.3,0.2
0.1,0.1,0.1
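
For concreteness, the encoding described above could be sketched as follows. The mapping values come from the list above; the function name `encode_row` and the sample raw answers are hypothetical, chosen only so that the result matches the first row of the example dataset.

```python
# Mapping taken from the question; everything else here is illustrative.
ENCODING = {
    "strongly agree": 0.5,
    "agree": 0.4,
    "neutral": 0.3,
    "not satisfied": 0.2,
    "not at all satisfied": 0.1,
}


def encode_row(answers):
    """Convert one respondent's text answers into their numeric codes."""
    return [ENCODING[answer] for answer in answers]


# Hypothetical raw answers for q1, q2, q3 of a single respondent.
raw = ["not at all satisfied", "not satisfied", "neutral"]
encoded = encode_row(raw)
print(encoded)
```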

Would it now be possible to use K-means and obtain consistent results, or should I use some other method such as K-modes?

$\endgroup$

1 Answer

$\begingroup$

That depends on whether you think the clusters are best represented by the average (mean) of their members, as in K-means, or by the most frequent value (mode) of their members, as in K-modes.
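
To make the difference tangible, here is a minimal sketch that computes both kinds of cluster representative for a small, made-up cluster. The three rows and the helper names `mean_centroid` and `mode_centroid` are assumptions for illustration, not part of any library.

```python
from collections import Counter

# Hypothetical cluster of three respondents (columns q1, q2, q3),
# using the 0.1 to 0.5 encoding from the question.
cluster = [
    [0.1, 0.5, 0.3],
    [0.1, 0.5, 0.3],
    [0.5, 0.1, 0.3],
]


def mean_centroid(rows):
    """Per-question average, the kind of representative K-means uses."""
    return [sum(col) / len(col) for col in zip(*rows)]


def mode_centroid(rows):
    """Per-question most frequent value, the kind K-modes uses."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*rows)]


mean_c = mean_centroid(cluster)
mode_c = mode_centroid(cluster)
print("mean:", [round(v, 3) for v in mean_c])
print("mode:", mode_c)
```

In this made-up cluster the mean for q1 falls between two answer codes, so it corresponds to no answer anyone could give, while the mode is always one of the original answers.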

Unless you're using a custom distance function in your clustering setup, think of the problem in terms of spatial ordering in a scenario with only two dimensions. Since the default distance function is Euclidean, the behavior does not fundamentally change in a multi-dimensional space.
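
As a rough illustration, the same Euclidean distance can be applied to two questions or to three without any change in the formula. The helper `euclidean` is a hypothetical name; the rows are the first two rows of the example dataset in the question.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long lists of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two dimensions: only q1 and q2 of the first two example rows.
d2 = euclidean([0.1, 0.2], [0.4, 0.3])

# Three dimensions: the same rows with q3 included.
d3 = euclidean([0.1, 0.2, 0.3], [0.4, 0.3, 0.2])

# With this encoding, neighbouring answers are treated as equally far apart.
agree_to_neutral = euclidean([0.4], [0.3])
neutral_to_not_satisfied = euclidean([0.3], [0.2])

print(round(d2, 4), round(d3, 4))
print(math.isclose(agree_to_neutral, neutral_to_not_satisfied))
```

The last comparison highlights an assumption built into the numeric encoding: the step from "agree" to "neutral" counts exactly as much as the step from "neutral" to "not satisfied". Whether that is reasonable for the survey is a judgment call.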

$\endgroup$
