python pandas sorting by group

Question

Each row in my DataFrame is a user vote entry for a restaurant. The data look like

id   cuisine    
91   american   
3    american   
91   american   
233  cuban      
233  cuban      
2    cuban

where id refers to the restaurant.

I want to get something like the following

american  91   100
          3    30
          12   10
cuban     233  80
          2    33
mexican   22   99
          8    98
          21   82

where the 2nd column is the id, and the 3rd column is the number of rows in the DataFrame for that id. In other words, sort by the number of rows, but group by cuisine. I tried

g = df.groupby(['cuisine', 'id'])
c = g.size().sort_values(ascending=False)

But the order of the cuisines is mixed.

take a look at pivot tables : pandas.pydata.org/pandas-docs/stable/generated/… — dot.Py, Commented Jul 19, 2016 at 16:48

piRSquared · Accepted Answer · 2016-07-19 17:27:51Z

2

use value_counts after group_by followed by sort_index

# ascending=[1, 0] says True for level[0], False for level[1]
df.groupby('cuisine').id.value_counts().sort_index(ascending=[1, 0])

cuisine   id 
american  91     2
          3      1
cuban     233    2
          2      1
Name: id, dtype: int64

answered Jul 19, 2016 at 17:27

piRSquared

295k66 gold badges506 silver badges649 bronze badges

The value_counts() is sorted in descending order by default. Even the final sort can be saved!
– nos
Commented Jul 19, 2016 at 17:37

Add a comment |

MaxU - stand with Ukraine · Accepted Answer · 2016-07-19 17:06:26Z

2

is that what you want?

In [2]: df
Out[2]:
    id   cuisine
0   91  american
1    3  american
2   91  american
3  233     cuban
4  233     cuban
5    2     cuban

In [3]: df.groupby(['cuisine', 'id']).size()
Out[3]:
cuisine   id
american  3      1
          91     2
cuban     2      1
          233    2
dtype: int64

or as a data frame:

In [10]: df.groupby(['cuisine', 'id']).size().reset_index(name='count').sort_values(['cuisine', 'count'], ascending=[1,0])
Out[10]:
    cuisine   id  count
1  american   91      2
0  american    3      1
3     cuban  233      2
2     cuban    2      1

edited Jul 19, 2016 at 17:06

answered Jul 19, 2016 at 16:56

MaxU - stand with Ukraine

211k37 gold badges402 silver badges433 bronze badges

Sorry my example was too simple. On my real data, if I do what you suggested, then the count within the cuisine is not sorted.
– nos
Commented Jul 19, 2016 at 17:03
@nos, i've updated my answer - please check. Please provide a reproducible sample and desired data sets in future
– MaxU - stand with Ukraine
Commented Jul 19, 2016 at 17:06
@nos, please check piRSquared's solution - it's much better
– MaxU - stand with Ukraine
Commented Jul 19, 2016 at 17:33

Add a comment |

Collectives™ on Stack Overflow

python pandas sorting by group

2 Answers 2

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Linked

Related