MySQL group-by very slow

Question

I have the following SQL query:

SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID

The query runs over 11,400,000 rows and is very slow: it takes over 3 minutes to execute. If I remove the GROUP BY part, it runs in under 1 second. Why is that?

MySQL Server version is '5.0.21-community-nt'

Here is the table schema:
CREATE TABLE `sales` (
  `ID` int(11) NOT NULL auto_increment,
  `DocNo` int(11) default '0',
  `CustomerID` int(11) default '0',
  `OperatorID` int(11) default '0',
  PRIMARY KEY  (`ID`),
  KEY `ID` (`ID`),
  KEY `DocNo` (`DocNo`),
  KEY `CustomerID` (`CustomerID`),
  KEY `Date` (`Date`)
) ENGINE=MyISAM AUTO_INCREMENT=14946509 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Not sure if you posted the actual query or not. But in this query what would be the need to GROUP BY if there are no grouping functions?
– Aziz Shaikh, Commented Apr 23, 2012 at 10:38
In this case, use DISTINCT in your query and remove GROUP BY. Something like SELECT DISTINCT CustomerID ...
– Aziz Shaikh, Commented Apr 23, 2012 at 10:55

dplante · Accepted Answer · 2013-03-07 03:57:58Z

30

Try putting an index on (Date,CustomerID).

Have a look at the MySQL manual section on optimizing GROUP BY queries: Group by optimization

You can find out how MySQL generates the result by using EXPLAIN as follows:

EXPLAIN SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID

This will tell you which indexes (if any) MySQL is using to optimize the query. This is very handy when learning which indexes work for which queries, as you can try creating an index and see whether MySQL uses it. So even if you don't fully understand how MySQL calculates aggregate queries, you can create a useful index by trial and error.
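As a rough sketch of that trial-and-error loop, assuming the `sales` table from the question and that the `Date` column exists as its `KEY` definition implies: add the composite index, then re-run EXPLAIN and compare. The index name `idx_date_customer` is made up for illustration.

```sql
-- Add a composite index on the filter column and the grouped column.
-- The name idx_date_customer is hypothetical; any unused name works.
-- On a MyISAM table of this size the ALTER may take a while to finish.
ALTER TABLE sales ADD INDEX idx_date_customer (`Date`, CustomerID);

-- Re-check the plan. If MySQL picks the new index, its name shows up in the
-- `key` column. "Using index" in the Extra column means the query is
-- answered from the index alone, without reading the table rows.
EXPLAIN SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID;
```

If the `key` column still shows a different index, the optimizer did not consider the new one worthwhile for this query, and it can be dropped again with ALTER TABLE sales DROP INDEX idx_date_customer.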

edited Mar 7, 2013 at 3:57

dplante


answered Apr 23, 2012 at 10:37

rgvcorley



3 Comments

Arthur Goldsmith Over a year ago

As someone who's just beginning to get the hang of optimizing queries and tables, this little nugget was invaluable. Thank you.

rgvcorley Over a year ago

@ArthurGoldsmith No worries :)

pathfinder Over a year ago

@rgvcorley - I seriously owe you lunch. Why I didn't know about this indexing thing I don't know. But dayam that is fast now :)

Daan · Accepted Answer · 2012-04-23 10:38:19Z

5

Without knowing what your table schema looks like, it's difficult to be certain, but it would probably help if you added a multiple-column index on Date and CustomerID. That'd save MySQL the hassle of doing a full table scan for the GROUP BY statement. So try ALTER TABLE sales ADD INDEX (Date,CustomerID).

answered Apr 23, 2012 at 10:38

Daan



IT ppl · Accepted Answer · 2012-04-23 10:38:41Z

2

Try this one:

SELECT DISTINCT CustomerID FROM sales WHERE `Date` <= '2012-01-01'

answered Apr 23, 2012 at 10:38

IT ppl


2 Comments

cproinger Over a year ago

In MySQL, DISTINCT is just a special case of GROUP BY: dev.mysql.com/doc/refman/5.1/de/distinct-optimization.html

Lorien Brune Over a year ago

SELECT DISTINCT is much faster for me than SELECT...GROUP BY. On a 15 million row table with appropriate indexes and sorting the results ASC, SELECT...GROUP BY takes about 3.5 seconds, while SELECT DISTINCT takes about 0.1 seconds.
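One possible contributor to a gap like this, offered as a guess rather than a diagnosis: in MySQL versions before 8.0, GROUP BY also sorts its result by the grouped columns unless told otherwise, while DISTINCT carries no such ordering requirement. A sketch for comparing the variants on the question's table:

```sql
-- Compare the two plans; look for "Using temporary" and "Using filesort"
-- in the Extra column of each.
EXPLAIN SELECT DISTINCT CustomerID FROM sales WHERE `Date` <= '2012-01-01';
EXPLAIN SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID;

-- Before MySQL 8.0, ORDER BY NULL suppresses the implicit sort that
-- GROUP BY would otherwise perform on the grouped column.
SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID ORDER BY NULL;
```

Whether this closes the gap depends on the data and the indexes available, so it is worth timing each variant rather than assuming.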

Miguel Angel Cañedo · Accepted Answer · 2017-01-14 02:04:31Z

2

I had the same problem. I changed the key fields to use the same collation, and that fixed the problem. The fields used to join the tables had different COLLATE values.
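A sketch of how such a mismatch might be checked and fixed. The table and column names (`orders`, `customers`, `customer_code`) and the column type are hypothetical, since this answer does not show its schema. Note that collation applies only to character columns, so it would not affect an integer key like `CustomerID` in the question.

```sql
-- Inspect the Collation column for the join keys on both tables.
SHOW FULL COLUMNS FROM orders;
SHOW FULL COLUMNS FROM customers;

-- Align one side with the other. MODIFY replaces the whole column
-- definition, so restate the real type and NULL/default attributes here;
-- VARCHAR(32) NOT NULL is only a placeholder.
ALTER TABLE orders
  MODIFY customer_code VARCHAR(32) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;
```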

answered Jan 14, 2017 at 2:04

Miguel Angel Cañedo



Tom van der Woerdt · Accepted Answer · 2012-04-23 10:38:07Z

0

Wouldn't this one be a lot faster and achieve the same?

SELECT DISTINCT CustomerID FROM sales WHERE `Date` <= '2012-01-01'

Make sure to place an index on Date, of course. I'm not entirely sure but indexing CustomerID might also help.

answered Apr 23, 2012 at 10:38

Tom van der Woerdt

