20

I have the following SQL query:

SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID

The query is executed over 11,400,000 rows and runs very slowly: it takes over 3 minutes to execute. If I remove the GROUP BY clause, it runs in under 1 second. Why is that?

MySQL Server version is '5.0.21-community-nt'

Here is the table schema:
CREATE TABLE `sales` (
  `ID` int(11) NOT NULL auto_increment,
  `DocNo` int(11) default '0',
  `CustomerID` int(11) default '0',
  `OperatorID` int(11) default '0',
  PRIMARY KEY  (`ID`),
  KEY `ID` (`ID`),
  KEY `DocNo` (`DocNo`),
  KEY `CustomerID` (`CustomerID`),
  KEY `Date` (`Date`)
) ENGINE=MyISAM AUTO_INCREMENT=14946509 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
4 Comments
  • Can you post the table schema (the CREATE script for the table)? Commented Apr 23, 2012 at 10:34
  • Not sure if you posted the actual query or not. But in this query, what is the need for GROUP BY if there are no aggregate functions? Commented Apr 23, 2012 at 10:38
  • Aziz, I need to return the unique values of CustomerID. Commented Apr 23, 2012 at 10:44
  • In this case, use DISTINCT in your query and remove GROUP BY. Something like SELECT DISTINCT CustomerID ... Commented Apr 23, 2012 at 10:55

5 Answers

30

Try putting an index on (Date,CustomerID).

Have a look at the MySQL manual's section on optimizing GROUP BY queries: "GROUP BY Optimization".

You can find out how MySQL generates the result by using EXPLAIN as follows:

EXPLAIN SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID

This will tell you which indexes (if any) MySQL is using to optimize the query. It is very handy for learning which indexes work for which queries, because you can create an index and then check whether MySQL actually uses it. So even if you don't fully understand how MySQL evaluates aggregate queries, you can arrive at a useful index by trial and error.
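
A minimal sketch of that trial-and-error loop, assuming the sales table from the question. The index name idx_date_customer is made up for illustration, and building an index on a table of this size may itself take a while:

```sql
-- Composite index covering both the WHERE column and the grouped column.
-- The name idx_date_customer is hypothetical; any unused name works.
ALTER TABLE sales ADD INDEX idx_date_customer (`Date`, CustomerID);

-- Re-run EXPLAIN and compare the `key` and `Extra` columns with the earlier plan.
EXPLAIN SELECT CustomerID FROM sales WHERE `Date` <= '2012-01-01' GROUP BY CustomerID;

-- If the new index turns out not to help, drop it before trying another one.
ALTER TABLE sales DROP INDEX idx_date_customer;
```

In the EXPLAIN output, the key column names the index MySQL chose. "Using index" in Extra means the rows are read from the index alone, while "Using temporary; Using filesort" means MySQL is still building and sorting a temporary table to do the grouping.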

3 Comments

As someone who's just beginning to get the hang of optimizing queries and tables, this little nugget was invaluable. Thank you.
@ArthurGoldsmith No worries :)
@rgvcorley - I seriously owe you lunch. Why I didn't know about this indexing thing I don't know. But dayam that is fast now :)
5

Without knowing what your table schema looks like, it's difficult to be certain, but it would probably help if you added a multiple-column index on Date and CustomerID. That'd save MySQL the hassle of doing a full table scan for the GROUP BY statement. So try ALTER TABLE sales ADD INDEX (Date,CustomerID).

2

Try this one:

SELECT distinct CustomerID FROM sales WHERE `Date` <= '2012-01-01'

2 Comments

In MySQL, DISTINCT is just a special case of GROUP BY: dev.mysql.com/doc/refman/5.1/de/distinct-optimization.html
SELECT DISTINCT is much faster for me than SELECT...GROUP BY. On a 15 million row table with appropriate indexes and sorting the results ASC, SELECT...GROUP BY takes about 3.5 seconds, while SELECT DISTINCT takes about 0.1 seconds.
2

I had the same problem. I changed the key fields to use the same collation, and that fixed it: the fields used to join the tables had different COLLATE values.
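
A rough sketch of how one might check for and fix such a mismatch. The table and column names (orders, customers, customer_code) and the VARCHAR(32) type are hypothetical, since this answer does not name them:

```sql
-- Compare the Collation column for the join columns on both tables.
SHOW FULL COLUMNS FROM orders;
SHOW FULL COLUMNS FROM customers;

-- Redefine one side so both columns use the same character set and collation.
-- MODIFY takes the full column definition, so repeat the existing type and NULL-ability.
ALTER TABLE orders
  MODIFY customer_code VARCHAR(32) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL;
```

Collation only applies to character columns, so this would not change anything for the integer CustomerID column in the question's schema.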

0

Wouldn't this one be a lot faster and achieve the same?

SELECT DISTINCT CustomerID FROM sales WHERE `Date` <= '2012-01-01'

Make sure to place an index on Date, of course. I'm not entirely sure, but indexing CustomerID might also help.
