
I'm using PostgreSQL with big tables, and a query takes too much time.

I have two tables: the data table, with about 6 million rows, and the users table, with about 30,000 rows.

Each user has about 200 rows in the data table. Later, both tables may grow by up to 30 times.

My query is:

SELECT d.name, count(*) c
  FROM users AS u JOIN data AS d on d.id = u.id  
  WHERE u.language = 'eng' GROUP BY d.name ORDER BY c DESC LIMIT 10;

90% of users have language 'eng', and the query takes 7 seconds. Every column involved is indexed!

I read that a Merge Join should be really fast, so I sorted the tables by id and forced a Merge Join, but the time increased to 20 seconds.
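Roughly, the merge join was forced with session-level planner settings along these lines (a sketch, not necessarily the exact commands used):

SET enable_hashjoin = off;
SET enable_nestloop = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT d.name, count(*) c
  FROM users AS u JOIN data AS d ON d.id = u.id
 WHERE u.language = 'eng'
 GROUP BY d.name
 ORDER BY c DESC
 LIMIT 10;

-- restore the defaults afterwards
RESET enable_hashjoin;
RESET enable_nestloop;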

I suppose the table configuration is wrong, but I don't know how to fix it.

Should I make other improvements?


1 Answer


For this query:

SELECT d.name, count(*) c
FROM users u JOIN
     data d
     on d.id = u.id  
WHERE u.language = 'eng'
GROUP BY d.name
ORDER BY c DESC
LIMIT 10;

First, try these indexes: users(language, id) and data(id, name). See if they speed up the query.
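Concretely, the suggested index definitions would look something like this (the index names are just placeholders):

-- covers the WHERE filter and the join key on users
CREATE INDEX idx_users_language_id ON users (language, id);

-- covers the join key and the grouped column on data
CREATE INDEX idx_data_id_name ON data (id, name);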

Second, what is d.name? Can a user have more than one of them? Is there a table of valid values? Depending on the answers to these questions, there may be other ways to structure the query.
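For example, one possible restructuring (just a sketch, and it assumes users.id is unique so the join cannot multiply rows) is to express the user filter as a semi-join:

SELECT d.name, count(*) c
FROM data d
WHERE EXISTS (SELECT 1
              FROM users u
              WHERE u.id = d.id AND u.language = 'eng')
GROUP BY d.name
ORDER BY c DESC
LIMIT 10;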

  • I tried these indexes, and the time didn't change... d.name is a string, and each user has about 200 rows in the data table; that means each user has about 200 unique name values. I just want to know how popular a given name is among English, German, and other users. Commented Jun 11, 2015 at 15:41
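A per-language popularity count along the lines described in the comment might look like this (a sketch; the name value is illustrative):

SELECT u.language, count(*) c
FROM users u JOIN data d ON d.id = u.id
WHERE d.name = 'some_name'
GROUP BY u.language
ORDER BY c DESC;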
