Database in use: MySQL 5.6 (I can't use the LAG function, which is only available from MySQL 8)

I have the following table structure in MySQL:

book_id     | Version      | Rating | Price 
varchar(25) | Decimal(10,2)| int    | Decimal(10,2)

I have a web page where I show two charts.

In the first chart, I show the count of books per rating (ratings range only from 1 to 4), counting only the latest version of each book. This is Query1.

In the second chart, I show the count of books per price range, again counting only the latest version of each book. This is Query2.

These two queries are run one after the other every time the web page is loaded or refreshed. Although the data remains constant, I sometimes get different results for the same query.

I have the following two queries, which are almost identical:

QUERY1

SELECT  
       SUM(CASE WHEN rating=1 THEN 1 ELSE 0 END) AS rating1,
       SUM(CASE WHEN rating=2 THEN 1 ELSE 0 END) AS rating2,
       SUM(CASE WHEN rating=3 THEN 1 ELSE 0 END) AS rating3,
       SUM(CASE WHEN rating=4 THEN 1 ELSE 0 END) AS rating4
FROM (
       SELECT rating, row_number
       FROM (
              SELECT rating, 
                     @num:=IF(@group:=book_id, @num+1, 1) row_number,
                     @group:=book_id bi
              FROM book_database
              ORDER BY book_id, version DESC
             ) book
      HAVING book.row_number = 1
    ) book

QUERY2

SELECT  
       SUM(CASE WHEN price <= 1000 THEN 1 ELSE 0 END) AS cheap,
       SUM(CASE WHEN price > 1000 THEN 1 ELSE 0 END) AS costly
FROM (
       SELECT price, row_number
       FROM (
              SELECT price, 
                     @num:=IF(@group:=book_id, @num+1, 1) row_number,
                     @group:=book_id bi
              FROM book_database
              ORDER BY book_id, version DESC
             ) book
      HAVING book.row_number = 1
    ) book

My web page has multiple screens with multiple queries, but most of them work on the same logic. Basically, I always query the latest version of each book, hence the nested query.

On some occasions I get different results than intended when the same queries are run multiple times on the same dataset.

Is my query correct? Is the use of variables causing this issue? Since multiple queries are run in parallel (although on different database connections), is the use of variables the suspect?

  • That is the beauty of using an ORDER BY clause in a sub query. MySQL might choose to ignore it when it feels like it! Commented Sep 28, 2018 at 8:18
  • Can you elaborate? Commented Sep 28, 2018 at 8:19
  • See mysqlserverteam.com/… Commented Sep 28, 2018 at 8:32
  • I read through the article... it doesn't apply in this case Commented Sep 28, 2018 at 8:39
  • What else would explain getting different results? The point is that a nested order by could be ignored. I suppose you can replace HAVING with WHERE... this should produce the same result and eliminate unnecessary GROUP BY and might produce the correct result. Commented Sep 28, 2018 at 8:42

1 Answer

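Your use of variables is not correct. MySQL does not guarantee the order of evaluation of expressions in a SELECT clause, so a variable should be read and assigned within a single expression rather than across several columns. In addition, the condition in your IF() uses := (assignment) rather than = (comparison), so it never actually tests whether the book has changed.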

For instance, the first query should look something more like this:

SELECT SUM( rating = 1 ) AS rating1,
       SUM( rating = 2 ) AS rating2,
       SUM( rating = 3 ) AS rating3,
       SUM( rating = 4 ) AS rating4
FROM (SELECT rating, book_id,
             (@rn := if(@b = book_id, @rn + 1,
                        if(@b := book_id, 1, 1)
                       )
             ) as rn
      FROM (SELECT rating, book_id, version
            FROM book_database
            ORDER BY book_id, version DESC
           ) book CROSS JOIN
           (SELECT @rn := 0, @b := -1) params
    ) book
WHERE rn = 1;

The important part is the part that involves the variables. I also simplified some of the other logic.

But you don't need variables for this:

SELECT SUM( rating = 1 ) AS rating1,
       SUM( rating = 2 ) AS rating2,
       SUM( rating = 3 ) AS rating3,
       SUM( rating = 4 ) AS rating4
FROM book_database b
WHERE b.version = (SELECT MAX(b2.version)
                   FROM book_database b2
                   WHERE b2.book_id = b.book_id
                  );
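The same pattern should carry over to the second query from the question. This is only a sketch built from the column names and the 1000 price threshold given there. It assumes each (book_id, version) pair occurs at most once. If a book's latest version were duplicated, each duplicate row would be counted.

```sql
-- Sketch of Query2 without variables: count latest-version books by price range.
-- Assumes (book_id, version) is unique in book_database.
SELECT SUM( price <= 1000 ) AS cheap,   -- a boolean evaluates to 1 or 0 in MySQL
       SUM( price > 1000 )  AS costly
FROM book_database b
WHERE b.version = (SELECT MAX(b2.version)
                   FROM book_database b2
                   WHERE b2.book_id = b.book_id
                  );
```

Because no user variables are involved, the result does not depend on the order in which MySQL evaluates the select list or reads the rows.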

With an index on book_database(book_id, version), this should be faster than the version using variables.
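For reference, such an index could be created roughly as follows. The index name idx_book_version is just a placeholder, not something from the question.

```sql
-- Hypothetical index name; the column order matches the correlated subquery,
-- which filters on book_id and then takes MAX(version).
CREATE INDEX idx_book_version ON book_database (book_id, version);
```

With book_id as the leading column, MySQL should be able to find the maximum version for each book from the index alone, without scanning the table.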
