Consider the following `products` table (heavily trimmed down):
`id` int AUTO_INCREMENT
`category_id` int
`subcategory_id` int
`vendor_id` int
`price` decimal(6,2)
`inserted_at` timestamp
For a given category ID, I am trying to retrieve, for each subcategory, the vendor with the lowest latest price. By "latest" I mean that a vendor may have multiple prices for a given category ID/subcategory ID combination, so only the most recently inserted price for each category ID/subcategory ID/vendor ID should be used. If two or more vendors' prices tie, the lowest vendor ID should be used as the tie-breaker.
For example, with this data:
id | category_id | subcategory_id | vendor_id | price | inserted_at
---------------------------------------------------------------------------
1 | 1 | 2 | 3 | 16.00 | 2015-07-23 04:00:00
2 | 1 | 1 | 2 | 9.00 | 2015-07-26 08:00:00
3 | 1 | 2 | 4 | 16.00 | 2015-08-02 10:00:00
4 | 1 | 1 | 1 | 7.00 | 2015-08-04 11:00:00
5 | 1 | 1 | 1 | 11.00 | 2015-08-09 16:00:00
So, first find the most recent price for every subcategory/vendor combination (the row with price = 7.00 is dropped because it is not the most recent one for that vendor in that subcategory). Then, for subcategory 1, the lowest price is 9.00 (so vendor_id = 2), and for subcategory 2 the lowest price is 16.00 (vendor IDs 3 and 4 tie, so we choose the lower vendor_id, 3).
I would expect the following results for category_id = 1:
subcategory_id | vendor_id | price
----------------------------------
1 | 2 | 9.00
2 | 3 | 16.00
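To pin down the logic I'm after, here is a rough procedural sketch of the two steps in Python rather than SQL. It is only a reference for the expected result, not a proposed solution; the names `ROWS` and `lowest_latest_prices` are made up for this illustration, and it assumes ties are broken on the lowest vendor_id.

```python
from decimal import Decimal

# Sample rows from the table above (hypothetical in-memory copy):
# (id, category_id, subcategory_id, vendor_id, price, inserted_at)
ROWS = [
    (1, 1, 2, 3, Decimal("16.00"), "2015-07-23 04:00:00"),
    (2, 1, 1, 2, Decimal("9.00"), "2015-07-26 08:00:00"),
    (3, 1, 2, 4, Decimal("16.00"), "2015-08-02 10:00:00"),
    (4, 1, 1, 1, Decimal("7.00"), "2015-08-04 11:00:00"),
    (5, 1, 1, 1, Decimal("11.00"), "2015-08-09 16:00:00"),
]


def lowest_latest_prices(rows, category_id):
    """Return (subcategory_id, vendor_id, price) tuples, one per subcategory."""
    # Step 1: keep only the most recent price per (subcategory, vendor).
    latest = {}
    for _id, cat, sub, vendor, price, inserted_at in rows:
        if cat != category_id:
            continue
        key = (sub, vendor)
        # 'YYYY-MM-DD HH:MM:SS' strings compare chronologically as text.
        if key not in latest or inserted_at > latest[key][1]:
            latest[key] = (price, inserted_at)

    # Step 2: per subcategory, take the lowest of those latest prices.
    # Comparing (price, vendor) tuples breaks ties on the lowest vendor_id.
    best = {}
    for (sub, vendor), (price, _inserted_at) in latest.items():
        candidate = (price, vendor)
        if sub not in best or candidate < best[sub]:
            best[sub] = candidate

    return [(sub, vendor, price) for sub, (price, vendor) in sorted(best.items())]


for result in lowest_latest_prices(ROWS, 1):
    print(result)
```

For category_id = 1 this produces the two rows listed in my expected results: vendor 2 at 9.00 for subcategory 1 and vendor 3 at 16.00 for subcategory 2.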
Here's what I have so far. I feel like it's already starting to get out of hand, and it doesn't even account for ties between two or more vendors' prices.
SELECT c.subcategory_id, c.vendor_id, c.price
FROM products AS c
JOIN
(
    SELECT MIN(a.price) AS min_price,
           a.subcategory_id
    FROM products AS a
    JOIN
    (
        SELECT MAX(`inserted_at`) AS latest_price_time,
               vendor_id,
               subcategory_id
        FROM products
        WHERE category_id = 1
        GROUP BY vendor_id, subcategory_id
    ) AS b
    ON a.inserted_at = b.latest_price_time AND a.vendor_id = b.vendor_id AND a.subcategory_id = b.subcategory_id
    WHERE a.category_id = 1
    GROUP BY a.subcategory_id
) AS d
ON c.price = d.min_price AND c.subcategory_id = d.subcategory_id
WHERE c.category_id = 1
Before I go any further, I wanted to see if there was an easier way. When it comes to grouping/aggregating results of additional groupings/aggregations, is there a method that will give me the best performance (most important) and/or be easier to read (less important)?