I'm fairly sure that GROUP BY and DISTINCT have roughly the same execution plan.
We don't have the explain plans, so this is a guess, but IMO the difference is that the inline subquery gets executed AFTER the GROUP BY but BEFORE the DISTINCT.
So if your query returns 1M rows and gets aggregated to 1k rows:
- The GROUP BY query would have run the subquery 1000 times,
- Whereas the DISTINCT query would have run the subquery 1000000 times.
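
If that guess is right, one way to keep DISTINCT and still run the subquery only on the reduced row set would be to de-duplicate in an inline view first and attach the scalar subquery outside it. This is only a sketch: the join shape inside the view is an assumption (the original query is not reproduced here), and the optimizer is free to rewrite it, so check the plan before relying on it.

```sql
-- Sketch only: the joins inside the inline view are an assumed shape
-- of the original query, rebuilt from the table names used in this answer.
SELECT d.ITEM_ID,
       d.ITEM_CODE,
       d.ITEMTYPE,
       d.STATUS,
       -- The scalar subquery sits outside the de-duplicating view,
       -- so it should be evaluated once per distinct row.
       (SELECT COUNT(PKID)
          FROM ITEM_PARENTS
         WHERE PARENT_ITEM_ID = d.ITEM_ID) AS CHILD_COUNT
  FROM (SELECT DISTINCT
               ITEMS.ITEM_ID,
               ITEMS.ITEM_CODE,
               ITEMS.ITEMTYPE,
               ITEM_TRANSACTIONS.STATUS
          FROM ITEMS
          JOIN ITEM_TRANSACTIONS
            ON ITEMS.ITEM_ID = ITEM_TRANSACTIONS.ITEM_ID
           AND ITEM_TRANSACTIONS.FLAG = 1
          JOIN JOB_INVENTORY
            ON ITEMS.ITEM_ID = JOB_INVENTORY.ITEM_ID
          JOIN TASK_INVENTORY_STEP
            ON JOB_INVENTORY.JOB_ITEM_ID = TASK_INVENTORY_STEP.JOB_ITEM_ID
         WHERE TASK_INVENTORY_STEP.STEP_TYPE = 'TYPE A') d
```

Because CHILD_COUNT depends only on ITEM_ID, this should return the same rows as applying DISTINCT over the whole select list.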
The tkprof explain plan would help demonstrate this hypothesis.
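
As a rough sketch of how to collect that evidence (the query shown is a stand-in, so substitute each of your two variants): the "Starts" column reported by DBMS_XPLAN.DISPLAY_CURSOR shows how many times each plan step was executed, which is exactly the number the hypothesis is about. This assumes you have the privileges needed to read the cursor statistics and to enable tracing.

```sql
-- 1) Run the query with runtime statistics collection enabled.
--    The statement below is a placeholder; use your GROUP BY and DISTINCT variants.
SELECT /*+ GATHER_PLAN_STATISTICS */
       ITEMS.ITEM_ID,
       (SELECT COUNT(PKID)
          FROM ITEM_PARENTS
         WHERE PARENT_ITEM_ID = ITEMS.ITEM_ID) AS CHILD_COUNT
  FROM ITEMS;

-- 2) Show the actual plan of the last statement run in this session.
--    Compare the "Starts" value on the ITEM_PARENTS step between the two variants.
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

-- 3) Alternatively, trace the session and format the trace file with tkprof.
ALTER SESSION SET SQL_TRACE = TRUE;
-- ... run the query here ...
ALTER SESSION SET SQL_TRACE = FALSE;
-- Then, on the database server: tkprof <trace file> <output file>
```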
While we're discussing this, I think it's important to note that the way the query is written is misleading both to the reader and to the optimizer: you obviously want to find all rows from item/item_transactions that have a TASK_INVENTORY_STEP.STEP_TYPE with a value of "TYPE A".
IMO your query would have a better plan and would be more easily readable if written like this:
    SELECT ITEMS.ITEM_ID,
           ITEMS.ITEM_CODE,
           ITEMS.ITEMTYPE,
           ITEM_TRANSACTIONS.STATUS,
           (SELECT COUNT(PKID)
              FROM ITEM_PARENTS
             WHERE PARENT_ITEM_ID = ITEMS.ITEM_ID) AS CHILD_COUNT
      FROM ITEMS
      JOIN ITEM_TRANSACTIONS
        ON ITEMS.ITEM_ID = ITEM_TRANSACTIONS.ITEM_ID
       AND ITEM_TRANSACTIONS.FLAG = 1
     WHERE EXISTS (SELECT NULL
                     FROM JOB_INVENTORY
                     JOIN TASK_INVENTORY_STEP
                       ON JOB_INVENTORY.JOB_ITEM_ID = TASK_INVENTORY_STEP.JOB_ITEM_ID
                    WHERE TASK_INVENTORY_STEP.STEP_TYPE = 'TYPE A'
                      AND ITEMS.ITEM_ID = JOB_INVENTORY.ITEM_ID)
In many cases, a DISTINCT can be a sign that the query is not written properly (because a good query shouldn't return duplicates).
Note also that 4 tables are not used in your original select.