If you don't need to return the recursion depth as an output, then using UNION
instead of UNION ALL
is the way to go on both PostgreSQL and SQLite as mentioned at: https://stackoverflow.com/a/77165059/895245
However if you also want to track the depth of results, that won't do as the depth column makes the values become different from one another.
The solutions from the answer To find infinite recursive loop in CTE cover that as well however.
Test data
Create some demo data:
CREATE TABLE "graph" ("id" integer, "parent_id" integer);
INSERT INTO "graph" VALUES
(1, null), -- root element
(2, 1),
(3, 1),
(4, 3),
(5, 4),
(3, 5); -- endless loop
Graph represented:
1 --+--> 2
|
+--> 3 --> 4 --> 5 --+
^ |
| |
+---------------+
PostgreSQL 14+: CYCLE
keyword
WITH RECURSIVE "cte"("id", "parent_id", "depth") AS (
SELECT "id",
"parent_id",
0
FROM "graph"
WHERE "parent_id" IS NULL
UNION ALL
SELECT "graph"."id",
"graph"."parent_id",
"cte"."depth" + 1
FROM "graph"
INNER JOIN "cte"
ON "graph"."parent_id" = "cte"."id"
)
CYCLE "id" -- track cycles for this column
SET "is_cycle" -- adds a boolean column is_cycle
USING "path" -- adds a column that contains all parents for the id
SELECT *
FROM "cte"
WHERE NOT "is_cycle"
ORDER BY "id" ASC;
Output:
id | parent_id | depth | is_cycle | path
----+-----------+-------+----------+-------------------
1 | | 0 | f | {(1)}
2 | 1 | 1 | f | {(1),(2)}
3 | 1 | 1 | f | {(1),(3)}
4 | 3 | 2 | f | {(1),(3),(4)}
5 | 4 | 3 | f | {(1),(3),(4),(5)}
(5 rows)
So we understand that this method tracks the paths down to each element and uses that information to detect loops and stop before.
Detect loops instead of ignoring them
If we wanted instead to check if a loop exists in the DB we could just change:
WHERE NOT "is_cycle";
to:
WHERE "is_cycle";
which then gives us the loop path:
id | parent_id | depth | is_cycle | path
----+-----------+-------+----------+-----------------------
3 | 5 | 4 | t | {(1),(3),(4),(5),(3)}
See also: To find infinite recursive loop in CTE
PostgreSQL pre-14: explicit array manipulation
The CYCLE
keyword is a shortcut for the following which uses an explicit array to track the path and detect loops:
WITH RECURSIVE "cte"("id", "parent_id", "depth", "is_cycle", "path") AS (
SELECT "id",
"parent_id",
0,
false,
ARRAY["id"] as "path"
FROM "graph"
WHERE "parent_id" IS NULL
UNION ALL
SELECT "graph"."id",
"graph"."parent_id",
"cte"."depth" + 1,
"graph"."id" = any("path"),
"cte"."path" || "graph"."id"
FROM "graph"
INNER JOIN "cte"
ON "graph"."parent_id" = "cte"."id"
AND NOT "is_cycle"
)
SELECT *
FROM "cte"
WHERE NOT "is_cycle"
ORDER BY "id" ASC;
Output:
id | parent_id | depth | is_cycle | path
----+-----------+-------+----------+-----------
1 | | 0 | f | {1}
2 | 1 | 1 | f | {1,2}
3 | 1 | 1 | f | {1,3}
4 | 3 | 2 | f | {1,3,4}
5 | 4 | 3 | f | {1,3,4,5}
(5 rows)
Using UNION
instead of UNION ALL
when depth
is not needed
Just to give an example of that method:
WITH RECURSIVE "cte"("id", "parent_id") AS (
SELECT "id", "parent_id"
FROM "graph"
WHERE "parent_id" IS NULL
UNION
SELECT "graph"."id",
"graph"."parent_id"
FROM "graph"
INNER JOIN "cte"
ON "graph"."parent_id" = "cte"."id"
)
SELECT *
FROM "cte"
ORDER BY "id" ASC;
Output:
id | parent_id
----+-----------
1 |
2 | 1
3 | 1
3 | 5
4 | 3
5 | 4
(6 rows)
That exact same query also works on SQLite which doesn't have neither ARRAY
nor CYCLE
, output:
1|
2|1
3|1
3|5
4|3
5|4
Tested on PostgreSQL 16.4, SQLite 3.45.1, Ubuntu 24.04.