union returns duplicate data in postgresql

Question

query using union returns duplicate values in column

select a.id , b.value  
from tableB b
JOIN  tableA a 
ON b.id_data = a.id_data 
JOIN tableC c 
ON b.id_data = c.id_data
WHERE b.place ='city' AND c.roll  = '20' 
UNION 
select c.id,  COALESCE(c.value, cast(c.number as text),cast(c.date as text)) 
FROM tableC c
where c.roll  = '15'

Expected result is below

id	value
1	data1
2	data2
3	data3
4	data4

But I am getting below result

id	value
1	data1
2	data2
3	data3
2	[null]
4	data4

tried using distinct for id as well still output contains duplicate id. Is there any other function to perform the above logic where i get unique id

select distinct(id) from
(select a.id , b.value  
from tableB b
JOIN  tableA a 
ON b.id_data = a.id_data 
JOIN tableC c 
ON b.id_data = c.id_data
WHERE b.place ='city' AND c.roll  = '20' 
UNION 
select c.id,  COALESCE(c.value, cast(c.number as text),cast(c.date as text)) 
FROM tableC c
where c.roll  = '15') as id

this query returns single column - unique id values but i need 2 columns in result set - id and values

Well, a UNION does an implicit DISTINCT - but that applies to all columns, not just the first one — user330315
– user330315, Commented Jan 11, 2023 at 11:24
So you have rows with same ID but different value. What's the rule to decide which ones aren't needed? — Álvaro González
– Álvaro González, Commented Jan 11, 2023 at 11:38

Laurenz Albe · Accepted Answer · 2023-01-11 11:34:30Z

0

UNION removes duplicate result rows, that is, result rows where all columns are the same.

You can use the PostgreSQL extension DISTINCT ON to get what you want:

SELECT DISTINCT ON (id) id, value
FROM (select a.id , b.value  
      from tableB b
         JOIN tableA a 
            ON b.id_data = a.id_data 
         JOIN tableC c 
            ON b.id_data = c.id_data
      WHERE b.place ='city' AND c.roll  = '20' 
      UNION ALL
      select c.id, COALESCE(c.value, cast(c.number as text), cast(c.date as text)) 
      FROM tableC c
      where c.roll  = '15') AS subq
ORDER BY id, value;

answered Jan 11, 2023 at 11:34

Laurenz Albe

257k22 gold badges312 silver badges388 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Álvaro González Over a year ago

Important: the ORDER BY clause is what decides which rows get stripped from the result. Without it, you'll get arbitrary (not even random) rows.

Collectives™ on Stack Overflow

union returns duplicate data in postgresql

1 Answer 1

1 Comment

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Related