Unique array values for this string_to_array

Question

This is a follow-up to:

Best way to map different JSON keys to same target columns

Based on these sample tables:

data_providers:
id | field_map
--------------
1  | {"segments": "SEGMENT IDS", "full_name": "FULL NAME"}

leads:
id | data_provider_id | email | data
------------------------------------
1  | 201              | hi@hi | {"SEGMENT IDS": "id1,id1,id1,id2,id3", "FULL NAME": "John Doe"}
2  | 201              | xx@xx | {"FULL NAME": "Billy Bob"}

desired output:

data_provider_id | email | full_name | segment
----------------------------------------------
201              | hi@hi | John Doe  | id1
201              | hi@hi | John Doe  | id2
201              | hi@hi | John Doe  | id3
201              | xx@xx | Billy Bob | NULL

I have the following query:

SELECT
  leads.data_provider_id,
  leads.email,
  leads.data->>(p.field_map->>'full_name') AS full_name,
  segment
FROM leads
LEFT OUTER JOIN data_providers p ON p.id = leads.data_provider_id
LEFT JOIN LATERAL unnest(string_to_array(leads.data->>(p.field_map->>'segments'), ',')) AS segment ON true

This query is doing 2 particular things:

1. It joins on the data_providers table to get the field_map column, which contains a JSONB mapping of CSV column headers, something like {"segments": "SEGMENT IDS", "full_name": "FULL NAME"}.
2. Within the data JSONB column of leads, there is a key (which I discover through the field map above) that contains a comma-separated string of segment IDs (it comes in a CSV, and they chose to put multiple values within one row). I want to split it so each segment ID gets its own row (with all other columns staying the same on every row).

I have 2 goals:

1. If there is an empty string or the key doesn't exist within the map, I want to return the row but with NULL for the segment. I already got this working by changing CROSS JOIN to LEFT JOIN.
2. I'm trying to remove duplicate segment IDs, so if someone enters 'id1,id1' it should produce only one row. I need this because there is a unique index on that column in the materialized view.

I'm currently stuck on #2.
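
A minimal sketch of why goal 1 already works with LEFT JOIN while goal 2 does not, using only built-in PostgreSQL functions on literal inputs (no tables assumed). An empty string splits to an empty array and a missing key yields NULL, so unnest() produces no rows in either case and the LEFT JOIN fills in NULL. Duplicates, however, pass straight through unnest():

```sql
-- Empty string: string_to_array() returns an empty array, not NULL.
SELECT string_to_array('', ',')              AS empty_input   -- {}
     , cardinality(string_to_array('', ',')) AS n_elements    -- 0
     , string_to_array(NULL, ',')            AS null_input;   -- NULL (key missing)

-- unnest() of an empty (or NULL) array yields no rows at all,
-- which is why LEFT JOIN ... ON true produces a single NULL segment.
SELECT count(*) AS rows_from_empty
FROM   unnest(string_to_array('', ',')) AS segment;           -- 0

-- Duplicates are not collapsed: this returns two rows, both 'id1'.
SELECT segment
FROM   unnest(string_to_array('id1,id1', ',')) AS segment;
```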

I think you will have a better chance of getting answers if you provide a fiddle at dbfiddle.uk/?rdbms=postgres_14. Most people will skip to the next question as soon as they realize they have to write the CREATE TABLE and INSERT statements themselves.
– Lennart - Slava Ukraini, Commented Mar 14, 2022 at 19:45

Erwin Brandstetter · Accepted Answer · 2022-03-15 18:57:40Z

Make it a subquery and throw in DISTINCT:

SELECT l.data_provider_id
     , l.email
     , l.data->>(p.field_map->>'full_name') AS full_name
     , s.segment
FROM   leads l
LEFT   JOIN data_providers p ON p.id = l.data_provider_id
LEFT   JOIN LATERAL (
   SELECT DISTINCT segment
   FROM   unnest(string_to_array(l.data->>(p.field_map->>'segment'), ',')) AS segment
   ) s ON true

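Make sure the key you look up matches what your field_map actually holds, btw. The query above uses 'segment', while the sample field_map in the question shows 'segments'. A key that is not in the map makes the lookup return NULL, so every row would get a NULL segment.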
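
For anyone who wants to try this, here is a hedged, self-contained reproduction of the sample data. The table definitions are assumptions (the question does not show them). The provider row gets id 201 rather than 1 so that the join finds a match, and the lookup key is 'segments' as in the sample field_map:

```sql
-- Assumed table definitions; the question only shows sample rows.
CREATE TEMP TABLE data_providers (
   id        int PRIMARY KEY
 , field_map jsonb NOT NULL
);

CREATE TEMP TABLE leads (
   id               int PRIMARY KEY
 , data_provider_id int
 , email            text
 , data             jsonb
);

-- Assumption: provider id 201, so that leads.data_provider_id matches.
INSERT INTO data_providers VALUES
   (201, '{"segments": "SEGMENT IDS", "full_name": "FULL NAME"}');

INSERT INTO leads VALUES
   (1, 201, 'hi@hi', '{"SEGMENT IDS": "id1,id1,id1,id2,id3", "FULL NAME": "John Doe"}')
 , (2, 201, 'xx@xx', '{"FULL NAME": "Billy Bob"}');

-- Same shape as the query above: DISTINCT inside the LATERAL subquery.
SELECT l.data_provider_id
     , l.email
     , l.data->>(p.field_map->>'full_name') AS full_name
     , s.segment
FROM   leads l
LEFT   JOIN data_providers p ON p.id = l.data_provider_id
LEFT   JOIN LATERAL (
   SELECT DISTINCT segment
   FROM   unnest(string_to_array(l.data->>(p.field_map->>'segments'), ',')) AS segment
   ) s ON true
ORDER  BY l.id, s.segment;
```

With this data the query should return the four rows of the desired output: id1, id2 and id3 for hi@hi, and a single NULL segment for xx@xx. Note that DISTINCT compares the strings as they are, so 'id1, id1' (with a space) would still produce two rows unless the elements are trimmed first.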

You could even use this short syntax:

...
LEFT   JOIN LATERAL (
   SELECT DISTINCT unnest(string_to_array(l.data->>(p.field_map->>'segment'), ','))
   ) s(segment) ON true

(But the last one might make unsuspecting SQL purists cringe.)

Original order of array elements is not preserved. If you need that, see:

How to preserve the original order of elements in an unnested array?

In that case, use GROUP BY rather than DISTINCT, aggregate the minimum ordinal position for each group of duplicates, and order by that minimum.
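
A sketch of that order-preserving variant, not tested against the asker's real schema. The sample rows are inlined as CTEs so the statement runs on its own. The provider id 201, the key 'segments', the shuffled segment string, and the names a, ord and first_ord are all assumptions for illustration:

```sql
WITH data_providers(id, field_map) AS (
   VALUES (201, '{"segments": "SEGMENT IDS", "full_name": "FULL NAME"}'::jsonb)
   )
, leads(id, data_provider_id, email, data) AS (
   VALUES
      (1, 201, 'hi@hi', '{"SEGMENT IDS": "id3,id1,id3,id2,id1", "FULL NAME": "John Doe"}'::jsonb)
    , (2, 201, 'xx@xx', '{"FULL NAME": "Billy Bob"}'::jsonb)
   )
SELECT l.data_provider_id
     , l.email
     , l.data->>(p.field_map->>'full_name') AS full_name
     , s.segment
FROM   leads l
LEFT   JOIN data_providers p ON p.id = l.data_provider_id
LEFT   JOIN LATERAL (
   -- WITH ORDINALITY numbers the elements in array order;
   -- min(ord) is the position of the first occurrence of each segment.
   SELECT a.segment, min(a.ord) AS first_ord
   FROM   unnest(string_to_array(l.data->>(p.field_map->>'segments'), ','))
          WITH ORDINALITY AS a(segment, ord)
   GROUP  BY a.segment
   ) s ON true
ORDER  BY l.id, s.first_ord;
```

For the first lead this should list the segments in first-occurrence order (id3, id1, id2), followed by the row for the second lead with a NULL segment.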

edited Mar 15, 2022 at 18:57

answered Mar 15, 2022 at 2:27


@Tallboy: I would have tested my solution if you had provided a fiddle ...
– Erwin Brandstetter, Commented Mar 15, 2022 at 2:39

Awesome, thank you! I will also provide a fiddle if I can't get this working. You are the Postgres demigod.
– Tallboy, Commented Mar 15, 2022 at 18:44
