I have a table with data structured like this. Each product ID has an element_id value: a dictionary that maps each element to the list of IDs assigned to it. Not every element will have an ID on every product.

product_id | element_id
product_1  | {"FIRE": ["1630808"], "WATER": ["188028","234"], "SHADOW": ["213181"]}

For each product, I'd like to count how many IDs appear under each element, in a table like this:

product_id | fire_count | water_count | shadow_count | forest_count
product_1  |          1 |           2 |            1 |            0

I've tried using LATERAL FLATTEN with KEY and VALUE, but I'm getting duplicate results. Is there a cleaner way to write this type of query, especially since I also need a count of 0 for elements that do not appear on a product?

My data is stored in Snowflake and I query it using Snowflake SQL.

Any advice? Thank you!

2 Answers

It can be achieved without flattening and aggregation:

CREATE OR REPLACE TABLE TAB(PRODUCT_ID, ELEMENT_ID) AS
SELECT 'product_1',
       {'FIRE':['1630808'],'WATER':['188028','234'],'SHADOW':['213181']};

SELECT
  PRODUCT_ID,
  ARRAY_SIZE(ELEMENT_ID:FIRE) AS FIRE_COUNT,
  ARRAY_SIZE(ELEMENT_ID:WATER) AS WATER_COUNT,
  ARRAY_SIZE(ELEMENT_ID:SHADOW) AS SHADOW_COUNT
FROM TAB;
/*
+------------+------------+-------------+--------------+
| PRODUCT_ID | FIRE_COUNT | WATER_COUNT | SHADOW_COUNT |
+------------+------------+-------------+--------------+
| product_1  |          1 |           2 |            1 |
+------------+------------+-------------+--------------+
*/
1 Comment

I'd also suggest wrapping each value with COALESCE(..., 0), so that an element missing from a product returns 0 instead of NULL.
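
To illustrate that suggestion, here is a minimal sketch that reuses the TAB table and ELEMENT_ID column from the answer above and adds a FOREST_COUNT column, since FOREST is missing for product_1. It assumes ELEMENT_ID is an OBJECT or VARIANT column; if the JSON is stored as a string, it would probably need PARSE_JSON first.

```sql
-- Sketch only: assumes the TAB table created in the answer above.
-- ELEMENT_ID:<KEY> is NULL when the key is absent, and ARRAY_SIZE(NULL)
-- is NULL, so COALESCE turns a missing element into a count of 0.
SELECT
  PRODUCT_ID,
  COALESCE(ARRAY_SIZE(ELEMENT_ID:FIRE),   0) AS FIRE_COUNT,
  COALESCE(ARRAY_SIZE(ELEMENT_ID:WATER),  0) AS WATER_COUNT,
  COALESCE(ARRAY_SIZE(ELEMENT_ID:SHADOW), 0) AS SHADOW_COUNT,
  COALESCE(ARRAY_SIZE(ELEMENT_ID:FOREST), 0) AS FOREST_COUNT
FROM TAB;
```

Without the COALESCE wrapper, FOREST_COUNT would come back as NULL for product_1 rather than the 0 asked for in the question. Note that the key names in the path are case-sensitive, so they have to match the uppercase keys in the data.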

I'd do the following...

  • flatten just one level
  • use ARRAY_SIZE() to count the elements in the arrays
  • use conditional aggregation to pivot the results

SELECT
  src.product_id,
  -- f.key comparisons are case-sensitive; the keys in the sample data are uppercase
  SUM(CASE WHEN f.key = 'FIRE'   THEN ARRAY_SIZE(f.value) ELSE 0 END)   AS fire_count,
  SUM(CASE WHEN f.key = 'WATER'  THEN ARRAY_SIZE(f.value) ELSE 0 END)   AS water_count,
  SUM(CASE WHEN f.key = 'SHADOW' THEN ARRAY_SIZE(f.value) ELSE 0 END)   AS shadow_count
FROM
  your_table   AS src
CROSS JOIN LATERAL
  FLATTEN (
    INPUT => src.element_id,
    OUTER => TRUE 
  )
    AS f
GROUP BY
  src.product_id
