1

I have the following table. It tracks shoppers and how many orders each has placed on three websites in a network.

ID   Name    Country  website1_Orders  website2_Orders  website3_Orders
123  JOHNC   USA      null             1                null
456  KAYLAB  USA      5                null             null
789  LAURAT  USA      2                6                3
999  RONR    CA       1                null             16
017  MATTE   CA       7                null             4
767  JROB    MX       null             1                null
224  TINAS   MX       null             null             null
670  TOMR    MX       null             8                null

What I want my SQL output to look like is as follows:

Country  Websites_Avail
USA      3
CA       2
MX       1

The logic is that if no customer in a country has placed an order on a given website (website1, website2, or website3), then that website does not serve that country at this time.

So basically, across multiple columns, I need to figure out how to aggregate properly and show the correct number of websites, broken out by country. This is a very simple sample of the database, which is much larger.

with count as 
(
    select
        country,
        case  
            when website1_orders is not null 
                then 'Web 1' 
        end as Web_One,
        case 
            when website2_orders is not null 
                then 'Web 2' 
        end as Web_Two,
        case 
            when website3_orders is not null 
                then 'Web 3' 
        end as Web_Three 
    from 
        my_database
)
select 
    country, 
    COUNT(DISTINCT Web_One) + COUNT(DISTINCT Web_Two) + COUNT(DISTINCT Web_Three) as total_count
from 
    count
group by 
    1

This is a very simplified version (there are 20 sites in total), and in theory it works if I look only at these eight or so rows. But it is not scaling and I'm not sure why. I also doubt this is the best way to aggregate across the columns, but it's all I can think of at the moment.

I would prefer not to do anything like normalizing in a new temp table, but if that's the way to go I'm open to trying to figure out how. But I was hoping within a CTE I could get the correct counts.

Essentially, if any customer in a country has placed an order on a given site, that site adds 1 to the country's total_count. No country can exceed 20 (which would mean that, for each of the 20 sites, at least one customer from that country has ordered from it at some point). But I'm getting values in the thousands.

Is there an optimal way of looking at this? It's just Postgres SQL in Snowflake.


3 Answers

1

I would avoid SUM or COUNT here because you don't really want to count or sum anything. You simply want to check, per country, whether each column has at least one value that is NOT NULL in any row. The exact number of such rows, and even the values in them, don't matter at all.

That's a good use case for BOOL_OR, which exists for exactly this kind of logic.

You just need to cast its result from boolean to int so the per-column results can be added up:

SELECT
  country,
  BOOL_OR(website1_Orders IS NOT NULL)::int +
  BOOL_OR(website2_Orders IS NOT NULL)::int +
  BOOL_OR(website3_Orders IS NOT NULL)::int AS Websites_Avail
FROM my_database
GROUP BY country;

See this db<>fiddle with your sample data.
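
A note on portability, offered as a hedged sketch rather than a tested Snowflake solution: BOOL_OR is the PostgreSQL spelling, and as far as I know Snowflake names the equivalent aggregate BOOLOR_AGG, so the query above may need adapting there. A dialect-neutral way to express the same idea is MAX(CASE WHEN col IS NOT NULL THEN 1 ELSE 0 END). The snippet below checks that form against the sample data using Python's built-in sqlite3 module, which is only a stand-in engine for illustration. The table and column names come from the question; the function name websites_per_country is made up for this example.

```python
import sqlite3

# Sample rows from the question: (id, name, country, website1..3 order counts).
ROWS = [
    ("123", "JOHNC", "USA", None, 1, None),
    ("456", "KAYLAB", "USA", 5, None, None),
    ("789", "LAURAT", "USA", 2, 6, 3),
    ("999", "RONR", "CA", 1, None, 16),
    ("017", "MATTE", "CA", 7, None, 4),
    ("767", "JROB", "MX", None, 1, None),
    ("224", "TINAS", "MX", None, None, None),
    ("670", "TOMR", "MX", None, 8, None),
]

# Each MAX(...) is 1 if any row in the country has a non-null value, else 0.
QUERY = """
SELECT country,
       MAX(CASE WHEN website1_orders IS NOT NULL THEN 1 ELSE 0 END)
     + MAX(CASE WHEN website2_orders IS NOT NULL THEN 1 ELSE 0 END)
     + MAX(CASE WHEN website3_orders IS NOT NULL THEN 1 ELSE 0 END) AS websites_avail
FROM my_database
GROUP BY country
"""


def websites_per_country():
    """Load the sample rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_database (id TEXT, name TEXT, country TEXT, "
        "website1_orders INTEGER, website2_orders INTEGER, website3_orders INTEGER)"
    )
    conn.executemany("INSERT INTO my_database VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    result = dict(conn.execute(QUERY).fetchall())
    conn.close()
    return result


print(websites_per_country())
```

Because each term is capped at 1, the total for a country can never exceed the number of website columns, which is the property the question asks for.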

1

For the example shown, even if you increase the number of sites to 20, you can do it without unpivot or dynamic SQL.
The key is to group by country right away. This sharply reduces the amount of data the rest of the calculation has to handle.

select country
  ,case when sum(website1_orders) is null then 0 else 1 end
  +case when sum(website2_orders) is null then 0 else 1 end
  +case when sum(website3_orders) is null then 0 else 1 end
    as total_count
from my_database
group by country

country  total_count
USA      3
CA       2
MX       1

Or use another aggregate function, such as min:

select country
  ,case when min(website1_orders) is null then 0 else 1 end
  +case when min(website2_orders) is null then 0 else 1 end
  +case when min(website3_orders) is null then 0 else 1 end
    as total_count
from my_database
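group by country

For reference, these are the per-country sums that the first query tests. sum returns null only when every value in the group is null:

select country
  ,sum(website1_orders) w1s
  ,sum(website2_orders) w2s
  ,sum(website3_orders) w3s
from my_database
group by country

country  w1s   w2s   w3s
USA      7     7     3
CA       8     null  20
MX       null  9     null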
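
With 20 sites, typing out 20 near-identical terms is error-prone, so one option is to generate the query text once and paste the result. The sketch below does that in Python and, as a sanity check, runs the three-column version against the sample data in sqlite3 (a stand-in engine, not Snowflake). It uses COUNT(col) > 0, relying on COUNT(col) ignoring nulls. The helper names build_query and run_on_sample are hypothetical, and the website{i}_orders naming pattern is an assumption extrapolated from the three sample columns.

```python
import sqlite3


def build_query(n_sites, table="my_database"):
    """Return SQL counting, per country, the columns with at least one non-null value."""
    # Assumes columns are named website1_orders .. website{n}_orders.
    terms = [
        f"CASE WHEN COUNT(website{i}_orders) > 0 THEN 1 ELSE 0 END"
        for i in range(1, n_sites + 1)
    ]
    return (
        "SELECT country,\n       "
        + "\n     + ".join(terms)
        + f" AS total_count\nFROM {table}\nGROUP BY country"
    )


def run_on_sample():
    """Run the generated three-column query on the question's sample rows."""
    rows = [
        ("USA", None, 1, None), ("USA", 5, None, None), ("USA", 2, 6, 3),
        ("CA", 1, None, 16), ("CA", 7, None, 4),
        ("MX", None, 1, None), ("MX", None, None, None), ("MX", None, 8, None),
    ]
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_database (country TEXT, website1_orders INTEGER, "
        "website2_orders INTEGER, website3_orders INTEGER)"
    )
    conn.executemany("INSERT INTO my_database VALUES (?, ?, ?, ?)", rows)
    result = dict(conn.execute(build_query(3)).fetchall())
    conn.close()
    return result


print(build_query(20))   # text of the 20-column query, ready to paste
print(run_on_sample())
```

The generated statement is still plain static SQL with a single GROUP BY, so nothing dynamic runs in the database itself.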

fiddle

0

Essentially, you are asking for the number of distinct websites that have received at least one order, per country of the customer who placed it. Try it like this:

SELECT country, Count(Distinct websites) AS websites
FROM  ( -- unpivot: one row per (country, website column, order count)
        Select country, 
               unnest(array['website1_orders','website2_orders','website3_orders'])  AS websites, 
               unnest(array[website1_orders,website2_orders,website3_orders]) AS orders
        From my_database
      ) AS t  -- PostgreSQL before version 16 requires an alias here
WHERE orders Is Not Null
GROUP BY country
ORDER BY websites Desc

country  websites
USA      3
CA       2
MX       1
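
The unnest(array[...]) trick is PostgreSQL-specific, so it may not carry over to other engines unchanged. A plainer way to get the same unpivoted shape is UNION ALL with one branch per website column. The sketch below is an illustration under that assumption, run on the sample data through Python's sqlite3 module as a stand-in engine; the alias unpivoted and the function name distinct_websites are made up for the example.

```python
import sqlite3

# One SELECT per website column turns the wide table into
# (country, website, orders) rows; null orders are then filtered out.
QUERY = """
SELECT country, COUNT(DISTINCT website) AS websites
FROM (
    SELECT country, 'website1' AS website, website1_orders AS orders FROM my_database
    UNION ALL
    SELECT country, 'website2', website2_orders FROM my_database
    UNION ALL
    SELECT country, 'website3', website3_orders FROM my_database
) AS unpivoted
WHERE orders IS NOT NULL
GROUP BY country
ORDER BY websites DESC
"""


def distinct_websites():
    """Run QUERY on the question's sample rows and return (country, websites) pairs."""
    rows = [
        ("USA", None, 1, None), ("USA", 5, None, None), ("USA", 2, 6, 3),
        ("CA", 1, None, 16), ("CA", 7, None, 4),
        ("MX", None, 1, None), ("MX", None, None, None), ("MX", None, 8, None),
    ]
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_database (country TEXT, website1_orders INTEGER, "
        "website2_orders INTEGER, website3_orders INTEGER)"
    )
    conn.executemany("INSERT INTO my_database VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(distinct_websites())
```

A country whose columns are all null produces no rows after the WHERE filter, so it would be missing from the result rather than shown with 0. That does not happen in the sample, but it is worth keeping in mind for the full table.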

fiddle
