How to pass parameters to a SQL query using PySpark in Jupyter Notebook

Asked 4 years ago

Modified 4 years ago

Viewed 593 times

I wrote the following SQL query in a Jupyter notebook using a PySpark session:

MySparkSession.sql('''
    select ID
         , count(distinct transaction) as Txn_count
         , sum(revenue) as Total_sales
         , count(distinct product) as Total_products
      from merge_table
     where (DATE between '2021-02-01' and '2021-03-31')
       and (BRAND_NAME = 'ADIDAS')
     group by ID
''').show()

I need to pass the DATE range and the BRAND_NAME value as parameters, so that I can get differently filtered data just by changing those values, but I have no idea how to do it.

Any help is appreciated.

edited Nov 20, 2021 at 14:40

Daeho Ro


asked Nov 19, 2021 at 10:48

Mayank Pathak


Why not use an f-string?

– Daeho Ro

Commented Nov 20, 2021 at 14:41
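
A minimal sketch of the f-string suggestion, assuming the query from the question. The function name `build_sales_query` and its parameter names are hypothetical, and the `group by ID` clause is included on the assumption that per-ID aggregates are what is wanted. The sketch only builds the query text, so it runs without a Spark session:

```python
def build_sales_query(start_date: str, end_date: str, brand_name: str) -> str:
    """Return the query text with the date range and brand filled in.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # The f-string substitutes the three arguments into the where clause.
    return f'''
    select ID
         , count(distinct transaction) as Txn_count
         , sum(revenue) as Total_sales
         , count(distinct product) as Total_products
      from merge_table
     where (DATE between '{start_date}' and '{end_date}')
       and (BRAND_NAME = '{brand_name}')
     group by ID
    '''


# Change only these arguments to filter on another period or brand.
query = build_sales_query("2021-02-01", "2021-03-31", "ADIDAS")
print(query)
```

In the notebook, the resulting string would then be passed to the session from the question, for example `MySparkSession.sql(query).show()`. Plain string interpolation does no escaping, so this is reasonable only for values you control yourself.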

Does this answer your question? How do I create a multiline Python string with inline variables?

– Daeho Ro

Commented Nov 20, 2021 at 14:43
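
Along the lines of the multiline-string question linked above, another possible sketch keeps the SQL in a reusable template and fills it with `str.format`, checking the inputs first. The names `QUERY_TEMPLATE` and `render_query` are hypothetical, the `group by ID` clause is an assumption about the intended result, and the checks are a conservative guess at what is worth rejecting, not a complete defence against malformed input:

```python
from datetime import date

# Multiline template with named placeholders (hypothetical name).
QUERY_TEMPLATE = '''
select ID
     , count(distinct transaction) as Txn_count
     , sum(revenue) as Total_sales
     , count(distinct product) as Total_products
  from merge_table
 where (DATE between '{start_date}' and '{end_date}')
   and (BRAND_NAME = '{brand_name}')
 group by ID
'''


def render_query(start_date: str, end_date: str, brand_name: str) -> str:
    """Validate the inputs, then fill them into QUERY_TEMPLATE."""
    # date.fromisoformat raises ValueError for strings that are not valid ISO dates.
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    # Reject quotes and backslashes rather than trying to escape them.
    if "'" in brand_name or "\\" in brand_name:
        raise ValueError("brand_name must not contain quotes or backslashes")
    return QUERY_TEMPLATE.format(
        start_date=start.isoformat(),
        end_date=end.isoformat(),
        brand_name=brand_name,
    )


print(render_query("2021-02-01", "2021-03-31", "ADIDAS"))
```

As before, the rendered string would be handed to `MySparkSession.sql(...)`. Keeping the template separate from the values makes it easy to reuse the same query for several brands or periods in one notebook.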
