Adding columns in Spark dataframe based on rules

Asked 8 years, 6 months ago

Modified 6 years, 4 months ago

Viewed 2k times

I have a dataframe df that contains the data below:

| customers | product | Val_id |
|-----------|---------|--------|
| 1         | A       | 1      |
| 2         | B       | X      |
| 3         | C       |        |
| 4         | D       | Z      |

I have been given two rules:

| rule_id | rule_name | product value | priority |
|---------|-----------|---------------|----------|
| 123     | ABC       | A,B           | 1        |
| 456     | DEF       | A,B,D         | 2        |

The requirement is to apply these rules to dataframe df in priority order: customers who pass rule 1 should not be considered for rule 2. The final dataframe should have two more columns, rule_id and rule_name. I have written the code below to achieve this:

import org.apache.spark.sql.functions.{col, when}

// Branches are evaluated in order, so the priority-1 rule wins when both match.
val rule_name = when(col("product").isin("A", "B"), "ABC").when(col("product").isin("A", "B", "D"), "DEF").otherwise("")
val rule_id = when(col("product").isin("A", "B"), "123").when(col("product").isin("A", "B", "D"), "456").otherwise("")
// df is the customers dataframe shown above.
val df1 = df.withColumn("rule_name", rule_name).withColumn("rule_id", rule_id)
df1.show()

The final output looks like this:

| customers | product | Val_id | rule_name | rule_id |
|-----------|---------|--------|-----------|---------|
| 1         | A       | 1      | ABC       | 123     |
| 2         | B       | X      | ABC       | 123     |
| 3         | C       |        |           |         |
| 4         | D       | Z      | DEF       | 456     |

Is there a better way to achieve this, adding both columns in a single pass over the dataset instead of going through it twice?
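
For context, `withColumn` is lazy, and Spark's optimizer normally merges consecutive `withColumn` calls into a single projection, so the code above most likely does not scan the data twice. What can still be tightened is the duplicated rule conditions. The sketch below is one possible approach, not a definitive one: it builds a single struct column carrying both values from a rule list sorted by priority, then unpacks it. The `Rule` case class, the `RuleColumns` object, and `addRuleColumns` are hypothetical names introduced for illustration, and the sample rows simply mirror the data shown earlier.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, struct, when}

object RuleColumns {
  // Hypothetical rule representation; field names are illustrative only.
  final case class Rule(id: String, name: String, products: Seq[String], priority: Int)

  // Packs rule_id and rule_name into one struct so each rule condition is written once.
  private def ruleStruct(id: String, name: String): Column =
    struct(lit(id).as("rule_id"), lit(name).as("rule_name"))

  def addRuleColumns(df: DataFrame, rules: Seq[Rule]): DataFrame = {
    // foldRight over the priority-sorted rules makes the highest-priority rule
    // the outermost `when`, so it is checked first; unmatched rows get empty strings.
    val matched = rules.sortBy(_.priority).foldRight(ruleStruct("", "")) { (rule, fallback) =>
      when(col("product").isin(rule.products: _*), ruleStruct(rule.id, rule.name))
        .otherwise(fallback)
    }
    df.withColumn("rule", matched)
      .select(col("customers"), col("product"), col("Val_id"),
        col("rule.rule_name"), col("rule.rule_id"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("rules").getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question.
    val df = Seq((1, "A", "1"), (2, "B", "X"), (3, "C", ""), (4, "D", "Z"))
      .toDF("customers", "product", "Val_id")

    // Deliberately listed out of order to show that sorting by priority handles it.
    val rules = Seq(
      Rule("456", "DEF", Seq("A", "B", "D"), 2),
      Rule("123", "ABC", Seq("A", "B"), 1)
    )

    addRuleColumns(df, rules).show()
    spark.stop()
  }
}
```

Because the rules are plain data here, adding a third rule means appending one `Rule` to the list rather than editing two nested `when` expressions. If the rules live in their own dataframe, collecting that small table to the driver first would be one way to feed this function.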

asked Jun 3, 2017 at 17:34

Varun Chadha
