I am new to PySpark and have a few questions about writing a DataFrame to an Oracle database table using JDBC.

As part of the requirement I need to read data from an Oracle table, transform it with PySpark, and load the final DataFrame into an Oracle table using JDBC. I also need to execute an Oracle stored procedure before inserting the data into the table, in the same JDBC session. Please let me know what options are available for this in the DataFrame save() method.

There is a "sessionInitStatement" option when reading data from an Oracle table; is there a similar option for dataframe.write? Below are my PySpark statements.

pl_sql_block = "begin initialise_employee(); end;"
jdbcDF = spark.read \
    .option("sessionInitStatement", pl_sql_block) \
    .jdbc("jdbc:oracle:dbserver", table="scott.Employee",
          properties={"user": "username", "password": "password"})

# Transformation statements .............

jdbcDF.write \
    .format("jdbc") \
    .option("driver", "oracle.jdbc.driver.OracleDriver") \
    .option("url", "jdbc:Oracle:dbserver") \
    .option("dbtable", "Employee") \
    .option("user", "username") \
    .option("password", "password") \
    .save()
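
From what I can tell, "sessionInitStatement" is only applied on the read side. As a workaround I am considering calling the procedure over a separate JDBC connection obtained through the py4j gateway before the write. This is only a rough sketch of that idea (the URL and credentials are placeholders, and it assumes the Oracle driver is already on Spark's classpath):

# Workaround sketch: run the stored procedure over a plain JDBC
# connection via the JVM gateway before writing the DataFrame.
driver_manager = spark.sparkContext._jvm.java.sql.DriverManager
conn = driver_manager.getConnection("jdbc:oracle:dbserver", "username", "password")
stmt = conn.prepareCall("begin initialise_employee(); end;")
stmt.execute()  # stored procedure runs in this session ...
stmt.close()
conn.close()    # ... which is closed before df.write opens its own sessions

The drawback is that the procedure then runs in a different session from the ones the JDBC writer opens, which is exactly what I am trying to avoid.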
