I am attempting to write a script that will allow me to insert values from an uploaded dataframe into a table inside of an Oracle DB; but my issue lies with
- too many columns to hard-code
- columns aren't one-to-one
What I'm hoping for is a way to write out the columns, check to see if they sync with the columns of my dataframe and from there use an INSERT VALUES sql statement to input the values from the dataframe to the ODS table.
so far these are the important parts of my script:
import pandas as pd
import cx_Oracle
import config
df = pd.read_excel("Employee_data.xlsx")
conn = None
try:
conn = cx_Oracle.connect(config.username, config.password, config.dsn, encoding=config.encoding)
except cx_Oracle.Error as error:
print(error)
finally:
cursor = conn.cursor
sql = "SELECT * FROM ODSMGR.EMPLOYEE_TABLE"
cursor.execute(sql)
data = cursor.fetchall()
col_names = []
for i in range(0, len(cursor.description)):
col_names.append(cursor.description[i][0])
#instead of using df.columns I use:
rows = [tuple(x) for x in df.values]
which prints my ODS column names, and allows me to conveniently store my rows from the df in an array but I'm at a loss for how to import these to the ODS. I found something like:
cursor.execute("insert into ODSMGR.EMPLOYEE_TABLE(col1,col2) values (:col1, :col2)", {":col1df":df, "col2df:df"})
but that'll mean I'll have to hard-code everything which wouldn't be scalable. I'm hoping I can get some sort of insight to help. It's just difficult since the columns aren't 1-to-1 and that there is some compression/collapsing of columns from the DF to the ODS but any help is appreciated.
NOTE: I've also attempted to use SQLalchemy but I am always given an error "ORA-12505: TNS:listener does not currently know of SID given in connect descriptor" which is really strange given that I am able to connect with cx_Oracle
EDIT 1:
I was able to get a list of columns that share the same name; so after running:
import numpy as np
a = np.intersect1d(df.columns, col_names)
print("common columns:", a)
I was able to get a list of columns that the two datasets share.
I also tried to use this as my engine:
engine = create_engine("oracle+cx_oracle://username:[email protected]:1521/?ODS-Test")
dtyp = {c:types.VARCHAR(df[c].str.len().max())
for c in df.columns[df.dtypes=='object'].tolist()}
df.to_sql('ODS.EMPLOYEE_TABLE', con = engine, dtype=dtyp, if_exists='append')
which has given me nothing but errors.
df.to_sql(…)
df.to_sql(..)
And the columns aren't one-to-one in that there's some columns inside the DB that aren't in my uploaded DF. Other columns are collapsed via case statements which are meant to act as new columns that hold the same info as the DB.