0

I'm trying to find a regex expression in python that will be able to handle most of the UPDATE queries that I throw at if from my DB. I can't use sqlparse or any other libraries that may be useful with for this, I can only use python's built-in modules or cx_Oracle, in case it has a method I'm not aware of that could do something like this.

Most update queries look like this:

UPDATE TABLE_NAME SET COLUMN_NAME=2, OTHER_COLUMN=to_date('31-DEC-202023:59:59','DD-MON-YYYYHH24:MI:SS'), COLUMN_STRING='Hello, thanks for your help', UPDATED_BY=-100 WHERE CODE=9999;

Most update queries I use have a version of these types of updates. The output has to be a list including each separate SQL keyword (UPDATE, SET, WHERE), each separate update statement(i.e COLUMN_NAME=2) and the final identifier (CODE=9999).

Ideally, the result would look something like this:

list = ['UPDATE', 'TABLE_NAME', 'SET', 'COLUMN_NAME=2', 'OTHER_COLUMN=("31-DEC-2020 23:59:59","DD-MON-YYYY HH24:MI:SS")', COLUMN_STRING='Hello, thanks for your help', 'UPDATED_BY=-100', 'WHERE', 'CODE=9999']

Initially I tried doing this using a string.split() splitting on the spaces, but when dealing with one of my slightly more complex queries like the one above, the split method doesn't deal well with string updates such as the one I'm trying to make in COLUMN_STRING or those in OTHER_COLUMN due to the blank spaces in those updates.

3
  • .split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)") will do for you as explained here : answer
    – Younes
    Commented Feb 18, 2021 at 13:30
  • @Younes thanks for your reply. Unfortunately this regex doesn't deal well with the date values or string updates that have a comma in them as it seems to split a string on the commas. Also, I doesn't split the string at the UPDATE, SET, WHERE, keywords. This means that I end up with a UPDATE TABLE_NAME SET COLUMN_NAME=2 in the 0 index of my list, where I would like my 0th index to be UPDATE, 1st index to be TABLE_NAME, etc. Commented Feb 18, 2021 at 13:47
  • ah i get it, i will post a workaround answer that will get you what you want i hope.
    – Younes
    Commented Feb 18, 2021 at 14:29

1 Answer 1

0

Let's use the shlex module :

import shlex
test="UPDATE TABLE_NAME SET COLUMN_NAME=2, OTHER_COLUMN=to_date('31-DEC-202023:59:59','DD-MON-YYYYHH24:MI:SS'), COLUMN_STRING='Hello, thanks for your help', UPDATED_BY=-100 WHERE CODE=9999;"

t=shlex.split(test)

Up to here, we won't get rid of comma delimiters and the last semi one, so maybe we can do this :

for i in t:
    if i[-1] in [',',';']:
        i=i[:-1]

If we print every element of that list we'll get :

UPDATE
TABLE_NAME
SET
COLUMN_NAME=2
OTHER_COLUMN=to_date(31-DEC-202023:59:59,DD-MON-YYYYHH24:MI:SS)
COLUMN_STRING=Hello, thanks for your help
UPDATED_BY=-100
WHERE
CODE=9999

Not a proper generic answer, but serves the purpose i hope.

1
  • This was brilliant! Solved my issue perfectly. Many thanks! Commented Feb 18, 2021 at 15:23

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.