2

Problem statement: I am using the Repository pattern to pull and update records from a PostgreSQL DB (version 11) into a Node.JS API via the pg npm module. If two users try to modify the same record at almost the same time, the changes made by the first user to submit will be overwritten by the second user to submit. I want to prevent the second user's changes from being submitted until they have updated their local copy of the record with the first user's changes.

I know that a DB like CouchDB has a "_rev" property that it uses in this case to detect an attempt to update from a stale snapshot. As I've researched more, I've found this is called optimistic locking (Optimistic locking queue). So I'll add a rev column to my table and use it in my SQL update statement.

UPDATE tableX
SET field1 = value1,
    field2 = value2,
    ...,
    rev_field = uuid_generate_v4()
WHERE id_field = id_value
  AND rev_field = rev_value

However, if id_value is a match and rev_value isn't that won't tell my repository code that the record was stale, only that 0 rows were affected by the query.

So I've got a script that I've written in pgAdmin that will detect cases where the update affected 0 rows and then checks the rev_field.

DO $$
DECLARE
    i_id numeric;
    i_uuid uuid;
    v_count numeric;
    v_rev uuid;
BEGIN
    i_id := 1;
    i_uuid := '20b2e135-42d0-4a49-94c0-5557dd09abd1';

    UPDATE account_r
    SET account_name = 'savings',
        rev = uuid_generate_v4()
    WHERE account_id = i_id
      AND rev = i_uuid;

    GET DIAGNOSTICS v_count = ROW_COUNT;
    
    IF v_count < 1 THEN
        SELECT rev INTO v_rev
        FROM account_r
        WHERE account_id = i_id;
        
        IF v_rev <> i_uuid THEN
            RAISE EXCEPTION 'revision mismatch';
        END IF;
    END IF;
    
    RAISE NOTICE 'rows affected: %', v_count;
END $$;

While I'm perfectly comfortable adapting this code into a stored proc and calling that from Node, I'm hoping that there's a solution to this that's not nearly as complex. On the one hand, moving these functions to the DB will clean up my JS code, on the other hand, this is a lot of boilerplate SQL to write, since it will have to be done for UPDATE and DELETE for each table.

Is there an easier way to get this done? (Perhaps the code at Optimistic locking queue is the easier way?) Should I be looking at an ORM to help reduce the headache here?

5
  • 3
    Little known is the fact that you get optimistic locking automatically if you use the REPEATABLE READ isolation level. Commented Jul 17, 2020 at 18:52
  • @LaurenzAlbe I'm confused how that would help in this case since the data has been pulled from the DB. REPEATABLE READ looks like it would help if the transactions occurred at the same time. But in the case of an API, one client may present data to a user and get changes back from the UI that no longer apply, since another user has already changed the record in the meantime.
    – mlibby
    Commented Jul 17, 2020 at 21:45
  • The requirement is that you read and write in the same transaction, so yes, if there are user interaction in between, that technique is out. Commented Jul 18, 2020 at 4:10
  • @LaurenzAlbe how do you get "optimistic locking" automatically for the REPEATABLE READ? Could you please write more about it?
    – Zazaeil
    Commented Jan 10, 2023 at 17:22
  • 2
    @Zazaeil If somebody updates the data between the time that you read them and your update, you will automatically get a "serialization error". You catch the error, rollback and retry the transaction. Commented Jan 10, 2023 at 22:33

1 Answer 1

5

There is no need to maintain a rev value. You can get the md5 hash of a table's row.

SQL Fiddle Here

create table mytable (
  id int primary key,
  some_text text,
  some_int  int,
  some_date timestamptz
  );

insert into mytable 
  values (1, 'First entry', 0, now() - interval '1 day'),
         (2, 'Second entry', 1, now()),
         (3, 'Third entry', 2, now() + interval '1 day');

select *, md5(mytable::text) from mytable order by id;

The fiddle includes other queries to demonstrate that the calculated md5() is based on the values of the row.

Using that hash for optimistic locking, the updates can take the form:

update mytable
   set some_int = -1
 where id = 1
   and md5(mytable::text) = <md5 hash from select>
returning *

You will still need to check for no return rows, but that could be abstracted away on the Node side.

It looks like result.row_count contains the number of rows affected, so you will not need the returning * part.

1
  • Thanks, this looks like a useful way to avoid the extra field. I'll still need to run a second query to figure out whether "0 rows affected" comes from a mismatch on ID or on the MD5. But perhaps that will be just fine as a secondary check done in Node.
    – mlibby
    Commented Jul 17, 2020 at 17:53

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.