I'm migrating about 80,000 rows from one table to another in an environment where long locks or downtime are not acceptable. Essentially, the query boils down to this simple case:

INSERT INTO table_2
SELECT * FROM table_1
JOIN table_3 ON table_1.id = table_3.id;

All three tables are being read from and could receive an insert at any time. I want to just run the query above, but I'm not sure how the locking works and whether the tables will be totally inaccessible during the operation. My understanding is that only the affected (newly inserted) rows will be locked. Table 1 and table 3 are only being selected from, so no harm there, and concurrent inserts are safe, so table 2 should remain freely accessible.

Is this understanding correct, and can I run this query in a production environment without fear? If it's not safe, what is the standard way to accomplish this?

1 Answer

You're fine.

If you're interested in the details, you can read up on multiversion concurrency control, or on the details of the Postgres MVCC implementation, or how its various locking modes interact, but the implications for your case are nicely summarised in the docs:

reading never blocks writing and writing never blocks reading

In short, every row version stored in the database has version information attached to it (in Postgres, the IDs of the transactions that created and deleted it), and every query knows which versions to consider and which to ignore.

This means that an INSERT can safely write to a table without locking it, as any concurrent queries will simply ignore the new rows until the inserting transaction decides to commit.
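
If you want to see this for yourself before touching production, a two-session experiment along these lines should show it. This is only a sketch: `mvcc_demo` is a made-up scratch table (not one of your tables), and it assumes two separate psql connections to a test database running at the default READ COMMITTED isolation level.

```sql
-- Setup (either session). mvcc_demo is a hypothetical scratch table.
CREATE TABLE mvcc_demo (id integer PRIMARY KEY, note text);

-- Session A: run a bulk insert and leave the transaction open.
BEGIN;
INSERT INTO mvcc_demo
SELECT g, 'bulk' FROM generate_series(1, 80000) AS g;

-- Session B, while A's transaction is still open:
SELECT count(*) FROM mvcc_demo;                      -- does not wait; A's uncommitted rows are not counted
INSERT INTO mvcc_demo VALUES (100001, 'concurrent'); -- does not wait either

-- Either session: inspect the table-level locks currently held.
SELECT pid, mode, granted
FROM pg_locks
WHERE relation = 'mvcc_demo'::regclass;

-- Session A:
COMMIT;

-- Session B: A's rows are visible from this point on.
SELECT count(*) FROM mvcc_demo;
```

While session A is open, the `pg_locks` query should list a `RowExclusiveLock` for it. That is the table-level mode an INSERT takes, and it conflicts neither with the `AccessShareLock` taken by a plain SELECT nor with another session's `RowExclusiveLock`.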

2 Comments

Thank you! Quick confirmation, though. Reading and writing won't block each other, but for example, say this query takes 5 seconds to run. During that period will other inserts into table 2 be delayed?
@ssb: No, they won't. Because the effects of an INSERT are hidden from other sessions by the MVCC visibility rules, it doesn't need to lock anyone out, and consequently, will never be stuck waiting to acquire a lock. There is presumably a very brief critical section while it reserves a physical storage location for a new row, but nothing which is held for the duration of the statement. (The one exception to this is if two inserts try to put the same value in a UNIQUE column; in this case, the second statement will block until the first either commits or rolls back.)
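
The UNIQUE exception can be reproduced in the same way. Again, this is a rough sketch assuming a throwaway table (`unique_demo` is a hypothetical name) and two psql sessions on a test database:

```sql
-- Setup (either session). unique_demo is a hypothetical scratch table.
CREATE TABLE unique_demo (id integer PRIMARY KEY);

-- Session A: insert a key and leave the transaction open.
BEGIN;
INSERT INTO unique_demo VALUES (1);

-- Session B:
INSERT INTO unique_demo VALUES (2);  -- different key: does not wait
INSERT INTO unique_demo VALUES (1);  -- same key: waits for session A to finish

-- Session A: end the transaction one way or the other.
COMMIT;      -- session B's waiting INSERT then fails with a duplicate key error
-- ROLLBACK; -- alternatively: session B's waiting INSERT then succeeds
```

So the wait only affects a session that tries to insert a key which another, still-open transaction has already inserted; inserts of other keys are not held up.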