1

Lets say I have a table called users, who looks something like the following:

users table:

|user_id | name | 
   1        Bob   
   2        Alice  
   3        Joe    
   4        Tom   

And another table called books, who got a "users" columns of array of integers relates to users ids who own the books (type: ARRAY[]::integer[])

books:

name     num_pages     users 
mybook1  130           {4,2,1,44}
mybook2  140           {1,2}
mybook3  145           {5,7,8,10}

unfortunately, some of the users has been deleted, and now I have to remove their user id reference from the table.

My question is - Which query should I run in order to remove from the books table the user_ids who are not exists anymore in "users" table? In this case I would like to remove any user id who is not 1/2/3/4 from the users column

for example on "books" table, mybook1 user id 44 does not exist anymore, or mybook3 ids 5,7,8,10 does not exist in users table

(I've created a trigger to handle it from now on when removing users but how can I remove the current "orphan users" who is not reflected in the users table anymore)

note: I have no foreign keys or something like that in the books table

Thanks a lot

3
  • 1
    This would be a simple DELETE statement if that was properly normalized Commented Apr 22, 2019 at 16:04
  • Use a joing table (or associative entity and get rid of arrays. They are a non-relational construct. I often think they're added to relational databases as a kind of "me-too" part of a feature list, rather than a real need - i.e. a need which couldn't have been dealt with relationally. Commented Apr 22, 2019 at 20:16
  • 1
    Because of software application architecture problems , I cant use a joing table or stuff like that.. I just simplified the problem for the question in the current situation Commented Apr 22, 2019 at 21:02

2 Answers 2

1

This should perform nicely:

UPDATE books AS b
SET    users = bu.users
FROM  (
   SELECT b1.name, u.users
   FROM   books b1
   CROSS  JOIN LATERAL (
      SELECT ARRAY(
         SELECT user_id
         FROM   unnest(b1.users) WITH ORDINALITY u(user_id, ord)
         JOIN   users u USING (user_id)
         ORDER  BY u.ord
         ) 
      ) u(users)
   ) bu
WHERE   b.name = bu.name
AND     b.users <> bu.users;

Preserves original order of array elements. If you don't need to, drop ORDINALITY & friends from the query.

Can result in empty arrays.

Assuming ...

  • books.name is the PK. Else use some other UNIQUE column from books instead.
  • users.user_id is at least UNIQUE.
  • books.users does not contain NULL values. Else those are eliminated as well.
0
CREATE OR REPLACE FUNCTION intersection(anyarray, anyarray) RETURNS anyarray as $$
                    SELECT ARRAY(
                    SELECT $1[i]
                    FROM generate_series( array_lower($1, 1), array_upper($1, 1) ) i
                    WHERE ARRAY[$1[i]] && $2
                    );
                    $$ language sql;

DO
            $BODY$
                DECLARE
                    user_ids INTEGER[];
                    row books%ROWTYPE;
                BEGIN
                    user_ids := (SELECT array_agg(id) FROM users) ;

                    FOR row IN
                        SELECT * FROM books
                    LOOP
                        UPDATE books
                        SET users=intersection(row.users,user_ids) 
                        WHERE id = row.id;
                    END LOOP;
                END;
            $BODY$ language plpgsql

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.