2

I’m building an anonymous chat app for learning purpose where users are matched based on gender preferences. The application allows users to go online and be matched with another online user if their gender preferences match. However, I'm encountering an issue where users can end up in circular matches involving three or more users when they go online simultaneously.

I’m using PostgreSQL with Prisma and Node.js (Express) for this application. Here’s the current setup:

Database Table:

model User {
  id                 String   @id @default(uuid())
  currentChatPartner String?
  gender             String?
  preferGender       String?
  lastMatch          String?
  currentCountry     String?
  ageRange           String?
  createdAt          DateTime @default(now())
  report             Int      @default(0)
  online             Boolean?
}

Matching Algorithm:

async findMatch(id: string): Promise<User[] | null> {
  return await prisma.$transaction(async (tx) => {
    // Lock the current user
    const [user] = await tx.$queryRaw<User[]>`
      SELECT * FROM "User"
      WHERE "id" = ${id}
      FOR UPDATE
    `;

    // Ensure the user exists
    if (!user) throw new Error("User not found");

    // Lock the potential match
    const [match] = await tx.$queryRaw<User[]>`
      SELECT * FROM "User"
      WHERE "online" = true
        AND "gender" = ${user.preferGender} 
        AND "id" != ${id}
        AND ("lastMatch" != ${id} OR "lastMatch" IS NULL)
        AND "currentChatPartner" IS NULL
      FOR UPDATE
      LIMIT 1
    `;

    if (match) {
      // Check if either user is already engaged in another chat
      if (user.currentChatPartner || match.currentChatPartner) {
        throw new Error("User is already engaged in another chat");
      }

      // Update the current user
      await tx.user.update({
        where: { id },
        data: {
          lastMatch: match.id,
          online: false,
          currentChatPartner: match.id,
        },
      });

      // Update the matched user
      await tx.user.update({
        where: { id: match.id },
        data: {
          lastMatch: id,
          online: false,
          currentChatPartner: id,
        },
      });

      return [user, match];
    } else {
      // If no match found, set the user to online again
      await tx.user.update({
        where: { id },
        data: {
          online: true,
        },
      });
      return null;
    }
  });
}

Issue Despite using row-level locking with FOR UPDATE, my application still encounters cases where three users are matched with each other in a circular fashion when they come online simultaneously. The FOR UPDATE lock doesn't seem to be preventing these circular matches effectively.

Questions

  1. Is there an issue with the current approach using FOR UPDATE for locking?
  2. How can I refine the matching algorithm to effectively prevent circular matches?
  3. Are there any additional strategies or best practices for handling this type of situation in a real-time chat application?

What I've Tried

  • Used FOR UPDATE to lock rows during the match process.
  • Added checks to ensure users are not already engaged in another chat.
1
  • Change FOR UPDATE for a FOR UPDATE SKIP LOCKED and you should be good. The fact that you lock the row only makes concurrent requests wait for it to be released, but they still use that same record later. SKIP LOCKED will make them skip to a different one instead. Alternatively, you can use NOWAIT to make them fail with an exception you can handle in your app.
    – Zegarek
    Commented Sep 8, 2024 at 12:17

1 Answer 1

0
  1. Is there an issue with the current approach using FOR UPDATE for locking?

Yes. If multiple concurrent clients get the same "potential match", they just queue up, hang and wait for everyone else to finish their transaction, then they go ahead and re-use the record. You're not locking them against the collision, only against using it all at the same time.

  1. How can I refine the matching algorithm to effectively prevent circular matches?

As per the comment: SKIP LOCKED and you should be fine. Alternatively, you can use NOWAIT to make them fail with an exception you can handle in your app.

    // Lock the current user //
    const [user] = await tx.$queryRaw<User[]>`
      SELECT * FROM "User"
      WHERE "id" = ${id}
      FOR UPDATE NOWAIT--fail here if the user is already locked
    `;
    // Here, add handling for the case when this user is already locked
    // Ensure the user exists
    if (!user) throw new Error("User not found");

    // Lock the potential match
    const [match] = await tx.$queryRaw<User[]>`
      SELECT * FROM "User"
      WHERE "online" = true
        AND "gender" = ${user.preferGender} 
        AND "id" != ${id}
        AND ("lastMatch" != ${id} OR "lastMatch" IS NULL)
        AND "currentChatPartner" IS NULL
      FOR UPDATE 
      SKIP LOCKED --skip to the nearest unlocked instead of waiting for this one
      LIMIT 1
    `;

You can reproduce your problem at db<>fiddle using dblink to emulate a concurrent client:

BEGIN TRANSACTION;

SELECT * FROM "User"
WHERE "id" = 'c63416ba-0031-4f29-b403-ae99f0ba25a6'
FOR UPDATE NOWAIT;

SELECT * FROM "User"
WHERE "online" = true
AND "gender" = 'undisclosed'
AND "id" != 'c63416ba-0031-4f29-b403-ae99f0ba25a6'
AND ("lastMatch" != 'c63416ba-0031-4f29-b403-ae99f0ba25a6' 
      OR "lastMatch" IS NULL)
AND "currentChatPartner" IS NULL
FOR UPDATE 
--SKIP LOCKED
LIMIT 1;
--all this is taking place while our transaction remains open
--dblink issues its queries from another, concurrent client
select dblink_connect('another_client','');
select dblink_send_query('another_client',
$s$SELECT "User" FROM "User"
   WHERE "online" = true
     AND "gender" = 'undisclosed'
     AND "id" != 'c24c877f-d1ff-4b93-bb99-cda86403384f'--different user
     AND ("lastMatch" != 'c24c877f-d1ff-4b93-bb99-cda86403384f' 
          OR "lastMatch" IS NULL)
     AND "currentChatPartner" IS NULL
   FOR UPDATE 
   --SKIP LOCKED
   LIMIT 1;
$s$);
select pg_sleep(1);

COMMIT TRANSACTION;

This happened from the perspective of user c63: they got their own record and locked it

id currentChatPartner gender preferGender lastMatch currentCountry ageRange createdAt report online
c63416ba-0031-4f29-b403-ae99f0ba25a6 null undisclosed undisclosed null CH 20-35 2024-09-09 16:55:08.265834+00 0 t

They got a potential match and locked their record too:

id currentChatPartner gender preferGender lastMatch currentCountry ageRange createdAt report online
3eb3688b-51ad-4cf3-8c18-1025ed102df7 null undisclosed undisclosed null CH 20-35 2024-09-09 16:55:08.265834+00 0 t

The transaction is now committed. You can check that the concurrent request from user c24 got served a potential match of user c63 even though it was locked during the time of the request and it already got matched with 3eb:

select (_.a).* from dblink_get_result('another_client',false)_(a "User");
id currentChatPartner gender preferGender lastMatch currentCountry ageRange createdAt report online
c63416ba-0031-4f29-b403-ae99f0ba25a6 null undisclosed undisclosed null CH 20-35 2024-09-09 16:55:08.265834+00 0 t

This would end up pairing c63 with both 3eb and c24. Without SKIP LOCKED:

  1. They are free to select a match who is currently picking someone else.
  2. They are also free to select a match who is currently being picked by someone else.
  3. Concurrent users that hit a duplicate match hang and wait the entire time the colliding selection is taking place.

No longer happening after you uncomment SKIP LOCKED. I've added pg_sleep(1) because otherwise dblink might not have enough time to run the async request before transaction finishes and the locks are released. In real life scenario I assume you actually update both Users to pair them up so that they no longer meet the potential match criteria.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.