
We have a system that maintains multiple counters. Each incoming request performs an increment or decrement (delta) on one or more counters.

Incoming throughput: ~500 requests per second

For correctness and auditing, every request -

1. First writes a log record representing the delta.
2. Then updates the corresponding counter(s).

Both operations currently happen in a single database transaction.

However, many or all requests may target the same counter. This causes high contention, as counter updates serialise on row-level locks, and latency can increase significantly under load.

Approach - Instead of updating counters on every request, a single-threaded background rollup worker updates them by aggregating the last n log records -

1. Scans n log records since a monotonic checkpoint (could be the DB primary key).
2. Aggregates the deltas in memory.
3. Updates the counters and advances the checkpoint to the last scanned record's primary key in one DB transaction.
4. This process runs continuously in a loop (see the sketch below).
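
A minimal sketch of one iteration as MySQL statements, assuming the table layouts shown below (log, checkpoint, counters), a unique key on counters.counter_id, and n = 1000; the worker would run this block in a loop:

START TRANSACTION;

-- 1. Pick the batch boundary: up to n = 1000 records past the checkpoint.
SELECT checkpoint_id INTO @ckpt FROM checkpoint WHERE id = 1;
SELECT COALESCE(MAX(id), @ckpt) INTO @upto
  FROM (SELECT id FROM log WHERE id > @ckpt ORDER BY id LIMIT 1000) AS batch;

-- 2. and 3. Aggregate the deltas and apply them as one batched UPSERT.
INSERT INTO counters (counter_id, value)
SELECT counter_id, SUM(delta)
  FROM log
 WHERE id > @ckpt AND id <= @upto
 GROUP BY counter_id
ON DUPLICATE KEY UPDATE value = value + VALUES(value);

-- Advance the checkpoint in the same transaction, so each batch is
-- applied exactly once even if the worker crashes and restarts.
UPDATE checkpoint SET checkpoint_id = @upto WHERE id = 1;

COMMIT;

One caveat: with concurrent writers, rows can become visible out of id order, so a production version needs care when choosing the upper bound (e.g. staying below the smallest uncommitted id) to avoid skipping late-committing records.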

Log table -

id  counter_id  delta
1   abc         +3
2   abc         -1
3   vr          +50
4   tt          -1
5   vr          +20

Checkpoint table - contains a single row here

id  checkpoint_id
1   3             // means log records up to id = 3 have been scanned

Counter table -

id  counter_id  value
1   abc         2
2   vr          70
3   tt          -1

Configuration -

MySQL 8.0 - Single primary instance
~8 vCPUs
~32 GB RAM
Logs table indexed by monotonic primary key
Counters updated via batched UPSERTs

Can a single background worker reliably keep up with 500 RPS when rolling up log-based counter updates? And if we have to run multiple workers for better speed/availability, how is exclusive processing typically enforced to avoid double-counting? That is, with a fixed set of workers processing logs and aggregating them into counters, is there a safe way to ensure that no two workers ever pick the same counter to process, since that might lead to double counting?

I have tried solving for the multiple-workers approach for aggregation - Avoiding deadlocks while rolling up log-based counters with multiple background workers (MySQL)

  • Have you considered eliminating the counter table and replacing it with a view over the log table? The value of keeping a running total is often overrated, especially when calculating it is a simple addition. Commented Dec 29, 2025 at 21:34
  • Did not get you @JimmyJames, can you explain what a view over the log table means here? Commented Dec 30, 2025 at 8:54
  • This question is the subject of a discussion on meta. Commented Jan 4 at 6:42
  • @here I have tried solving for the multiple-workers approach for aggregation - softwareengineering.stackexchange.com/questions/460659/… Commented Jan 4 at 8:19

4 Answers


One technique to massively reduce contention is to have multiple counters. Instead of one counter you have, say, 10. The “official” counter is the sum of those ten counters. To update it, pick one of the ten counters at random. Contention is now much lower.
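
For illustration, here is a minimal SQL sketch of the idea, assuming a hypothetical counter_shards(counter_id, shard, value) table with a unique key on (counter_id, shard) and ten pre-created shard rows per counter:

-- Pick one of the ten shard rows at random, then update only that row;
-- concurrent writers mostly land on different rows, so they rarely block.
SET @shard = FLOOR(RAND() * 10);
UPDATE counter_shards
   SET value = value + 1
 WHERE counter_id = 'abc' AND shard = @shard;

-- The "official" counter value is the sum over its shards.
SELECT SUM(value) FROM counter_shards WHERE counter_id = 'abc';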

  • Yeah, this makes sense, but the question is more about: if we can't go with the distributed-counter approach here, how can we do safe aggregation in the background to derive counter values using multiple workers? Commented Jan 4 at 8:31

Is a single worker thread sufficient to keep up with ~500 RPS in real-world systems?

There is no way we could know: there are absolutely no details to test it against. Maybe you're running the thing on a Raspberry Pi 3, with the database file on an old USB flash drive. It should probably still work fine, but who knows.

There is, however, a definitive way for you to answer your own question.

Do the test.

how can this be safely scaled to multiple workers without double-counting (i.e., ensuring workers don’t process the same log records)?

Sharding. Or communication between workers. Or a message queue system.

But before considering those approaches, test if the most basic system works. What happens when you process your 500 requests by changing the counters in database?

That's literally a few dozen lines of code to write.

I believe we should still run multiple (3-4) workers for availability while guaranteeing that only one worker updates counters and the checkpoint at a time. How do we ensure this safely?

Define safely.

Sooner or later, you will face the situation where you'll have to choose between:

  1. Not receiving a message.
  2. Receiving the same message multiple times.

Depending on the business case, it may make sense to lose messages. The counter doesn't get changed; that's it.

Or maybe it is critical to have the counter changed no matter what. In which case, you'll have to implement an additional feature where the requester would continue sending the same message until the server acknowledges it received it. This means that you'll probably need to de-duplicate messages, for instance by storing their identifiers in the database to know whether you already processed them or not.
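
As a sketch, that de-duplication can lean on a unique key. Assuming a hypothetical message_id that the requester sends with every message, added to the operations table from the test below:

-- One-time schema change: re-delivered messages now hit the unique key.
ALTER TABLE operations
  ADD COLUMN message_id CHAR(36),
  ADD UNIQUE KEY uq_message_id (message_id);

-- INSERT IGNORE turns a duplicate delivery into a no-op; ROW_COUNT()
-- returns 1 for a new message and 0 for a duplicate, so the counter
-- update is applied only when the insert actually inserted a row.
INSERT IGNORE INTO operations (message_id, counter_id, increment)
VALUES ('b6f3a2e0-0000-4000-8000-000000000001', 1, 1);
SELECT ROW_COUNT();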

Actual test

Following the advice from above, here's what you get.

The server has two routes:

import flask

app = flask.Flask(__name__)

@app.route('/<int:counter>/increment', methods=['POST'])
def increment(counter: int) -> flask.Response:
    ...

@app.route('/<int:counter>/decrement', methods=['POST'])
def decrement(counter: int) -> flask.Response:
    ...

Each route executes two SQL queries. The first one adds an entry in the log table:

insert into operations
(counter_id, original_value, increment)
values (%(id)s, (select value from counters where id = %(id)s), %(increment)s)

The second one adjusts the actual counter:

update counters set value = value + %(delta)s where id = %(id)s

As they run in a transaction, there would be no inconsistencies between the two tables if something goes wrong.
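
As an illustration of that bracketing (a sketch, not the author's actual wrapper code, which isn't shown), the equivalent plain SQL for one increment would be:

START TRANSACTION;

-- Log the operation, capturing the counter's value before the change.
INSERT INTO operations (counter_id, original_value, increment)
VALUES (1, (SELECT value FROM counters WHERE id = 1), 1);

-- Apply the change; both statements commit or roll back together.
UPDATE counters SET value = value + 1 WHERE id = 1;

COMMIT;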

On the client side, a simple script creates a given number of changes to run (generate_changes). A change affects one of five counters, weighted so that the first counter is picked in most cases, to test the “high contention” scenario you mentioned. Here's the distribution I picked:

indexes = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 3, 4, 4]

Each change either increments or decrements a counter, given that I arbitrarily decided that counters will never go below zero.

The client application starts by clearing all the data, then runs the changes, in parallel. The duration of this second action is measured and reported:

import multiprocessing
import time

import requests

def process_change(c: Change):
    action = 'increment' if c.increment else 'decrement'
    uri = f'http://localhost:7492/{c.counter}/{action}'
    requests.post(uri).raise_for_status()

start_time = time.time()
changes = generate_changes(count_changes)
arguments = ((c,) for c in changes)

with multiprocessing.Pool() as pool:
    pool.starmap(process_change, arguments)

end_time = time.time()
print(f'Ran {count_changes} changes in {(end_time - start_time):.3f} s.')

The overall thing takes about thirty minutes to draft (or less: I was being particularly slow on this one).

With an Intel i5-10600 3.30 GHz CPU with a Force MP510 SSD, PostgreSQL 15 and Python 3.13.5, running 2,000 changes takes 2.27 s. That's about 880 changes per second. When I change the test to perform 10,000 or 50,000 changes, I measure the very same 880 changes/s.

If I don't reset the data between runs, i.e. let it accumulate, the speed seems to remain the same. I haven't tested what happens when the database contains millions of rows.

During the runs, all CPU cores are used at around 65%. Here's a view of two consecutive runs:

[screenshot: CPU usage across all cores during two consecutive runs]

Although I haven't checked it, my impression is that the actual CPU load comes from the client script, not the server or the database. As a matter of fact, removing parallelism on the client side makes the process take ten times as long as its parallel variant.

Conclusion:

A naive implementation, with no optimizations, can easily reach a rate of 880 changes/second to the counters on “ordinary” hardware, and the actual rate may well be higher, since the bottleneck in my test may end up being on the client side.

The implementation being so basic, it is easy to implement as a proof of concept, check the actual performance on production hardware, and, if needed, act accordingly by using one of the most common approaches:

  • Distributing the work to multiple servers (horizontal scalability).
  • Moving to better hardware (vertical scalability).
  • Optimizing.
  • Added some more clarification to the question. Commented Dec 26, 2025 at 21:48
  • @tusharRawat: clarification you added, but still not the test I suggested you do. I edited my answer to include the actual test. Enjoy. Commented Dec 31, 2025 at 21:11
  • Thanks for sharing the test results. However, we are not scanning the logs here; what the test shows is that if each request performs both steps (logging + incrementing/decrementing the counter) in a transaction, throughput comes out to be 800 ops/sec. Is that what we tested here? Meaning the average latency for each request would be around 1000/800 = 1.25 ms, is that true? Commented Jan 1 at 8:39
  • @tusharRawat: the test does only the two steps you listed in your question, that is, (1) writing a log record, and (2) updating the counter. Commented Jan 1 at 9:06
  • @tusharRawat: no, this is your “let's fix the hypothetical high contention by introducing scanning and aggregating.” My original point is that instead of doing premature optimization for some unchecked hypotheses, you can start by trying the naive approach first, and, in this particular case, find out that it works pretty well—just like my test shows. Commented Jan 1 at 9:20

Is a single worker thread sufficient to keep up with ~500 RPS in real-world systems?

There will always be a way to make the answer to this be, “no”.

Given that, the thing to care about is finding a way to make things better.

You’ve already hit on the idea of using aggregation to reduce query overhead. Your next idea is parallelizing the aggregation.

Let’s say you’re a bank. You’re processing deposits and withdrawals. Your aggregator is just someone grabbing a stack of them.

The easy way to parallelize the aggregation is to make different stacks out of the ones that act on different accounts.

Aggregators - and their territory

id | range
1  | a-g
2  | h-m
3  | n-t
4  | u-z

With territory worked out ahead of time you avoid all contention on the log table.

Log table -

id | counter_id | delta

1  | abc        | +3
2  | abc        | -1
3  | vr         | +50 
4  | tt         | -1 
5  | vr         | +20 

Assignments - Aggregators can independently scan the log

Aggregator id | Log id
           1  | 1 
           1  | 2
           3  | 4
           4  | 3
           4  | 5

Done this way, you don't actually have to shard the DB. Your aggregating workers can stake out responsibilities in the log ahead of time and stay out of each other's way. They don't need to communicate with each other at all, though they will need to adjust their territory when the number of workers changes (a sketch follows below).
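
As a sketch, each worker can bake its territory directly into its scan query, with one checkpoint per aggregator (hypothetical names, following the tables above):

-- Worker 1 only ever aggregates counters in its territory (a-g), so it
-- can never pick the same counter as workers 2-4. @my_checkpoint is
-- this worker's own checkpoint, advanced in the same transaction as
-- its counter updates, exactly as in the single-worker case.
SELECT counter_id, SUM(delta)
  FROM log
 WHERE id > @my_checkpoint
   AND counter_id >= 'a' AND counter_id < 'h'
 GROUP BY counter_id;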

One school of thought has you bind the thread to the resource. So if your CPU had 4 cores you'd dedicate 4 threads to reading the log. However, what the OS is up to with other running programs at the time can impact how optimal this is.

But before doing that, do yourself a favor and actually look at some of these logs. That is likely more important than these numbers you’re throwing around.

  • Correct. The challenging part we are trying to figure out is making sure that if, at any point in time, different workers are aggregating from the same table, they don't pick the same counters; even if they do, only one worker should be able to proceed for that turn, and the others that picked the same counter should be rejected and retry. We want strong guarantees that different aggregation workers work on disjoint sets. Commented Dec 27, 2025 at 8:29
  • Added the log table structure in the question. Commented Dec 27, 2025 at 17:39
  • @tusharRawat Added some tables myself. Commented Dec 30, 2025 at 13:43
  • Hmm, makes sense. We will probably also need some checkpoint on our Aggregators table, right? Also, if we are lagging behind we can increase the number of workers, but that is still something we would have to do carefully: we are doing inc/dec operations here, which are not idempotent, so two workers aggregating the same counter at the same time would be wrong. Commented Dec 30, 2025 at 19:20
  • @tusharRawat The whole point of marking out territory for the workers ahead of time is to entirely prevent the possibility of 2 workers aggregating the same counter. The need for an atomic operation has been narrowed down to when the number of workers change, and with it their ranges of territory. You must do that safely. But you don't need to do that often. Commented Dec 31, 2025 at 1:01

This is a bit of a "frame challenge", meaning that it's not a direct answer to your question as written but offers a different approach to a working solution.

You say you need a counter stored as a row in a DB, but if we step back for a moment, I think what you really need is a way to see the current total for a category/counter_id. When we reframe the problem that way, we can see that storing a total in a table is not a necessary step. Calculating the total is necessary but storing that value is not.

At any point in time, if we need to know the most current total for the 'abc' counter, we can run a query like this:

SELECT sum(delta) 
  FROM counter_log
 WHERE counter_id = 'abc'

This will give you the most up-to-date total possible for the 'abc' counter given the current state of the database. It will never be 'stale'. There aren't a lot of ways to get the wrong answer with this approach, either.

The problem most people think this creates is that your query is more expensive to run. This is true, but in all likelihood you are going to be retrieving these totals much less frequently than you are writing deltas. Throw in the fact that reads are much less costly than writes and that an RDBMS tends to be highly optimized for reads. Any decent database will allow reads to be done without locking the tables being read from. That is, there is low or no contention at all when you are reading, no matter how many other sessions are reading or writing to the tables being read from.

In contrast, writes to a table are costly. And as you correctly note, updates to existing rows from different connections may require locks to ensure correctness, and that leads to latency and other performance issues.

Inserts, on the other hand, require less coordination between processes. An insert-only approach, where there is no possibility for collisions between rows, is much less costly. There may be some contention around indexes, which I will come back to.

I am not saying that storing a precalculated total is always a bad idea, but you need to weigh the costs against the benefits. The main problem with this kind of thing is that any solution that performs well is likely to contain subtle defects. This could include errors in the total, possible deadlocks, and other thorny issues. Unless the calculation has a significant cost, it's almost never worth bothering with 'caching' the final result. You might want to do this as an archival strategy, however. For example, at the end of the day (week, month, etc.), you may want to write a final total off to a separate table and clean up the log table (a sketch follows below).
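
A minimal sketch of that archival step, assuming a hypothetical counter_totals(counter_id, total) table with a unique key on counter_id, and @cutoff set to the highest log id being folded:

START TRANSACTION;

-- Fold everything up to the cutoff into the stored totals...
INSERT INTO counter_totals (counter_id, total)
SELECT counter_id, SUM(delta)
  FROM counter_log
 WHERE id <= @cutoff
 GROUP BY counter_id
ON DUPLICATE KEY UPDATE total = total + VALUES(total);

-- ...then remove the folded rows in the same transaction, so the
-- totals and the remaining log never disagree.
DELETE FROM counter_log WHERE id <= @cutoff;

COMMIT;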

In your case, I would argue that you are optimizing the cheap (nearly free) part, querying the current total, while making the hard, costly part, writing the new records, slower, more expensive, and harder to get working. Querying the total is very straightforward and robust. Also, I would wager that your current approach will lead to more queries against the log table than if you just let the query run when the total is actually needed.

You don't technically need this but if you have dependencies on the existing counter table you can create a view with the same name:

CREATE VIEW counter AS
SELECT counter_id, sum(delta) AS value
  FROM counter_log
 GROUP BY counter_id

A common misconception about views is that they are more costly than a normal query. This isn't the case: the DB will literally rewrite the query and execute it the same way as if you had written it out explicitly. Views can be problematic if they hide costly queries. The one above is not terribly costly, but I won't completely dismiss that concern. If you are really worried about it, just use the explicit aggregate query as needed and skip the view.

With regard to indexes on your counter log: if you have high write volume, you may want to consider using a reverse index on your surrogate key (the serial integer). This kind of index minimizes contention when you have a monotonically increasing key, as in the case of a sequence or identity column.

You will want to make sure you index on the counter id in the log table. There's a small risk of contention on the index when two inserts are made on the same counter but that is unlikely to be noticeable or create any real issues in practice.
