
I'm using grpc.aio.server and I'm stuck on a problem: when I run a load test against my service, some requests lag for up to 10 seconds, even though the requests are identical. The load is stable (200 rps) and the latency of almost all requests is nearly the same. I'm OK with higher latency, as long as it's stable. I've tried googling things like "async task priority", because it looks to me as if something is wrong with task prioritization: either tasks keep waiting long after their work has finished, or the whole request task waits a long time before it even starts.

For example, 1000 requests are sent to the gRPC service. They all execute the same logic, use the same DB instance, run the same query against the DB, take the same time to get results from the DB, etc.; everything is the same. Yet I see that, say, the 10th request's latency is 10 seconds while the 13th request's latency is 5 seconds. I can also see in the logs that the DB queries have almost the same execution time.

Any suggestions? Maybe I'm misunderstanding something.

2 Answers


There are multiple reasons why this behaviour may happen. Here are a few things that you can take a look at:

  • What type of workload do you have? Is it I/O bound or CPU bound?

  • Does your code block the event loop at some point? Is the path for each request fully asynchronous? The docs state pretty clearly that blocking the event loop is costly (see the sketch after this list):

Blocking (CPU-bound) code should not be called directly. For example, if a function performs a CPU-intensive calculation for 1 second, all concurrent asyncio Tasks and IO operations would be delayed by 1 second.

  • What happens with the memory when you see those big latencies? You can run memory profiling using this tool and check the memory. There is a good chance you'll see a correlation between the latency spikes and intense activity of the Python memory manager as it tries to reclaim memory. Here is a nice article around memory management that you can check out.
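
To make the event-loop point concrete, here is a minimal, self-contained sketch of the difference between calling CPU-bound code directly in an async handler and offloading it to a worker thread. The heavy_computation function and the handlers are hypothetical stand-ins for whatever work your real servicer does, not code from the question:

    import asyncio
    import time

    # Hypothetical stand-in for the request handler's CPU-bound work.
    def heavy_computation(n: int) -> int:
        time.sleep(1)  # simulate ~1 second of blocking work
        return n * n

    async def handle_request_blocking(n: int) -> int:
        # BAD: runs on the event loop thread; every other in-flight
        # request stalls for the full second.
        return heavy_computation(n)

    async def handle_request_offloaded(n: int) -> int:
        # BETTER: run the blocking part in a worker thread so the
        # event loop stays free to serve other requests (Python 3.9+).
        return await asyncio.to_thread(heavy_computation, n)

    async def main() -> None:
        # Ten concurrent "requests": the blocking version takes ~10s
        # end-to-end, the offloaded version roughly ~1-2s.
        await asyncio.gather(
            *(handle_request_offloaded(i) for i in range(10))
        )

    asyncio.run(main())

Note that threads help here because the simulated work releases the GIL (as I/O and many C extensions do); pure-Python number crunching may need a process pool instead.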
  • That's the right answer! Sorry for the late response; I solved my problem within a week of asking this question. The main problem was CPU-bound blocking code. I'm now using asgiref.sync_to_async to run all CPU-bound code in a ThreadPoolExecutor, which sped up my code by 60%+. Commented Jan 10, 2021 at 23:44
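
For readers who land here, a minimal sketch of the pattern the comment describes, assuming a hypothetical score function in place of the real CPU-bound logic:

    from asgiref.sync import sync_to_async

    # Hypothetical stand-in for the real CPU-bound logic.
    def score(payload: list[int]) -> int:
        return sum(x * x for x in payload)

    async def handle(payload: list[int]) -> int:
        # thread_sensitive=False lets calls run in the event loop's
        # default ThreadPoolExecutor instead of being serialized onto
        # a single shared worker thread (asgiref's default).
        return await sync_to_async(score, thread_sensitive=False)(payload)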
  1. Every server will have some latency variation between requests. However, the scale should be much lower than what you're experiencing.
  2. Your question doesn't include the server initialization code, so we can't know what configuration is used. I would start by looking at the thread pool size for the server. According to the docs, the thread pool instance is a required argument, so try setting a different pool size (see the sketch below). My guess is that the threads are exhausted, and the latency then goes up because requests are waiting for a thread to free up.
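
As an illustration, a sketch of passing an explicit thread pool when creating the server. The pool size of 32 is an arbitrary starting point to tune, not a recommendation; note that the pool is required only for the sync grpc.server, while for grpc.aio it is the optional migration_thread_pool, used to run handlers that are not coroutines:

    from concurrent import futures
    import grpc

    # Sync API: the thread pool is a required positional argument; if
    # it is exhausted, new requests queue until a worker frees up.
    sync_server = grpc.server(futures.ThreadPoolExecutor(max_workers=32))

    # Async API: the pool is optional and only used to execute
    # non-async (blocking) handlers during migration to asyncio.
    aio_server = grpc.aio.server(
        migration_thread_pool=futures.ThreadPoolExecutor(max_workers=32)
    )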
