Showing posts with label sysbench. Show all posts
Showing posts with label sysbench. Show all posts

Friday, June 19, 2026

CPU-bound sysbench on a large server: Postgres 12 to 19 beta1

This has results from sysbench on a small server with Postgres versions 12 through 19 beta1. Sysbench is run with high concurrency (40 connections) and a cached database. The purpose is to search for changes in performance.

Postgres remains boring, it is hard to find performance regressions.

tl;dr for Postgres 17 to 19

  • there are no regressions
  • throughput on the read-only-count test improves by ~3X in 19 beta1 thanks to a better query plan

tl;dr for Postgres 12 to 19

  • there are few regressions, throughput might have dropped by up to 5% on a few range query tests
  • there are a few large improvements for read-only tests
  • there are many large improvements for write-heavy tests

Builds, configuration and hardware

I compiled Postgres from source for versions 12.22, 13.23, 14.23, 15.18, 16.14, 17.10, 18.4 and 19 beta1.

I used a 48-core server from Hetzner
  • an ax162s with an AMD EPYC 9454P 48-Core Processor with SMT disabled
  • 2 Intel D7-P5520 NVMe storage devices with RAID 1 (3.8T each) using ext4
  • 128G RAM
  • Ubuntu 24.04
Configuration files for Postgres:
  • the config file is named conf.diff.cx10a_c32r128 (x10a_c32r128) and is here for versions 12, 13, 14, 15, 16 and 17.
  • for Postgres 18 and 19 I used conf.diff.cx10b_c32r128 (x10b_c32r128) which is as close as possible to the Postgres 17 config and uses io_method=sync
Benchmark

I used sysbench and my usage is explained here. I now run 32 of the 42 microbenchmarks listed in that blog post. Most test only one type of SQL statement. Benchmarks are run with the database cached by Postgres.

The read-heavy microbenchmarks are run for 600 seconds and the write-heavy for 1200 seconds. The benchmark is run with 40 clients and 8 tables with 10M rows per table. The database is cached.

The purpose is to search for regressions from new CPU overhead and mutex contention. I use the small server with low concurrency to find regressions from new CPU overheads and then larger servers with high concurrency to find regressions from new CPU overheads and mutex contention.

The tests can be called microbenchmarks. They are very synthetic. But microbenchmarks also make it easy to understand which types of SQL statements have great or lousy performance. Performance testing benefits from a variety of workloads -- both more and less synthetic.

Results

The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. 

I provide charts below with relative QPS (rQPS). The relative QPS is the following:
(QPS for some version) / (QPS for base version)
When the relative QPS is > 1 then some version is faster than base version.  When it is < 1 then there might be a regression. Values from iostat and vmstat divided by QPS are also provided here. These can help to explain why something is faster or slower because it shows how much HW is used per request.

Here, base version is either Postgres 12.23 or 17.10 and some version is a more recent version. I use 12.23 as the base version to identify regressions over a long period of time. And then I use 17.10 as the base version to confirm there aren't recent, large regressions.

I describe performance changes (changes to relative QPS) in terms of basis points. Performance changes by one basis point when the difference in rQPS is 0.01. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

Results: point queries, version 17 to 19

Summary:
  • there are no regressins
Relative to: PG 17.10
col-1 : PG 18.4
col-2 : PG 19 beta1

col-1   col-2
1.00    0.99    hot-points
1.01    1.01    point-query
1.00    1.00    points-covered-pk
0.98    0.99    points-covered-si
1.01    1.00    points-notcovered-pk
1.00    1.00    points-notcovered-si
1.01    1.01    random-points_range=10
1.02    1.00    random-points_range=100
1.00    1.00    random-points_range=1000

Results: point queries, version 12 to 19

Summary
  • there are no regressions
  • throughput for the hot-points test improves by ~2X in versions 17.10, 18.4 and 19beta
Relative to: PG 12.22
col-1 : PG 13.23
col-2 : PG 14.23
col-3 : PG 15.18
col-4 : PG 16.14
col-5 : PG 17.10
col-6 : PG 18.4
col-7 : PG 19 beta1

col-1   col-2   col-3   col-4   col-5   col-6   col-7
1.00    0.90    0.97    1.03    2.34    2.35    2.31    hot-points
1.00    1.01    1.03    1.04    1.03    1.04    1.03    point-query
1.02    1.04    1.04    1.07    1.04    1.04    1.04    points-covered-pk
1.01    1.07    1.04    1.04    1.04    1.03    1.04    points-covered-si
0.98    1.01    1.03    1.02    1.00    1.01    1.00    points-notcovered-pk
0.99    1.03    1.03    1.01    1.02    1.02    1.01    points-notcovered-si
0.99    1.01    1.03    1.03    1.00    1.01    1.01    random-points_range=10
0.99    1.02    1.04    1.04    1.01    1.03    1.01    random-points_range=100
1.00    1.02    1.02    1.03    1.01    1.02    1.01    random-points_range=1000

Results: range queries without aggregation, version 17 to 19

Summary
  • there are no regressions
  • while 19 beta1 has a better result on the scan test, that test has more variance with Postgres so I am reluctant to judge this without more results
Relative to: PG 17.10
col-1 : PG 18.4
col-2 : PG 19 beta1

col-1   col-2
0.98    0.99    range-covered-pk
0.97    0.99    range-covered-si
0.99    0.99    range-notcovered-pk
1.02    1.01    range-notcovered-si
0.96    1.07    scan

Results: range queries without aggregation, version 12 to 19

Summary
  • there are no regressions
  • scan throughput has improved a lot from version 12 to 19
Relative to: PG 12.22
col-1 : PG 13.23
col-2 : PG 14.23
col-3 : PG 15.18
col-4 : PG 16.14
col-5 : PG 17.10
col-6 : PG 18.4
col-7 : PG 19 beta1

col-1   col-2   col-3   col-4   col-5   col-6   col-7
0.99    1.03    1.04    1.04    1.03    1.00    1.02    range-covered-pk
0.99    1.04    1.04    1.04    1.03    1.00    1.03    range-covered-si
1.00    1.00    1.00    0.99    1.00    0.99    0.99    range-notcovered-pk
1.00    1.01    1.01    0.99    1.00    1.02    1.01    range-notcovered-si
1.09    1.27    1.10    1.21    1.19    1.14    1.28    scan

Results: range queries with aggregation, version 17 to 19

Summary
  • there are no regressions
  • throughput on the read-only-count test is ~3X better thanks to a new query plan. This improvement was also visible on my small server
Relative to: PG 17.10
col-1 : PG 18.4
col-2 : PG 19 beta1

col-1   col-2
1.03    3.30    read-only-count
1.02    0.99    read-only-distinct
1.00    0.97    read-only-order
0.99    0.99    read-only_range=10
0.99    0.99    read-only_range=100
1.01    1.00    read-only_range=10000
1.03    1.01    read-only-simple
1.03    1.01    read-only-sum

Results: range queries with aggregation, version 12 to 19

Summary
  • there might be a few small regressions, but losing 5% throughput from version 12 to 19 isn't a big deal
  • throughput on the read-only-count test is ~3X better thanks to a new query plan. This improvement was also visible on my small server
Relative to: PG 12.22
col-1 : PG 13.23
col-2 : PG 14.23
col-3 : PG 15.18
col-4 : PG 16.14
col-5 : PG 17.10
col-6 : PG 18.4
col-7 : PG 19 beta1

col-1   col-2   col-3   col-4   col-5   col-6   col-7
1.01    0.95    0.96    0.97    0.93    0.95    3.06    read-only-count
1.00    0.98    0.98    0.98    0.96    0.98    0.95    read-only-distinct
1.00    0.98    0.98    1.00    0.99    0.99    0.97    read-only-order
0.99    1.00    1.01    1.00    1.01    0.99    1.00    read-only_range=10
0.99    1.00    1.00    1.00    1.01    1.00    0.99    read-only_range=100
1.00    0.97    1.02    1.03    1.04    1.05    1.03    read-only_range=10000
1.00    0.97    0.99    0.97    0.95    0.98    0.96    read-only-simple
1.00    0.96    0.97    0.97    0.94    0.97    0.95    read-only-sum

Results: writes, version 17 to 19

Summary
  • there are no regressions
Relative to: PG 17.10
col-1 : PG 18.4
col-2 : PG 19 beta1

col-1   col-2
0.99    0.99    delete
1.02    1.02    insert
1.00    0.98    read-write_range=10
0.99    0.99    read-write_range=100
1.01    1.03    update-index
1.01    0.98    update-inlist
0.98    1.01    update-nonindex
1.01    1.03    update-one
1.00    1.00    update-zipf
0.97    0.99    write-only

Results: writes, version 12 to 19

Summary
  • there are no regressions
  • many large improvements arrived in version 17 and remain in 19 beta1
Relative to: PG 12.22
col-1 : PG 13.23
col-2 : PG 14.23
col-3 : PG 15.18
col-4 : PG 16.14
col-5 : PG 17.10
col-6 : PG 18.4
col-7 : PG 19 beta1

col-1   col-2   col-3   col-4   col-5   col-6   col-7
0.99    1.11    1.13    1.10    1.28    1.27    1.27    delete
1.02    1.17    1.16    1.19    1.23    1.25    1.25    insert
1.00    1.20    1.22    1.20    1.24    1.24    1.22    read-write_range=10
0.99    1.04    1.05    1.04    1.06    1.05    1.04    read-write_range=100
0.98    1.08    1.05    0.94    1.84    1.85    1.90    update-index
1.00    1.07    1.06    1.05    1.12    1.13    1.10    update-inlist
1.01    1.07    1.07    0.86    1.87    1.84    1.88    update-nonindex
1.04    0.96    0.96    1.10    1.39    1.41    1.43    update-one
1.01    1.05    1.07    0.96    1.63    1.62    1.63    update-zipf
0.99    1.11    1.13    1.09    1.41    1.37    1.40    write-only

Thursday, June 11, 2026

Write-heavy sysbench tests, a large server, modern Postgres and MySQL

This has results for modern Postgres and MySQL using write-heavy tests from sysbench and a large server. I think there are regressions in Postgres that arrive in some of versions 16, 17, 18 and 19 beta1 but I am far from certain and this blog post is just another step in my journey to figure that out.

tl;dr

  • Postgres suffers a lot from throughput variation while MySQL+InnoDB does not
  • InnoDB gets much better average throughput on 6 of 10 tests, similar throughput one one and then Postgres does better on 3 of 10 tests
  • For tests from which I provided vmstat and iostat results, Postgres does more write IO per operation. In some cases InnoDB uses more CPU, in other cases it does not.

Builds, configuration and hardware

I compiled:
  • Postgres from source for versions 15.17, 16.13, 17.9 and 18.3.
  • MySQL from source for version 8.4.7
I used a 48-core server from Hetzner
  • an ax162s with an AMD EPYC 9454P 48-Core Processor with SMT disabled
  • 2 Intel D7-P5520 NVMe storage devices with RAID 1 (3.8T each) using ext4
  • 128G RAM
  • Ubuntu 24.04
Configuration files for Postgres:
  • the config file is named conf.diff.cx10a_c32r128 (x10a_c32r128) and is here for versions 15, 16 and 17.
  • for Postgres 18 I used conf.diff.cx10b_c32r128 (x10b_c32r128) which is as close as possible to the Postgres 17 config and uses io_method=sync
Benchmark

I used sysbench and my usage is explained here. Normally I run 32 of the 42 microbenchmarks listed in that blog post using tables small enough to be cached by the DBMS. Most test only one type of SQL statement.

The tests can be called microbenchmarks. They are very synthetic. But microbenchmarks also make it easy to understand which types of SQL statements have great or lousy performance. Performance testing benefits from a variety of workloads -- both more and less synthetic.

But I did things differently here:
  • I only run the write-heavy tests (to save time)
  • The tables are larger than memory and cannot be cached
  • Each test (microbenchmark) is run for 2 hours when I normally run each for 15 minutes
  • After each test a vacuum is done
The purpose is to search for regressions from new CPU overhead and mutex contention related to MVCC GC (vacuum for Postgres, purge for InnoDB).

Results

I provide charts below with relative QPS. The relative QPS is the following:
(QPS for some version) / (QPS for Postgres 15.17)
When the relative QPS is > 1 then some version is faster than base version.  When it is < 1 then there might be a regression. When the relative QPS is 1.2 then some version is about 20% faster than base version.

The per-test results from vmstat and iostat can help to explain why something is faster or slower because it shows how much HW is used per request, including CPU overhead per operation (cpu/o) and context switches per operation (cs/o) which are often a proxy for mutex contention.

Results: writes

The table below has relative QPS for Postgres 16 to 19 and then InnoDB all relative to the throughput for Postgres 15.17. Columns 1 to 4 have results for Postgres and the numbers in yellow highlight the tests where there is a regression in Postgres. For column 5 (MySQL with InnoDB) the numbers in yellow and red indicate tests where InnoDB's throughput is less than Postgres. And then the numbers in green indicate tests where InnoDB's throughput is much larger than Postgres.

Note that when relative QPS (rQPS) is 0.90 then throughput dropped by ~10%.

Summary:
  • throughput for Postgres drops after version 15.17. I don't know yet whether this is a regression.
  • throughput for InnoDB is much better than Postgres in 6 of 10 tests, similar in one test, and much worse in 3 of 10 tests.
The sections that follow this one have more detail on results from the update-index, update-zipf tests and insert tests.

Relative to: Postgres 15.17
col-1 : Postgres 16.13
col-2 : Postgres 17.9
col-3 : Postgres 18.3
col-4 : Postgres 19 beta1
col-5 : MySQL 8.4.7

col-1   col-2   col-3   col-4   col-5
0.94    0.97    0.98    1.02    1.88    update-inlist
0.94    0.90    0.88    0.92    1.43    update-index
0.91    0.86    0.87    0.92    1.19    update-nonindex
0.96    0.99    0.98    0.98    0.71    update-one
0.92    0.83    0.81    0.85    0.93    update-zipf
0.95    0.93    0.84    0.81    1.71    write-only
0.94    0.94    0.90    0.92    1.14    read-write_range=10
0.95    0.96    0.95    0.95    1.93    read-write_range=100
0.89    0.82    0.80    0.84    1.01    delete
1.05    1.05    1.01    1.10    0.53    insert

Results: update-index

Summary:
  • Postgres suffers from too much variance
  • Average throughput is ~1.55X larger for InnoDB than for Postgres
  • Per operation, Postgres does ~1.20X more write IO (KB written) to storage than InnoDB
  • Per operation, InnoDB uses more CPU and does more context switches. While autovacuum was enabled and was likely running during the test, my measurements exclude the manual vacuum done at the end of each test.
iostat, vmstat normalized by operation rate
r/s     rMB/s   w/s     wMB/s   r/o     rKB/o   wKB/o   o/s     dbms
35503.0 373.7   58795.7 1345.1  1.375   14.824  53.351  25817   PG 19b1
33140.6 517.8   53449.6 1735.3  0.827   13.226  44.326  40090   MySQL 8.4.7

cs/s    cpu/s   cs/o    cpu/o   dbms
176167  14.4     6.824  .000557 PG 19b1
661395  41.9    16.498  .001046 MySQL 8.4.7

Results: update-zipf

Summary:
  • Postgres suffers from too much variance
  • Average throughput is ~1.09X larger for InnoDB than for Postgres
  • Per operation, Postgres does ~1.30X more write IO (KB written) to storage than InnoDB
  • Per operation, InnoDB uses more CPU and does more context switches. While autovacuum was enabled and was likely running during the test, my measurements exclude the manual vacuum done at the end of each test.
iostat, vmstat normalized by operation rate
r/s     rMB/s   w/s     wMB/s   r/o     rKB/o   wKB/o   o/s     dbms
55595.5 620.7   64264.4 1352.3  0.622   7.110   15.490  89396   PG 19b1
27405.9 428.2   37465.1 1133.6  0.282   4.508   11.933  97270   MySQL 8.4.7

cs/s    cpu/s   cs/o    cpu/o   dbms
424392  27.2     4.747  .000304 PG 19b1
1213054 44.5    12.471  .000458 MySQL 8.4.7

Results: insert

Summary:
  • Postgres suffers from too much variance
  • Average throughput is ~2.06X larger for Postgres than for InnoDB
  • Per operation, Postgres does ~1.67X more write IO (KB written) to storage than InnoDB
  • Per operation, Postgres uses more CPU and does more context switches. This is the opposite of what happens above for update-index and update-zipf.

iostat, vmstat normalized by operation rate
r/s     rMB/s   w/s     wMB/s   r/o     rKB/o   wKB/o   o/s     dbms
1615.5  56.0    15321.7 1170.9  0.007   0.242   5.059   237009  PG 19b1
3.6     0.1     8275.4  340.7   0.000   0.000   3.029   115155  MySQL 8.4.7

cs/s    cpu/s   cs/o    cpu/o   dbms
1214563 46.0    10.547  .000399 PG 19b1
800827  50.5     3.379  .000213 MySQL 8.4.7













Tuesday, June 9, 2026

Postgres 19 beta1 vs sysbench on a small server

This has results from sysbench on a small server with Postgres 19 beta1, 18.4 and 17.10. Sysbench is run with low concurrency (1 thread) and a cached database. The purpose is to search for changes in performance, often from new CPU overheads.

tl;dr

  • 19beta1, 18.4 and 17.10 have mostly similar performance
  • There might be small regressions (about 2%) from 17.10 to 19beta1 but my tests are not good at spotting that.
  • 19beta1 is much faster on one test (read-only-count) thanks to a new query plan

Builds, configuration and hardware

I compiled Postgres from source. 

The server is a Beelink SER7 7840HS with an AMD Ryzen 7 7840HS CPU and 32G RAM. Storage uses an NVMe device with ext-4 and discard enabled. The OS is Ubuntu 24.04. 

The config files are here for 17.10, 18.4 and 19 beta1.

Benchmark

I used sysbench and my usage is explained here. To save time I only run 32 of the 42 microbenchmarks 
and most test only 1 type of SQL statement. Benchmarks are run with the database cached by Postgres.

The read-heavy microbenchmarks run for 600 seconds and the write-heavy for 900 seconds.

The benchmark is run with 1 client, 1 table and 50M rows. The purpose is to search for CPU regressions.

Results

The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. 

I provide charts below with relative QPS (rQPS). The relative QPS is the following:
(QPS for some version) / (QPS for base version)
When the relative QPS is > 1 then some version is faster than base version.  When it is < 1 then there might be a regression. Values from iostat and vmstat divided by QPS are also provided here. These can help to explain why something is faster or slower because it shows how much HW is used per request.

Here, base version is Postgres 17.10 and some version is either 18.4 or 19 beta1.

I describe performance changes (changes to relative QPS) in terms of basis points. Performance changes by one basis point when the difference in rQPS is 0.01. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

Results: point queries

Summary:
  • 19beta1 is better than 17.10 by ~3 basis points for most tests
  • 19beta1 is slightly better than 18.4
Relative to Postgres 17.10
col-1 : Postgres 18.4
col-2 : Postgres 19 beta1

col-1   col-2
1.01    1.00    hot-points
0.98    0.97    point-query
1.01    1.03    points-covered-pk
1.00    1.04    points-covered-si
1.00    1.02    points-notcovered-pk
1.00    1.03    points-notcovered-si
0.99    0.99    random-points_range=10
1.00    1.03    random-points_range=100
1.01    1.03    random-points_range=1000

Results: range queries without aggregation

Summary:
  • 19beta1 is worse than 17.10 by ~3 basis points in 4 of 5 tests
  • 19beta1 is better than 17.10 by 5 basis points in the scan test
  • 19beta1 and 18.4 are similar except for the scan test where 19beta1 did better
Relative to Postgres 17.10
col-1 : Postgres 18.4
col-2 : Postgres 19 beta1

col-1   col-2
0.98    0.97    range-covered-pk
0.96    0.96    range-covered-si
0.98    0.98    range-notcovered-pk
0.99    0.99    range-notcovered-si
0.95    1.05    scan

Results: range queries with aggregation

Summary:
  • 19beta1 is worse than than 17.10 on two tests
  • 19beta1 is better than 17.10 on five tests
  • 19beta1 and 17.10 are the same on one test
  • 19beta1 is ~2.5X better than 17.10 on the read-only-count test
  • 19beta1 and 18.4 have similar results except for the read-only-count test
The query for the read-only-count test appears to have a different plan in 19beta1 and that might explain the ~2.5X speedup. In 17.10 and 18.4 it gets Index Scan while in 19beta1 it gets Index Only Scan.

Query plans for the read-only-count test ...

For 17.10
explain SELECT count(c) FROM sbtest1 WHERE id BETWEEN 17704460 AND 17705459
        Aggregate  (cost=1424.42..1424.43 rows=1 width=8)
          ->  Index Scan using sbtest1_pkey on sbtest1  (cost=0.56..1421.93 rows=996 width=121)
                Index Cond: ((id >= 17704460) AND (id <= 17705459))

For 18.4
explain SELECT count(c) FROM sbtest1 WHERE id BETWEEN 11575278 AND 11576277
        Aggregate  (cost=1310.09..1310.10 rows=1 width=8)
          ->  Index Scan using sbtest1_pkey on sbtest1  (cost=0.56..1307.89 rows=882 width=121)
                Index Cond: ((id >= 11575278) AND (id <= 11576277))

For 19beta1
explain SELECT count(c) FROM sbtest1 WHERE id BETWEEN 11686801 AND 11687800
        Aggregate  (cost=32.32..32.33 rows=1 width=8)
          ->  Index Only Scan using sbtest1_pkey on sbtest1  (cost=0.56..30.13 rows=878 width=0)
                Index Cond: ((id >= 11686801) AND (id <= 11687800))

Relative to Postgres 17.10
col-1 : Postgres 18.4
col-2 : Postgres 19 beta1

col-1   col-2
1.04    2.47    read-only-count
1.00    0.99    read-only-distinct
1.02    1.01    read-only-order
0.98    0.97    read-only_range=10
1.00    1.00    read-only_range=100
1.02    1.03    read-only_range=10000
1.09    1.09    read-only-simple
1.01    1.01    read-only-sum

Results: writes

Summary:
  • 19beta1 is worse than 17.10 by 2 to 5 basis points
  • 18.4 is worse than 17.10 by 2 to 3 basis points
Relative to Postgres 17.10
col-1 : Postgres 18.4
col-2 : Postgres 19 beta1

col-1   col-2
0.97    0.97    delete
0.99    0.96    insert
0.98    0.97    read-write_range=10
0.98    0.98    read-write_range=100
0.96    0.95    update-index
0.99    0.97    update-inlist
0.97    0.96    update-nonindex
0.97    0.95    update-one
0.97    0.95    update-zipf
0.98    0.97    write-only

Friday, April 10, 2026

MySQL 9.7.0 vs sysbench on a small server

This has results from sysbench on a small server with MySQL 9.7.0 and 8.4.8. Sysbench is run with low concurrency (1 thread) and a cached database. The purpose is to search for changes in performance, often from new CPU overheads.

I tested MySQL 9.7.0 with and without the hypergraph optimizer enabled. I don't expect it to help much because the queries run here are simple. I hope to learn it doesn't hurt performance in that case.

tl;dr

  • Throughput improves on two tests with the Hypergraph optimizer in 9.7.0 because they get better query plans.
  • One read-only test and several write-heavy tests have small regressions from 8.4.8 to 9.7.0. This might be from new CPU overheads but I don't see obvious problems in the flamegraphs. 

Builds, configuration and hardware

I compiled MySQL from source for versions \8.4.8 and 9.7.0.

The server is an ASUS ExpertCenter PN53 with AMD Ryzen 7 7735HS, 32G RAM and an m.2 device for the database. More details on it are here. The OS is Ubuntu 24.04 and the database filesystem is ext4 with discard enabled.

The my.cnf files os here for 8.4. I call this the z12a configs and variants of it are used for MySQL 5.6 through 8.4.

For 9.7 I use two configs:

All DBMS versions use the latin1 character set as explained here.

Benchmark

I used sysbench and my usage is explained here. To save time I only run 32 of the 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by InnoDB.

The tests are run using 1 table with 50M rows. The read-heavy microbenchmarks run for 600 seconds and the write-heavy for 1800 seconds.

Results

The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. 

I provide tables below with relative QPS. When the relative QPS is > 1 then some version is faster than the base version. When it is < 1 then there might be a regression.  The relative QPS (rQPS) is:
(QPS for some version) / (QPS for MySQL 8.4.8) 

Results: point queries

I describe performance changes (changes to relative QPS, rQPS) in terms of basis points. Performance changes by one basis point when the difference in rQPS is 0.01. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

This shows the rQPS for MySQL 9.7.0 using both the z13a and z13b configs. It is relative to the throughput from MySQL 8.4.8.
  • Throughput with MySQL 9.7.0 is similar to 8.4.8 except for point-query where there are regressions as rQPS drops by 5 and 7 basis points. The point-query test uses simple queries that fetch one column from one row by PK. From vmstat metrics the CPU overhead per query for 9.7.0 is ~8% larger than for 8.4.8, with and without the hypergraph optimizer. I don't see anything obvious in the flamegraphs.
z13a    z13b
0.99    1.01    hot-points
0.95    0.93    point-query
0.99    1.01    points-covered-pk
1.00    1.01    points-covered-si
0.98    1.00    points-notcovered-pk
0.99    1.01    points-notcovered-si
1.00    1.02    random-points_range=1000
0.99    1.01    random-points_range=100
0.96    1.00    random-points_range=10

Results: range queries without aggregation

I describe performance changes (changes to relative QPS, rQPS) in terms of basis points. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

This shows the rQPS for MySQL 9.7.0 using both the z13a and z13b configs. It is relative to the throughput from MySQL 8.4.8.
  • Throughput with MySQL 9.7.0 is similar to 8.4.8. I am skeptical there is a regression for the scan test with the z13b config. I suspect that is noise.
z13a    z13b
0.99    0.99    range-covered-pk
0.99    0.99    range-covered-si
0.99    0.99    range-notcovered-pk
0.98    0.98    range-notcovered-si
1.00    0.96    scan

Results: range queries with aggregation

I describe performance changes (changes to relative QPS, rQPS) in terms of basis points. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

This shows the rQPS for MySQL 9.7.0 using both the z13a and z13b configs. It is relative to the throughput from MySQL 8.4.8.
  • There might be small regressions in several tests with rQPS dropping by a few points but I will ignore that for now.
  • There is a large improvement for the read-only-distinct test with the z13b config. The query for this test is select distinct c from sbtest where id between ? and ? order by c. The reason for the performance improvment is that the hypergraph optimizer chooses a better plan, see here.
  • There is a large improvement for the read-only test with range=10000. This test uses the read-only version of the classic sysbench transaction (see here). One of the queries it runs is the query used by read-only-distinct. So it benefits from the better plan for that query. 
z13a    z13b
0.97    0.97    read-only-count
0.98    1.26    read-only-distinct
0.96    0.95    read-only-order
0.99    1.15    read-only_range=10000
0.97    1.00    read-only_range=100
0.96    0.97    read-only_range=10
0.99    0.99    read-only-simple
0.97    0.96    read-only-sum

Results: writes

I describe performance changes (changes to relative QPS, rQPS) in terms of basis points. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

This shows the rQPS for MySQL 9.7.0 using both the z13a and z13b configs. It is relative to the throughput from MySQL 8.4.8.
  • There might be several small regressions here. I don't see obvious problems in the flamegraphs.
z13a    z13b
0.95    0.92    delete
1.00    1.01    insert
0.97    0.98    read-write_range=100
0.96    0.95    read-write_range=10
0.97    0.96    update-index
0.97    0.92    update-inlist
0.95    0.93    update-nonindex
0.95    0.92    update-one
0.95    0.93    update-zipf
0.97    0.95    write-only

Thursday, April 9, 2026

Sysbench vs MySQL on a small server: another way to view the regressions

This post provides another way to see the performance regressions in MySQL from versions 5.6 to 9.7. It complements what I shared in a recent post. The workload here is cached by InnoDB and my focus is on regressions from new CPU overheads. 

The good news is that there are few regressions after 8.0. The bad news is that there were many prior to that and these are unlikely to be undone.

    tl;dr

    • for point queries
      • there are large regressions from 5.6.51 to 5.7.44, 5.7.44 to 8.0.28 and 8.0.28 to 8.0.45
      • there are few regressions from 8.0.45 to 8.4.8 to 9.7.0
    • for range queries without aggregation
      • there are large regressions from 5.6.51 to 5.7.44 and 5.7.44 to 8.0.28
      • there are mostly small regressions from 8.0.28 to 8.0.45, but scan has a large regression
      • there are few regressions from 8.0.45 to 8.4.8 to 9.7.0
    • for range queries with aggregation
      • there are large regressions from 5.6.51 to 5.7.44 with two improvements
      • there are large regressions from 5.7.44 to 8.0.28
      • there are small regressions from 8.0.28 to 8.0.45
      • there are few regressions from 8.0.45 to 8.4.8 to 9.7.0
    • for writes
      • there are large regressions from 5.6.51 to 5.7.44 and 5.7.44 to 8.0.28
      • there are small regressions from 8.0.28 to 8.0.45
      • there are few regressions from 8.0.45 to 8.4.8
      • there are a few small regressions from 8.4.8 to 9.7.0

    Builds, configuration and hardware

    I compiled MySQL from source for versions 5.6.51, 5.7.44, 8.0.28, 8.0.45, 8.4.8 and 9.7.0.

    The server is an ASUS ExpertCenter PN53 with AMD Ryzen 7 7735HS, 32G RAM and an m.2 device for the database. More details on it are here. The OS is Ubuntu 24.04 and the database filesystem is ext4 with discard enabled.

    The my.cnf files are here for 5.65.7 and 8.4. I call these the z12a configs.

    For 9.7 I use the z13a config. It is as close as possible to z12a and adds two options for gtid-related features to undo a default config change that arrived in 9.6. 

    All DBMS versions use the latin1 character set as explained here.

    Benchmark

    I used sysbench and my usage is explained here. To save time I only run 32 of the 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by InnoDB.

    The tests are run using 1 table with 50M rows. The read-heavy microbenchmarks run for 600 seconds and the write-heavy for 1800 seconds.

    Results

    The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. 

    I provide tables below with relative QPS. When the relative QPS is > 1 then some version is faster than the base version. When it is < 1 then there might be a regression.  The relative QPS (rQPS) is:
    (QPS for some version) / (QPS for base version) 
    Results: point queries

    MySQL 5.6.51 gets from 1.18X to 1.61X more QPS than 9.7.0 on point queries. It is easier for me to write about this in terms of relative QPS (rQPS) which is as low as 0.62 for MySQL 9.7.0 vs 5.6.51. I define a basis point to mean a change of 0.01 in rQPS.

    Summary:
    • from 5.6.51 to 9.7.0
      • the median regression is a drop in rQPS of 27 basis points
    • from 5.6.51 to 5.7.44
      • the median regression is a drop in rQPS of 11 basis points
    • from 5.7.44 to 8.0.28
      • the median regression is a drop in rQPS of 25 basis points
    • from 8.0.28 to 8.0.45
      • 7 of 9 tests get more QPS with 8.0.45
      • 2 tests have regressions where rQPS drops by ~6 basis points
    • from 8.0.45 to 8.4.8
      • there are few regressions
    • from 8.4.8 to 9.7.0
      • there are few regressions
    This has (QPS for 9.7.0) / (QPS for 5.6.51) and is followed by tables that show the difference between the latest point release in adjacent versions.
    • the largest regression is an rQPS drop of 38 basis points for point-query. Compared to most of the other tests in this section, this query does less work in the storage engine which implies the regression is from code above the storage engine.
    • the smallest regression is an rQPS drop of 15 basis points for random-points_range=1000. The regression for the same query with a shorter range (=10, =100) is larger. That implies, at least for this query, that the regression is for something above the storage engine (optimizer, parser, etc).
    • the median regression is an rQPS drop of 27 basis points
    0.65    hot-points
    0.62    point-query
    0.72    points-covered-pk
    0.78    points-covered-si
    0.73    points-notcovered-pk
    0.76    points-notcovered-si
    0.85    random-points_range=1000
    0.73    random-points_range=100
    0.66    random-points_range=10

    This has: (QPS for 5.7.44) / (QPS for 5.6.51)
    • the largest regression is an rQPS drop of 14 basis points for hot-points.
    • the next largest regression is an rQPS drop of 13 basis points for random-points with range=10. The regressions for that query are smaller when a larger range is used =100, =1000 and this implies the problem is above the storage engine. 
    • the median regression is an rQPS drop of 11 basis points
    0.86    hot-points
    0.90    point-query
    0.89    points-covered-pk
    0.90    points-covered-si
    0.89    points-notcovered-pk
    0.88    points-notcovered-si
    1.00    random-points_range=1000
    0.89    random-points_range=100
    0.87    random-points_range=10

    This has: (QPS for 8.0.28) / (QPS for 5.7.44)
    • the largest regression is an rQPS drop of 66 basis points for random-points with range=1000. The regression for that same query with smaller ranges (=10, =100) is smaller. This implies the problem is in the storage engine.
    • the second largest regression is an rQPS drop of 35 basis points for hot-points
    • the median regression is an rQPS drop of 25 basis points
    0.65    hot-points
    0.82    point-query
    0.74    points-covered-pk
    0.75    points-covered-si
    0.76    points-notcovered-pk
    0.84    points-notcovered-si
    0.34    random-points_range=1000
    0.75    random-points_range=100
    0.86    random-points_range=10

    This has: (QPS for 8.0.45) / (QPS for 8.0.28)
    • at last, there are many improvements. Some are from a fix for bug 102037 which I found with help from sysbench
    • the regressions, with rQPS drops by ~6 basis points, are for queries that do less work in the storage engine relative to the other tests in this section
    1.20    hot-points
    0.93    point-query
    1.13    points-covered-pk
    1.19    points-covered-si
    1.09    points-notcovered-pk
    1.04    points-notcovered-si
    2.48    random-points_range=1000
    1.12    random-points_range=100
    0.94    random-points_range=10

    This has: (QPS for 8.4.8) / (QPS for 8.0.45)
    • there are few regressions from 8.0.45 to 8.4.8
    0.99    hot-points
    0.96    point-query
    0.99    points-covered-pk
    0.98    points-covered-si
    1.00    points-notcovered-pk
    0.99    points-notcovered-si
    1.00    random-points_range=1000
    1.00    random-points_range=100
    0.98    random-points_range=10

    This has: (QPS for 9.7.0) / (QPS for 8.4.8)
    • there are few regressions from 8.4.8 to 9.7.0
    0.99    hot-points
    0.95    point-query
    0.99    points-covered-pk
    1.00    points-covered-si
    0.98    points-notcovered-pk
    0.99    points-notcovered-si
    1.00    random-points_range=1000
    0.99    random-points_range=100
    0.96    random-points_range=10

    Results: range queries without aggregation

    MySQL 5.6.51 gets from 1.35X to 1.52X more QPS than 9.7.0 on range queries without aggregation. It is easier for me to write about this in terms of relative QPS (rQPS) which is as low as 0.66 for MySQL 9.7.0 vs 5.6.51. I define a basis point to mean a change of 0.01 in rQPS.

    Summary:
    • from 5.6.51 to 9.7.0
      • the median regression is drop in rQPS of 33 basis points
    • from 5.6.51 to 5.7.44
      • the median regression is a drop in rQPS of 16 basis points
    • from 5.7.44 to 8.0.28
      • the median regression is a drop in rQPS ~10 basis points
    • from 8.0.28 to 8.0.45
      • the median regression is a drop in rQPS of 5 basis points
    • from 8.0.45 to 8.4.8
      • there are few regressions from 8.0.45 to 8.4.8
    • from 8.4.8 to 9.7.0
      • there are few regressions from 8.4.8 to 9.7.0
    This has (QPS for 9.7.0) / (QPS for 5.6.51) and is followed by tables that show the difference between the latest point release in adjacent versions.
    • all tests have large regressions with an rQPS drop that ranges from 26 to 34 basis points
    • the median regression is an rQPS drop of 33 basis points
    0.66    range-covered-pk
    0.67    range-covered-si
    0.66    range-notcovered-pk
    0.74    range-notcovered-si
    0.67    scan

    This has: (QPS for 5.7.44) / (QPS for 5.6.51)
    • all tests have large regressions with an rQPS drop that ranges from 12 to 17 basis points
    • the median regression is an rQPS drop of 16 basis points
    0.85    range-covered-pk
    0.84    range-covered-si
    0.84    range-notcovered-pk
    0.88    range-notcovered-si
    0.83    scan

    This has: (QPS for 8.0.28) / (QPS for 5.7.44)
    • 4 of 5 tests have regressions with an rQPS drop that ranges from 10 to 14 basis points
    • the median regression is ~10 basis points
    • rQPS improves for the scan test
    0.86    range-covered-pk
    0.89    range-covered-si
    0.90    range-notcovered-pk
    0.90    range-notcovered-si
    1.04    scan

    This has: (QPS for 8.0.45) / (QPS for 8.0.28)
    • all tests are slower in 8.0.45 than 8.0.28, but the regression for 3 of 5 is <= 5 basis points
    • rQPS in the scan test drops by 21 basis points
    • the median regression is an rQPS drop of 5 basis points
    0.96    range-covered-pk
    0.95    range-covered-si
    0.91    range-notcovered-pk
    0.96    range-notcovered-si
    0.79    scan

    This has: (QPS for 8.4.8) / (QPS for 8.0.45)
    • there are few regressions from 8.0.45 to 8.4.8
    0.95    range-covered-pk
    0.95    range-covered-si
    0.98    range-notcovered-pk
    0.99    range-notcovered-si
    0.98    scan

    This has: (QPS for 9.7.0) / (QPS for 8.4.8)
    • there are few regressions from 8.4.8 to 9.7.0
    0.99    range-covered-pk
    0.99    range-covered-si
    0.99    range-notcovered-pk
    0.98    range-notcovered-si
    1.00    scan

    Results: range queries with aggregation

    Summary:
    • from 5.6.51 to 9.7.0 rQPS
      • the median result is a drop in rQPS of ~30 basis points
    • from 5.6.51 to 5.7.44
      • the median result is a drop in rQPS of ~10 basis points
    • from 5.7.44 to 8.0.28
      • the median result is a drop in rQPS of ~12 basis points
    • from 8.0.28 to 8.0.45
      • the median result is an rQPS drop of 5 basis points
    • from 8.0.45 to 8.4.8
      • there are few regressions from 8.0.45 to 8.4.8
    • from 8.4.8 to 9.7.0
      • there are few regressions from 8.4.8 to 9.7.0
    This has (QPS for 9.7.0) / (QPS for 5.6.51) and is followed by tables that show the difference between the latest point release in adjacent versions.
    • the median result is a drop in rQPS of ~30 basis points
    • rQPS for the read-only-distinct test improves by 25 basis point
    0.67    read-only-count
    1.25    read-only-distinct
    0.75    read-only-order
    1.02    read-only_range=10000
    0.74    read-only_range=100
    0.66    read-only_range=10
    0.69    read-only-simple
    0.66    read-only-sum

    This has: (QPS for 5.7.44) / (QPS for 5.6.51)
    • the median result is an rQPS drop of ~10 basis points
    • rQPS improves by 45 basis points for read-only-distinct and by 23 basis points for read-only with the largest range (=10000)
    0.86    read-only-count
    1.45    read-only-distinct
    0.93    read-only-order
    1.23    read-only_range=10000
    0.96    read-only_range=100
    0.88    read-only_range=10
    0.85    read-only-simple
    0.86    read-only-sum

    This has: (QPS for 8.0.28) / (QPS for 5.7.44)
    • the median result is an rQPS drop of ~12 basis points
    0.91    read-only-count
    0.94    read-only-distinct
    0.89    read-only-order
    0.86    read-only_range=10000
    0.87    read-only_range=100
    0.85    read-only_range=10
    0.90    read-only-simple
    0.87    read-only-sum

    This has: (QPS for 8.0.45) / (QPS for 8.0.28)
    • the median result is an rQPS drop of 5 basis points
    0.89    read-only-count
    0.95    read-only-distinct
    0.95    read-only-order
    0.97    read-only_range=10000
    0.94    read-only_range=100
    0.95    read-only_range=10
    0.93    read-only-simple
    0.93    read-only-sum

    This has: (QPS for 8.4.8) / (QPS for 8.0.45)
    • there are few regressions from 8.0.45 to 8.4.8
    0.99    read-only-count
    0.98    read-only-distinct
    0.99    read-only-order
    1.00    read-only_range=10000
    0.98    read-only_range=100
    0.97    read-only_range=10
    0.97    read-only-simple
    0.98    read-only-sum

    This has: (QPS for 9.7.0) / (QPS for 8.4.8)
    • there are few regressions from 8.4.8 to 9.7.0
    0.97    read-only-count
    0.98    read-only-distinct
    0.96    read-only-order
    0.99    read-only_range=10000
    0.97    read-only_range=100
    0.96    read-only_range=10
    0.99    read-only-simple
    0.97    read-only-sum

    Results: writes

    Summary:
    • from 5.6.51 to 9.7.0 rQPS 
      • the median result is a drop in rQPS of ~33 basis points
    • from 5.6.51 to 5.7.44
      • the median result is an rQPS drop of ~13 basis points
    • from 5.7.44 to 8.0.28
      • the median result is an rQPS drop of ~18 basis points
    • from 8.0.28 to 8.0.45
      • the median result is an rQPS drop of 9 basis points
    • from 8.0.45 to 8.4.8
      • there are few regressions from 8.0.45 to 8.4.8
    • from 8.4.8 to 9.7.0
      • the median result is an rQPS drop of 4 basis points
    This has (QPS for 9.7.0) / (QPS for 5.6.51) and is followed by tables that show the difference between the latest point release in adjacent versions.
    • the median result is an rQPS drop of ~33 basis points
    0.56    delete
    0.54    insert
    0.72    read-write_range=100
    0.66    read-write_range=10
    0.88    update-index
    0.74    update-inlist
    0.60    update-nonindex
    0.58    update-one
    0.60    update-zipf
    0.67    write-only

    This has: (QPS for 5.7.44) / (QPS for 5.6.51)
    • the median result is an rQPS drop of ~13 basis points
    • rQPS improves by 21 basis points for update-index and by 5 basis points for update-inlist
    0.82    delete
    0.80    insert
    0.94    read-write_range=100
    0.88    read-write_range=10
    1.21    update-index
    1.05    update-inlist
    0.86    update-nonindex
    0.85    update-one
    0.86    update-zipf
    0.94    write-only

    This has: (QPS for 8.0.28) / (QPS for 5.7.44)
    • the median result is an rQPS drop of ~18 basis points
    0.80    delete
    0.77    insert
    0.87    read-write_range=100
    0.85    read-write_range=10
    0.94    update-index
    0.79    update-inlist
    0.81    update-nonindex
    0.80    update-one
    0.81    update-zipf
    0.83    write-only

    This has: (QPS for 8.0.45) / (QPS for 8.0.28)
    • the median result is an rQPS drop of 9 basis points
    0.91    delete
    0.90    insert
    0.94    read-write_range=100
    0.94    read-write_range=10
    0.80    update-index
    0.92    update-inlist
    0.91    update-nonindex
    0.92    update-one
    0.91    update-zipf
    0.89    write-only

    This has: (QPS for 8.4.8) / (QPS for 8.0.45)
    • there are few regressions from 8.0.45 to 8.4.8
    0.98    delete
    0.98    insert
    0.98    read-write_range=100
    0.98    read-write_range=10
    0.99    update-index
    0.99    update-inlist
    0.99    update-nonindex
    0.99    update-one
    0.99    update-zipf
    0.99    write-only

    This has: (QPS for 9.7.0) / (QPS for 8.4.8)
    • the median result is an rQPS drop of 4 basis points
    0.95    delete
    1.00    insert
    0.97    read-write_range=100
    0.96    read-write_range=10
    0.97    update-index
    0.97    update-inlist
    0.95    update-nonindex
    0.95    update-one
    0.95    update-zipf
    0.97    write-only

    CPU-bound sysbench on a large server: Postgres 12 to 19 beta1

    This has results from sysbench on a small server with Postgres versions 12 through 19 beta1. Sysbench is run with high concurrency (40 conne...