Small Datum: sysbench

Showing posts with label sysbench. Show all posts

Friday, June 19, 2026

CPU-bound sysbench on a large server: Postgres 12 to 19 beta1

This has results from sysbench on a small server with Postgres versions 12 through 19 beta1. Sysbench is run with high concurrency (40 connections) and a cached database. The purpose is to search for changes in performance.

Postgres remains boring, it is hard to find performance regressions.

tl;dr for Postgres 17 to 19

there are no regressions
throughput on the read-only-count test improves by ~3X in 19 beta1 thanks to a better query plan

tl;dr for Postgres 12 to 19

there are few regressions, throughput might have dropped by up to 5% on a few range query tests
there are a few large improvements for read-only tests
there are many large improvements for write-heavy tests

Builds, configuration and hardware

I compiled Postgres from source for versions 12.22, 13.23, 14.23, 15.18, 16.14, 17.10, 18.4 and 19 beta1.

I used a 48-core server from Hetzner

an ax162s with an AMD EPYC 9454P 48-Core Processor with SMT disabled
2 Intel D7-P5520 NVMe storage devices with RAID 1 (3.8T each) using ext4
128G RAM
Ubuntu 24.04

Configuration files for Postgres:

the config file is named conf.diff.cx10a_c32r128 (x10a_c32r128) and is here for versions 12, 13, 14, 15, 16 and 17.
for Postgres 18 and 19 I used conf.diff.cx10b_c32r128 (x10b_c32r128) which is as close as possible to the Postgres 17 config and uses io_method=sync

Benchmark

I used sysbench and my usage is explained here. I now run 32 of the 42 microbenchmarks listed in that blog post. Most test only one type of SQL statement. Benchmarks are run with the database cached by Postgres.

The read-heavy microbenchmarks are run for 600 seconds and the write-heavy for 1200 seconds. The benchmark is run with 40 clients and 8 tables with 10M rows per table. The database is cached.

The purpose is to search for regressions from new CPU overhead and mutex contention. I use the small server with low concurrency to find regressions from new CPU overheads and then larger servers with high concurrency to find regressions from new CPU overheads and mutex contention.

The tests can be called microbenchmarks. They are very synthetic. But microbenchmarks also make it easy to understand which types of SQL statements have great or lousy performance. Performance testing benefits from a variety of workloads -- both more and less synthetic.

Results

The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation.

I provide charts below with relative QPS (rQPS). The relative QPS is the following:

(QPS for some version) / (QPS for base version)

When the relative QPS is > 1 then some version is faster than base version. When it is < 1 then there might be a regression. Values from iostat and vmstat divided by QPS are also provided here. These can help to explain why something is faster or slower because it shows how much HW is used per request.

Here, base version is either Postgres 12.23 or 17.10 and some version is a more recent version. I use 12.23 as the base version to identify regressions over a long period of time. And then I use 17.10 as the base version to confirm there aren't recent, large regressions.

I describe performance changes (changes to relative QPS) in terms of basis points. Performance changes by one basis point when the difference in rQPS is 0.01. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

Results: point queries, version 17 to 19

Summary:

there are no regressins

Relative to: PG 17.10

col-1 : PG 18.4

col-2 : PG 19 beta1

col-1 col-2

1.00 0.99 hot-points

1.01 1.01 point-query

1.00 1.00 points-covered-pk

0.98 0.99 points-covered-si

1.01 1.00 points-notcovered-pk

1.00 1.00 points-notcovered-si

1.01 1.01 random-points_range=10

1.02 1.00 random-points_range=100

1.00 1.00 random-points_range=1000

Results: point queries, version 12 to 19

Summary

there are no regressions
throughput for the hot-points test improves by ~2X in versions 17.10, 18.4 and 19beta

Relative to: PG 12.22

col-1 : PG 13.23

col-2 : PG 14.23

col-3 : PG 15.18

col-4 : PG 16.14

col-5 : PG 17.10

col-6 : PG 18.4

col-7 : PG 19 beta1

col-1 col-2 col-3 col-4 col-5 col-6 col-7

1.00 0.90 0.97 1.03 2.34 2.35 2.31 hot-points

1.00 1.01 1.03 1.04 1.03 1.04 1.03 point-query

1.02 1.04 1.04 1.07 1.04 1.04 1.04 points-covered-pk

1.01 1.07 1.04 1.04 1.04 1.03 1.04 points-covered-si

0.98 1.01 1.03 1.02 1.00 1.01 1.00 points-notcovered-pk

0.99 1.03 1.03 1.01 1.02 1.02 1.01 points-notcovered-si

0.99 1.01 1.03 1.03 1.00 1.01 1.01 random-points_range=10

0.99 1.02 1.04 1.04 1.01 1.03 1.01 random-points_range=100

1.00 1.02 1.02 1.03 1.01 1.02 1.01 random-points_range=1000

Results: range queries without aggregation, version 17 to 19

Summary

there are no regressions
while 19 beta1 has a better result on the scan test, that test has more variance with Postgres so I am reluctant to judge this without more results

Relative to: PG 17.10

col-1 : PG 18.4

col-2 : PG 19 beta1

col-1 col-2

0.98 0.99 range-covered-pk

0.97 0.99 range-covered-si

0.99 0.99 range-notcovered-pk

1.02 1.01 range-notcovered-si

0.96 1.07 scan

Results: range queries without aggregation, version 12 to 19

Summary

there are no regressions
scan throughput has improved a lot from version 12 to 19

Relative to: PG 12.22

col-1 : PG 13.23

col-2 : PG 14.23

col-3 : PG 15.18

col-4 : PG 16.14

col-5 : PG 17.10

col-6 : PG 18.4

col-7 : PG 19 beta1

col-1 col-2 col-3 col-4 col-5 col-6 col-7

0.99 1.03 1.04 1.04 1.03 1.00 1.02 range-covered-pk

0.99 1.04 1.04 1.04 1.03 1.00 1.03 range-covered-si

1.00 1.00 1.00 0.99 1.00 0.99 0.99 range-notcovered-pk

1.00 1.01 1.01 0.99 1.00 1.02 1.01 range-notcovered-si

1.09 1.27 1.10 1.21 1.19 1.14 1.28 scan

Results: range queries with aggregation, version 17 to 19

Summary

there are no regressions
throughput on the read-only-count test is ~3X better thanks to a new query plan. This improvement was also visible on my small server

Relative to: PG 17.10

col-1 : PG 18.4

col-2 : PG 19 beta1

col-1 col-2

1.03 3.30 read-only-count

1.02 0.99 read-only-distinct

1.00 0.97 read-only-order

0.99 0.99 read-only_range=10

0.99 0.99 read-only_range=100

1.01 1.00 read-only_range=10000

1.03 1.01 read-only-simple

1.03 1.01 read-only-sum

Results: range queries with aggregation, version 12 to 19

Summary

there might be a few small regressions, but losing 5% throughput from version 12 to 19 isn't a big deal
throughput on the read-only-count test is ~3X better thanks to a new query plan. This improvement was also visible on my small server

Relative to: PG 12.22

col-1 : PG 13.23

col-2 : PG 14.23

col-3 : PG 15.18

col-4 : PG 16.14

col-5 : PG 17.10

col-6 : PG 18.4

col-7 : PG 19 beta1

col-1 col-2 col-3 col-4 col-5 col-6 col-7

1.01 0.95 0.96 0.97 0.93 0.95 3.06 read-only-count

1.00 0.98 0.98 0.98 0.96 0.98 0.95 read-only-distinct

1.00 0.98 0.98 1.00 0.99 0.99 0.97 read-only-order

0.99 1.00 1.01 1.00 1.01 0.99 1.00 read-only_range=10

0.99 1.00 1.00 1.00 1.01 1.00 0.99 read-only_range=100

1.00 0.97 1.02 1.03 1.04 1.05 1.03 read-only_range=10000

1.00 0.97 0.99 0.97 0.95 0.98 0.96 read-only-simple

1.00 0.96 0.97 0.97 0.94 0.97 0.95 read-only-sum

Results: writes, version 17 to 19

Summary

there are no regressions

Relative to: PG 17.10

col-1 : PG 18.4

col-2 : PG 19 beta1

col-1 col-2

0.99 0.99 delete

1.02 1.02 insert

1.00 0.98 read-write_range=10

0.99 0.99 read-write_range=100

1.01 1.03 update-index

1.01 0.98 update-inlist

0.98 1.01 update-nonindex

1.01 1.03 update-one

1.00 1.00 update-zipf

0.97 0.99 write-only

Results: writes, version 12 to 19

Summary

there are no regressions
many large improvements arrived in version 17 and remain in 19 beta1

Relative to: PG 12.22

col-1 : PG 13.23

col-2 : PG 14.23

col-3 : PG 15.18

col-4 : PG 16.14

col-5 : PG 17.10

col-6 : PG 18.4

col-7 : PG 19 beta1

col-1 col-2 col-3 col-4 col-5 col-6 col-7

0.99 1.11 1.13 1.10 1.28 1.27 1.27 delete

1.02 1.17 1.16 1.19 1.23 1.25 1.25 insert

1.00 1.20 1.22 1.20 1.24 1.24 1.22 read-write_range=10

0.99 1.04 1.05 1.04 1.06 1.05 1.04 read-write_range=100

0.98 1.08 1.05 0.94 1.84 1.85 1.90 update-index

1.00 1.07 1.06 1.05 1.12 1.13 1.10 update-inlist

1.01 1.07 1.07 0.86 1.87 1.84 1.88 update-nonindex

1.04 0.96 0.96 1.10 1.39 1.41 1.43 update-one

1.01 1.05 1.07 0.96 1.63 1.62 1.63 update-zipf

0.99 1.11 1.13 1.09 1.41 1.37 1.40 write-only

Thursday, June 11, 2026

Write-heavy sysbench tests, a large server, modern Postgres and MySQL

This has results for modern Postgres and MySQL using write-heavy tests from sysbench and a large server. I think there are regressions in Postgres that arrive in some of versions 16, 17, 18 and 19 beta1 but I am far from certain and this blog post is just another step in my journey to figure that out.

tl;dr

Postgres suffers a lot from throughput variation while MySQL+InnoDB does not
InnoDB gets much better average throughput on 6 of 10 tests, similar throughput one one and then Postgres does better on 3 of 10 tests
For tests from which I provided vmstat and iostat results, Postgres does more write IO per operation. In some cases InnoDB uses more CPU, in other cases it does not.

Builds, configuration and hardware

I compiled:

Postgres from source for versions 15.17, 16.13, 17.9 and 18.3.
MySQL from source for version 8.4.7

I used a 48-core server from Hetzner

an ax162s with an AMD EPYC 9454P 48-Core Processor with SMT disabled
2 Intel D7-P5520 NVMe storage devices with RAID 1 (3.8T each) using ext4
128G RAM
Ubuntu 24.04

Configuration files for Postgres:

the config file is named conf.diff.cx10a_c32r128 (x10a_c32r128) and is here for versions 15, 16 and 17.
for Postgres 18 I used conf.diff.cx10b_c32r128 (x10b_c32r128) which is as close as possible to the Postgres 17 config and uses io_method=sync

Benchmark

I used sysbench and my usage is explained here. Normally I run 32 of the 42 microbenchmarks listed in that blog post using tables small enough to be cached by the DBMS. Most test only one type of SQL statement.

But I did things differently here:

I only run the write-heavy tests (to save time)
The tables are larger than memory and cannot be cached
Each test (microbenchmark) is run for 2 hours when I normally run each for 15 minutes
After each test a vacuum is done

The purpose is to search for regressions from new CPU overhead and mutex contention related to MVCC GC (vacuum for Postgres, purge for InnoDB).

Results

I provide charts below with relative QPS. The relative QPS is the following:

(QPS for some version) / (QPS for Postgres 15.17)

When the relative QPS is > 1 then some version is faster than base version. When it is < 1 then there might be a regression. When the relative QPS is 1.2 then some version is about 20% faster than base version.

The per-test results from vmstat and iostat can help to explain why something is faster or slower because it shows how much HW is used per request, including CPU overhead per operation (cpu/o) and context switches per operation (cs/o) which are often a proxy for mutex contention.

Results: writes

The table below has relative QPS for Postgres 16 to 19 and then InnoDB all relative to the throughput for Postgres 15.17. Columns 1 to 4 have results for Postgres and the numbers in yellow highlight the tests where there is a regression in Postgres. For column 5 (MySQL with InnoDB) the numbers in yellow and red indicate tests where InnoDB's throughput is less than Postgres. And then the numbers in green indicate tests where InnoDB's throughput is much larger than Postgres.

Note that when relative QPS (rQPS) is 0.90 then throughput dropped by ~10%.

Summary:

throughput for Postgres drops after version 15.17. I don't know yet whether this is a regression.
throughput for InnoDB is much better than Postgres in 6 of 10 tests, similar in one test, and much worse in 3 of 10 tests.

The sections that follow this one have more detail on results from the update-index, update-zipf tests and insert tests.

Relative to: Postgres 15.17

col-1 : Postgres 16.13

col-2 : Postgres 17.9

col-3 : Postgres 18.3

col-4 : Postgres 19 beta1

col-5 : MySQL 8.4.7

col-1 col-2 col-3 col-4 col-5

0.94 0.97 0.98 1.02 1.88 update-inlist

0.94 0.90 0.88 0.92 1.43 update-index

0.91 0.86 0.87 0.92 1.19 update-nonindex

0.96 0.99 0.98 0.98 0.71 update-one

0.92 0.83 0.81 0.85 0.93 update-zipf

0.95 0.93 0.84 0.81 1.71 write-only

0.94 0.94 0.90 0.92 1.14 read-write_range=10

0.95 0.96 0.95 0.95 1.93 read-write_range=100

0.89 0.82 0.80 0.84 1.01 delete

1.05 1.05 1.01 1.10 0.53 insert

Results: update-index

Summary:

Postgres suffers from too much variance
Average throughput is ~1.55X larger for InnoDB than for Postgres
Per operation, Postgres does ~1.20X more write IO (KB written) to storage than InnoDB
Per operation, InnoDB uses more CPU and does more context switches. While autovacuum was enabled and was likely running during the test, my measurements exclude the manual vacuum done at the end of each test.

iostat, vmstat normalized by operation rate

r/s rMB/s w/s wMB/s r/o rKB/o wKB/o o/s dbms

35503.0 373.7 58795.7 1345.1 1.375 14.824 53.351 25817 PG 19b1

33140.6 517.8 53449.6 1735.3 0.827 13.226 44.326 40090 MySQL 8.4.7

cs/s cpu/s cs/o cpu/o dbms

176167 14.4 6.824 .000557 PG 19b1

661395 41.9 16.498 .001046 MySQL 8.4.7

Results: update-zipf

Summary:

Postgres suffers from too much variance
Average throughput is ~1.09X larger for InnoDB than for Postgres
Per operation, Postgres does ~1.30X more write IO (KB written) to storage than InnoDB
Per operation, InnoDB uses more CPU and does more context switches. While autovacuum was enabled and was likely running during the test, my measurements exclude the manual vacuum done at the end of each test.

iostat, vmstat normalized by operation rate

r/s rMB/s w/s wMB/s r/o rKB/o wKB/o o/s dbms

55595.5 620.7 64264.4 1352.3 0.622 7.110 15.490 89396 PG 19b1

27405.9 428.2 37465.1 1133.6 0.282 4.508 11.933 97270 MySQL 8.4.7

cs/s cpu/s cs/o cpu/o dbms

424392 27.2 4.747 .000304 PG 19b1

1213054 44.5 12.471 .000458 MySQL 8.4.7

Results: insert

Summary:

Postgres suffers from too much variance
Average throughput is ~2.06X larger for Postgres than for InnoDB
Per operation, Postgres does ~1.67X more write IO (KB written) to storage than InnoDB
Per operation, Postgres uses more CPU and does more context switches. This is the opposite of what happens above for update-index and update-zipf.

iostat, vmstat normalized by operation rate

r/s rMB/s w/s wMB/s r/o rKB/o wKB/o o/s dbms

1615.5 56.0 15321.7 1170.9 0.007 0.242 5.059 237009 PG 19b1

3.6 0.1 8275.4 340.7 0.000 0.000 3.029 115155 MySQL 8.4.7

cs/s cpu/s cs/o cpu/o dbms

1214563 46.0 10.547 .000399 PG 19b1

800827 50.5 3.379 .000213 MySQL 8.4.7

Tuesday, June 9, 2026

Postgres 19 beta1 vs sysbench on a small server

This has results from sysbench on a small server with Postgres 19 beta1, 18.4 and 17.10. Sysbench is run with low concurrency (1 thread) and a cached database. The purpose is to search for changes in performance, often from new CPU overheads.

tl;dr

19beta1, 18.4 and 17.10 have mostly similar performance
There might be small regressions (about 2%) from 17.10 to 19beta1 but my tests are not good at spotting that.
19beta1 is much faster on one test (read-only-count) thanks to a new query plan

Builds, configuration and hardware

I compiled Postgres from source.

The server is a Beelink SER7 7840HS with an AMD Ryzen 7 7840HS CPU and 32G RAM. Storage uses an NVMe device with ext-4 and discard enabled. The OS is Ubuntu 24.04.

The config files are here for 17.10, 18.4 and 19 beta1.

Benchmark

I used sysbench and my usage is explained here. To save time I only run 32 of the 42 microbenchmarks

and most test only 1 type of SQL statement. Benchmarks are run with the database cached by Postgres.

The read-heavy microbenchmarks run for 600 seconds and the write-heavy for 900 seconds.

The benchmark is run with 1 client, 1 table and 50M rows. The purpose is to search for CPU regressions.

Results

I provide charts below with relative QPS (rQPS). The relative QPS is the following:

(QPS for some version) / (QPS for base version)

Here, base version is Postgres 17.10 and some version is either 18.4 or 19 beta1.

I describe performance changes (changes to relative QPS) in terms of basis points. Performance changes by one basis point when the difference in rQPS is 0.01. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

Results: point queries

Summary:

19beta1 is better than 17.10 by ~3 basis points for most tests
19beta1 is slightly better than 18.4

Relative to Postgres 17.10

col-1 : Postgres 18.4

col-2 : Postgres 19 beta1

col-1 col-2

1.01 1.00 hot-points

0.98 0.97 point-query

1.01 1.03 points-covered-pk

1.00 1.04 points-covered-si

1.00 1.02 points-notcovered-pk

1.00 1.03 points-notcovered-si

0.99 0.99 random-points_range=10

1.00 1.03 random-points_range=100

1.01 1.03 random-points_range=1000

Results: range queries without aggregation

Summary:

19beta1 is worse than 17.10 by ~3 basis points in 4 of 5 tests
19beta1 is better than 17.10 by 5 basis points in the scan test
19beta1 and 18.4 are similar except for the scan test where 19beta1 did better

Relative to Postgres 17.10

col-1 : Postgres 18.4

col-2 : Postgres 19 beta1

col-1 col-2

0.98 0.97 range-covered-pk

0.96 0.96 range-covered-si

0.98 0.98 range-notcovered-pk

0.99 0.99 range-notcovered-si

0.95 1.05 scan

Results: range queries with aggregation

Summary:

19beta1 is worse than than 17.10 on two tests
19beta1 is better than 17.10 on five tests
19beta1 and 17.10 are the same on one test
19beta1 is ~2.5X better than 17.10 on the read-only-count test
19beta1 and 18.4 have similar results except for the read-only-count test

The query for the read-only-count test appears to have a different plan in 19beta1 and that might explain the ~2.5X speedup. In 17.10 and 18.4 it gets Index Scan while in 19beta1 it gets Index Only Scan.

Query plans for the read-only-count test ...

For 17.10

explain SELECT count(c) FROM sbtest1 WHERE id BETWEEN 17704460 AND 17705459

Aggregate (cost=1424.42..1424.43 rows=1 width=8)

-> Index Scan using sbtest1_pkey on sbtest1 (cost=0.56..1421.93 rows=996 width=121)

Index Cond: ((id >= 17704460) AND (id <= 17705459))

For 18.4

explain SELECT count(c) FROM sbtest1 WHERE id BETWEEN 11575278 AND 11576277

Aggregate (cost=1310.09..1310.10 rows=1 width=8)

-> Index Scan using sbtest1_pkey on sbtest1 (cost=0.56..1307.89 rows=882 width=121)

Index Cond: ((id >= 11575278) AND (id <= 11576277))

For 19beta1

explain SELECT count(c) FROM sbtest1 WHERE id BETWEEN 11686801 AND 11687800

Aggregate (cost=32.32..32.33 rows=1 width=8)

-> Index Only Scan using sbtest1_pkey on sbtest1 (cost=0.56..30.13 rows=878 width=0)

Index Cond: ((id >= 11686801) AND (id <= 11687800))

Relative to Postgres 17.10

col-1 : Postgres 18.4

col-2 : Postgres 19 beta1

col-1 col-2

1.04 2.47 read-only-count

1.00 0.99 read-only-distinct

1.02 1.01 read-only-order

0.98 0.97 read-only_range=10

1.00 1.00 read-only_range=100

1.02 1.03 read-only_range=10000

1.09 1.09 read-only-simple

1.01 1.01 read-only-sum

Results: writes

Summary:

19beta1 is worse than 17.10 by 2 to 5 basis points
18.4 is worse than 17.10 by 2 to 3 basis points

Relative to Postgres 17.10

col-1 : Postgres 18.4

col-2 : Postgres 19 beta1

col-1 col-2

0.97 0.97 delete

0.99 0.96 insert

0.98 0.97 read-write_range=10

0.98 0.98 read-write_range=100

0.96 0.95 update-index

0.99 0.97 update-inlist

0.97 0.96 update-nonindex

0.97 0.95 update-one

0.97 0.95 update-zipf

0.98 0.97 write-only

Friday, April 10, 2026

MySQL 9.7.0 vs sysbench on a small server

This has results from sysbench on a small server with MySQL 9.7.0 and 8.4.8. Sysbench is run with low concurrency (1 thread) and a cached database. The purpose is to search for changes in performance, often from new CPU overheads.

I tested MySQL 9.7.0 with and without the hypergraph optimizer enabled. I don't expect it to help much because the queries run here are simple. I hope to learn it doesn't hurt performance in that case.

tl;dr

Throughput improves on two tests with the Hypergraph optimizer in 9.7.0 because they get better query plans.
One read-only test and several write-heavy tests have small regressions from 8.4.8 to 9.7.0. This might be from new CPU overheads but I don't see obvious problems in the flamegraphs.

Builds, configuration and hardware

I compiled MySQL from source for versions \8.4.8 and 9.7.0.

The server is an ASUS ExpertCenter PN53 with AMD Ryzen 7 7735HS, 32G RAM and an m.2 device for the database. More details on it are here. The OS is Ubuntu 24.04 and the database filesystem is ext4 with discard enabled.

The my.cnf files os here for 8.4. I call this the z12a configs and variants of it are used for MySQL 5.6 through 8.4.

For 9.7 I use two configs:

z13a

This is as close as possible to z12a and adds two options to undo changes to the default values for two gtid-related options that arrived in 9.6.

z13b

This is like z13a but then enables the hypergraph optimizer

All DBMS versions use the latin1 character set as explained here.

Benchmark

I used sysbench and my usage is explained here. To save time I only run 32 of the 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by InnoDB.

The tests are run using 1 table with 50M rows. The read-heavy microbenchmarks run for 600 seconds and the write-heavy for 1800 seconds.

Results

I provide tables below with relative QPS. When the relative QPS is > 1 then some version is faster than the base version. When it is < 1 then there might be a regression. The relative QPS (rQPS) is:

(QPS for some version) / (QPS for MySQL 8.4.8)

Results: point queries

I describe performance changes (changes to relative QPS, rQPS) in terms of basis points. Performance changes by one basis point when the difference in rQPS is 0.01. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

This shows the rQPS for MySQL 9.7.0 using both the z13a and z13b configs. It is relative to the throughput from MySQL 8.4.8.

Throughput with MySQL 9.7.0 is similar to 8.4.8 except for point-query where there are regressions as rQPS drops by 5 and 7 basis points. The point-query test uses simple queries that fetch one column from one row by PK. From vmstat metrics the CPU overhead per query for 9.7.0 is ~8% larger than for 8.4.8, with and without the hypergraph optimizer. I don't see anything obvious in the flamegraphs.

z13a z13b

0.99 1.01 hot-points

0.95 0.93 point-query

0.99 1.01 points-covered-pk

1.00 1.01 points-covered-si

0.98 1.00 points-notcovered-pk

0.99 1.01 points-notcovered-si

1.00 1.02 random-points_range=1000

0.99 1.01 random-points_range=100

0.96 1.00 random-points_range=10

Results: range queries without aggregation

I describe performance changes (changes to relative QPS, rQPS) in terms of basis points. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

This shows the rQPS for MySQL 9.7.0 using both the z13a and z13b configs. It is relative to the throughput from MySQL 8.4.8.

Throughput with MySQL 9.7.0 is similar to 8.4.8. I am skeptical there is a regression for the scan test with the z13b config. I suspect that is noise.

z13a z13b

0.99 0.99 range-covered-pk

0.99 0.99 range-covered-si

0.99 0.99 range-notcovered-pk

0.98 0.98 range-notcovered-si

1.00 0.96 scan

Results: range queries with aggregation

I describe performance changes (changes to relative QPS, rQPS) in terms of basis points. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

This shows the rQPS for MySQL 9.7.0 using both the z13a and z13b configs. It is relative to the throughput from MySQL 8.4.8.

There might be small regressions in several tests with rQPS dropping by a few points but I will ignore that for now.
There is a large improvement for the read-only-distinct test with the z13b config. The query for this test is select distinct c from sbtest where id between ? and ? order by c. The reason for the performance improvment is that the hypergraph optimizer chooses a better plan, see here.
There is a large improvement for the read-only test with range=10000. This test uses the read-only version of the classic sysbench transaction (see here). One of the queries it runs is the query used by read-only-distinct. So it benefits from the better plan for that query.

z13a z13b

0.97 0.97 read-only-count

0.98 1.26 read-only-distinct

0.96 0.95 read-only-order

0.99 1.15 read-only_range=10000

0.97 1.00 read-only_range=100

0.96 0.97 read-only_range=10

0.99 0.99 read-only-simple

0.97 0.96 read-only-sum

Results: writes

I describe performance changes (changes to relative QPS, rQPS) in terms of basis points. When rQPS decreases from 0.95 to 0.85 then it changed by 10 basis points.

This shows the rQPS for MySQL 9.7.0 using both the z13a and z13b configs. It is relative to the throughput from MySQL 8.4.8.

There might be several small regressions here. I don't see obvious problems in the flamegraphs.

z13a z13b

0.95 0.92 delete

1.00 1.01 insert

0.97 0.98 read-write_range=100

0.96 0.95 read-write_range=10

0.97 0.96 update-index

0.97 0.92 update-inlist

0.95 0.93 update-nonindex

0.95 0.92 update-one

0.95 0.93 update-zipf

0.97 0.95 write-only

Thursday, April 9, 2026

Sysbench vs MySQL on a small server: another way to view the regressions

This post provides another way to see the performance regressions in MySQL from versions 5.6 to 9.7. It complements what I shared in a recent post. The workload here is cached by InnoDB and my focus is on regressions from new CPU overheads.

The good news is that there are few regressions after 8.0. The bad news is that there were many prior to that and these are unlikely to be undone.

tl;dr

for point queries

there are large regressions from 5.6.51 to 5.7.44, 5.7.44 to 8.0.28 and 8.0.28 to 8.0.45
there are few regressions from 8.0.45 to 8.4.8 to 9.7.0

for range queries without aggregation

there are large regressions from 5.6.51 to 5.7.44 and 5.7.44 to 8.0.28
there are mostly small regressions from 8.0.28 to 8.0.45, but scan has a large regression
there are few regressions from 8.0.45 to 8.4.8 to 9.7.0

for range queries with aggregation

there are large regressions from 5.6.51 to 5.7.44 with two improvements
there are large regressions from 5.7.44 to 8.0.28
there are small regressions from 8.0.28 to 8.0.45
there are few regressions from 8.0.45 to 8.4.8 to 9.7.0

for writes

there are large regressions from 5.6.51 to 5.7.44 and 5.7.44 to 8.0.28
there are small regressions from 8.0.28 to 8.0.45
there are few regressions from 8.0.45 to 8.4.8
there are a few small regressions from 8.4.8 to 9.7.0

Builds, configuration and hardware

I compiled MySQL from source for versions 5.6.51, 5.7.44, 8.0.28, 8.0.45, 8.4.8 and 9.7.0.

The my.cnf files are here for 5.6, 5.7 and 8.4. I call these the z12a configs.

For 9.7 I use the z13a config. It is as close as possible to z12a and adds two options for gtid-related features to undo a default config change that arrived in 9.6.

All DBMS versions use the latin1 character set as explained here.

Benchmark

I used sysbench and my usage is explained here. To save time I only run 32 of the 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by InnoDB.

The tests are run using 1 table with 50M rows. The read-heavy microbenchmarks run for 600 seconds and the write-heavy for 1800 seconds.

Results

(QPS for some version) / (QPS for base version)

Results: point queries

MySQL 5.6.51 gets from 1.18X to 1.61X more QPS than 9.7.0 on point queries. It is easier for me to write about this in terms of relative QPS (rQPS) which is as low as 0.62 for MySQL 9.7.0 vs 5.6.51. I define a basis point to mean a change of 0.01 in rQPS.

Summary:

from 5.6.51 to 9.7.0

the median regression is a drop in rQPS of 27 basis points

from 5.6.51 to 5.7.44

the median regression is a drop in rQPS of 11 basis points

from 5.7.44 to 8.0.28

the median regression is a drop in rQPS of 25 basis points

from 8.0.28 to 8.0.45

7 of 9 tests get more QPS with 8.0.45
2 tests have regressions where rQPS drops by ~6 basis points

from 8.0.45 to 8.4.8

there are few regressions

from 8.4.8 to 9.7.0

there are few regressions

This has (QPS for 9.7.0) / (QPS for 5.6.51) and is followed by tables that show the difference between the latest point release in adjacent versions.

the largest regression is an rQPS drop of 38 basis points for point-query. Compared to most of the other tests in this section, this query does less work in the storage engine which implies the regression is from code above the storage engine.
the smallest regression is an rQPS drop of 15 basis points for random-points_range=1000. The regression for the same query with a shorter range (=10, =100) is larger. That implies, at least for this query, that the regression is for something above the storage engine (optimizer, parser, etc).
the median regression is an rQPS drop of 27 basis points

0.65 hot-points

0.62 point-query

0.72 points-covered-pk

0.78 points-covered-si

0.73 points-notcovered-pk

0.76 points-notcovered-si

0.85 random-points_range=1000

0.73 random-points_range=100

0.66 random-points_range=10

This has: (QPS for 5.7.44) / (QPS for 5.6.51)

the largest regression is an rQPS drop of 14 basis points for hot-points.
the next largest regression is an rQPS drop of 13 basis points for random-points with range=10. The regressions for that query are smaller when a larger range is used =100, =1000 and this implies the problem is above the storage engine.
the median regression is an rQPS drop of 11 basis points

0.86 hot-points

0.90 point-query

0.89 points-covered-pk

0.90 points-covered-si

0.89 points-notcovered-pk

0.88 points-notcovered-si

1.00 random-points_range=1000

0.89 random-points_range=100

0.87 random-points_range=10

This has: (QPS for 8.0.28) / (QPS for 5.7.44)

the largest regression is an rQPS drop of 66 basis points for random-points with range=1000. The regression for that same query with smaller ranges (=10, =100) is smaller. This implies the problem is in the storage engine.
the second largest regression is an rQPS drop of 35 basis points for hot-points
the median regression is an rQPS drop of 25 basis points

0.65 hot-points

0.82 point-query

0.74 points-covered-pk

0.75 points-covered-si

0.76 points-notcovered-pk

0.84 points-notcovered-si

0.34 random-points_range=1000

0.75 random-points_range=100

0.86 random-points_range=10

This has: (QPS for 8.0.45) / (QPS for 8.0.28)

at last, there are many improvements. Some are from a fix for bug 102037 which I found with help from sysbench
the regressions, with rQPS drops by ~6 basis points, are for queries that do less work in the storage engine relative to the other tests in this section

1.20 hot-points

0.93 point-query

1.13 points-covered-pk

1.19 points-covered-si

1.09 points-notcovered-pk

1.04 points-notcovered-si

2.48 random-points_range=1000

1.12 random-points_range=100

0.94 random-points_range=10

This has: (QPS for 8.4.8) / (QPS for 8.0.45)

there are few regressions from 8.0.45 to 8.4.8

0.99 hot-points

0.96 point-query

0.99 points-covered-pk

0.98 points-covered-si

1.00 points-notcovered-pk

0.99 points-notcovered-si

1.00 random-points_range=1000

1.00 random-points_range=100

0.98 random-points_range=10

This has: (QPS for 9.7.0) / (QPS for 8.4.8)

there are few regressions from 8.4.8 to 9.7.0

0.99 hot-points

0.95 point-query

0.99 points-covered-pk

1.00 points-covered-si

0.98 points-notcovered-pk

0.99 points-notcovered-si

1.00 random-points_range=1000

0.99 random-points_range=100

0.96 random-points_range=10

Results: range queries without aggregation

MySQL 5.6.51 gets from 1.35X to 1.52X more QPS than 9.7.0 on range queries without aggregation. It is easier for me to write about this in terms of relative QPS (rQPS) which is as low as 0.66 for MySQL 9.7.0 vs 5.6.51. I define a basis point to mean a change of 0.01 in rQPS.

Summary:

from 5.6.51 to 9.7.0

the median regression is drop in rQPS of 33 basis points

from 5.6.51 to 5.7.44

the median regression is a drop in rQPS of 16 basis points

from 5.7.44 to 8.0.28

the median regression is a drop in rQPS ~10 basis points

from 8.0.28 to 8.0.45

the median regression is a drop in rQPS of 5 basis points

from 8.0.45 to 8.4.8

there are few regressions from 8.0.45 to 8.4.8

from 8.4.8 to 9.7.0

there are few regressions from 8.4.8 to 9.7.0

This has (QPS for 9.7.0) / (QPS for 5.6.51) and is followed by tables that show the difference between the latest point release in adjacent versions.

all tests have large regressions with an rQPS drop that ranges from 26 to 34 basis points
the median regression is an rQPS drop of 33 basis points

0.66 range-covered-pk

0.67 range-covered-si

0.66 range-notcovered-pk

0.74 range-notcovered-si

0.67 scan

This has: (QPS for 5.7.44) / (QPS for 5.6.51)

all tests have large regressions with an rQPS drop that ranges from 12 to 17 basis points
the median regression is an rQPS drop of 16 basis points

0.85 range-covered-pk

0.84 range-covered-si

0.84 range-notcovered-pk

0.88 range-notcovered-si

0.83 scan

This has: (QPS for 8.0.28) / (QPS for 5.7.44)

4 of 5 tests have regressions with an rQPS drop that ranges from 10 to 14 basis points
the median regression is ~10 basis points
rQPS improves for the scan test

0.86 range-covered-pk

0.89 range-covered-si

0.90 range-notcovered-pk

0.90 range-notcovered-si

1.04 scan

This has: (QPS for 8.0.45) / (QPS for 8.0.28)

all tests are slower in 8.0.45 than 8.0.28, but the regression for 3 of 5 is <= 5 basis points
rQPS in the scan test drops by 21 basis points
the median regression is an rQPS drop of 5 basis points

0.96 range-covered-pk

0.95 range-covered-si

0.91 range-notcovered-pk

0.96 range-notcovered-si

0.79 scan

This has: (QPS for 8.4.8) / (QPS for 8.0.45)

there are few regressions from 8.0.45 to 8.4.8

0.95 range-covered-pk

0.95 range-covered-si

0.98 range-notcovered-pk

0.99 range-notcovered-si

0.98 scan

This has: (QPS for 9.7.0) / (QPS for 8.4.8)

there are few regressions from 8.4.8 to 9.7.0

0.99 range-covered-pk

0.99 range-covered-si

0.99 range-notcovered-pk

0.98 range-notcovered-si

1.00 scan

Results: range queries with aggregation

Summary:

from 5.6.51 to 9.7.0 rQPS

the median result is a drop in rQPS of ~30 basis points

from 5.6.51 to 5.7.44

the median result is a drop in rQPS of ~10 basis points

from 5.7.44 to 8.0.28

the median result is a drop in rQPS of ~12 basis points

from 8.0.28 to 8.0.45

the median result is an rQPS drop of 5 basis points

from 8.0.45 to 8.4.8

there are few regressions from 8.0.45 to 8.4.8

from 8.4.8 to 9.7.0

there are few regressions from 8.4.8 to 9.7.0

This has (QPS for 9.7.0) / (QPS for 5.6.51) and is followed by tables that show the difference between the latest point release in adjacent versions.

the median result is a drop in rQPS of ~30 basis points
rQPS for the read-only-distinct test improves by 25 basis point

0.67 read-only-count

1.25 read-only-distinct

0.75 read-only-order

1.02 read-only_range=10000

0.74 read-only_range=100

0.66 read-only_range=10

0.69 read-only-simple

0.66 read-only-sum

This has: (QPS for 5.7.44) / (QPS for 5.6.51)

the median result is an rQPS drop of ~10 basis points
rQPS improves by 45 basis points for read-only-distinct and by 23 basis points for read-only with the largest range (=10000)

0.86 read-only-count

1.45 read-only-distinct

0.93 read-only-order

1.23 read-only_range=10000

0.96 read-only_range=100

0.88 read-only_range=10

0.85 read-only-simple

0.86 read-only-sum

This has: (QPS for 8.0.28) / (QPS for 5.7.44)

the median result is an rQPS drop of ~12 basis points

0.91 read-only-count

0.94 read-only-distinct

0.89 read-only-order

0.86 read-only_range=10000

0.87 read-only_range=100

0.85 read-only_range=10

0.90 read-only-simple

0.87 read-only-sum

This has: (QPS for 8.0.45) / (QPS for 8.0.28)

the median result is an rQPS drop of 5 basis points

0.89 read-only-count

0.95 read-only-distinct

0.95 read-only-order

0.97 read-only_range=10000

0.94 read-only_range=100

0.95 read-only_range=10

0.93 read-only-simple

0.93 read-only-sum

This has: (QPS for 8.4.8) / (QPS for 8.0.45)

there are few regressions from 8.0.45 to 8.4.8

0.99 read-only-count

0.98 read-only-distinct

0.99 read-only-order

1.00 read-only_range=10000

0.98 read-only_range=100

0.97 read-only_range=10

0.97 read-only-simple

0.98 read-only-sum

This has: (QPS for 9.7.0) / (QPS for 8.4.8)

there are few regressions from 8.4.8 to 9.7.0

0.97 read-only-count

0.98 read-only-distinct

0.96 read-only-order

0.99 read-only_range=10000

0.97 read-only_range=100

0.96 read-only_range=10

0.99 read-only-simple

0.97 read-only-sum

Results: writes

Summary:

from 5.6.51 to 9.7.0 rQPS

the median result is a drop in rQPS of ~33 basis points

from 5.6.51 to 5.7.44

the median result is an rQPS drop of ~13 basis points

from 5.7.44 to 8.0.28

the median result is an rQPS drop of ~18 basis points

from 8.0.28 to 8.0.45

the median result is an rQPS drop of 9 basis points

from 8.0.45 to 8.4.8

there are few regressions from 8.0.45 to 8.4.8

from 8.4.8 to 9.7.0

the median result is an rQPS drop of 4 basis points

This has (QPS for 9.7.0) / (QPS for 5.6.51) and is followed by tables that show the difference between the latest point release in adjacent versions.

the median result is an rQPS drop of ~33 basis points

0.56 delete

0.54 insert

0.72 read-write_range=100

0.66 read-write_range=10

0.88 update-index

0.74 update-inlist

0.60 update-nonindex

0.58 update-one

0.60 update-zipf

0.67 write-only

This has: (QPS for 5.7.44) / (QPS for 5.6.51)

the median result is an rQPS drop of ~13 basis points
rQPS improves by 21 basis points for update-index and by 5 basis points for update-inlist

0.82 delete

0.80 insert

0.94 read-write_range=100

0.88 read-write_range=10

1.21 update-index

1.05 update-inlist

0.86 update-nonindex

0.85 update-one

0.86 update-zipf

0.94 write-only

This has: (QPS for 8.0.28) / (QPS for 5.7.44)

the median result is an rQPS drop of ~18 basis points

0.80 delete

0.77 insert

0.87 read-write_range=100

0.85 read-write_range=10

0.94 update-index

0.79 update-inlist

0.81 update-nonindex

0.80 update-one

0.81 update-zipf

0.83 write-only

This has: (QPS for 8.0.45) / (QPS for 8.0.28)

the median result is an rQPS drop of 9 basis points

0.91 delete

0.90 insert

0.94 read-write_range=100

0.94 read-write_range=10

0.80 update-index

0.92 update-inlist

0.91 update-nonindex

0.92 update-one

0.91 update-zipf

0.89 write-only

This has: (QPS for 8.4.8) / (QPS for 8.0.45)

there are few regressions from 8.0.45 to 8.4.8

0.98 delete

0.98 insert

0.98 read-write_range=100

0.98 read-write_range=10

0.99 update-index

0.99 update-inlist

0.99 update-nonindex

0.99 update-one

0.99 update-zipf

0.99 write-only

This has: (QPS for 9.7.0) / (QPS for 8.4.8)

the median result is an rQPS drop of 4 basis points

0.95 delete

1.00 insert

0.97 read-write_range=100

0.96 read-write_range=10

0.97 update-index

0.97 update-inlist

0.95 update-nonindex

0.95 update-one

0.95 update-zipf

0.97 write-only