如何提升MySQL的扩展性以满足BenchmarkSQL TPC-C测试?

MySQL 5.7的当前状态

MySQL 5.7在可扩展性方面并不理想。以下图表说明了在特定配置下,MySQL 5.7.39的TPC-C吞吐量与并发性之间的关系。这包括将事务隔离级别设置为Read Committed并调整innodb_spin_wait_delay参数以减轻吞吐量下降。

图1:BenchmarkSQL测试中MySQL 5.7.39的可扩展性问题

从图表中可以看出,可扩展性问题显著限制了MySQL吞吐量的增加。例如,在100个并发之后,吞吐量开始下降。

为了解决上述性能崩溃问题,使用了Percona的线程池。以下图表说明了在配置Percona线程池后,TPC-C吞吐量与并发性之间的关系。

图2:Percona线程池缓解了MySQL 5.7.39的可扩展性问题

尽管线程池引入了一些开销,并且峰值性能有所下降,但它缓解了高并发下的性能崩溃问题。

MySQL 8.0的当前状态

让我们看看MySQL 8.0在可扩展性方面所做的努力。

重做日志优化

第一个主要改进是重做日志优化[3]。

C++

 

commit 6be2fa0bdbbadc52cc8478b52b69db02b0eaff40
Author: Paweł Olchawa <[email protected]>
Date:   Wed Feb 14 09:33:42 2018 +0100
    
    WL#10310 Redo log optimization: dedicated threads and concurrent log buffer.

    0. Log buffer became a ring buffer, data inside is no longer shifted.
    1. User threads are able to write concurrently to log buffer. 
    2. Relaxed order of dirty pages in flush lists - no need to synchronize
       the order in which dirty pages are added to flush lists.
    3. Concurrent MTR commits can interleave on different stages of commits.
    4. Introduced dedicated log threads which keep writing log buffer: 
        * log_writer: writes log buffer to system buffers,
        * log_flusher: flushes system buffers to disk.
       As soon as they finished writing (flushing) and there is new data to 
       write (flush), they start next write (flush).
    5. User threads no longer write / flush log buffer to disk, they only
       wait by spinning or on event for notification. They do not have to 
       compete for the responsibility of writing / flushing.
    6. Introduced a ring buffer of events (one per log-block) which are used
       by user threads to wait for written/flushed redo log to avoid:
        * contention on single event
        * false wake-ups of all waiting threads whenever some write/flush
          has finished (we can wake-up only those waiting in related blocks)
    7. Introduced dedicated notifier threads not to delay next writes/fsyncs:
        * log_write_notifier: notifies user threads about written redo,
        * log_flush_notifier: notifies user threads about flushed redo.
    8. Master thread no longer has to flush log buffer.
    ...
    30. Mysql test runner received a new feature (thanks to Marcin):
        --exec_in_background.
    Review: RB#15134
    Reviewers:
        - Marcin Babij <[email protected]>,
        - Debarun Banerjee <[email protected]>.
    Performance tests:
        - Dimitri Kravtchuk <[email protected]>,
        - Daniel Blanchard <[email protected]>,
        - Amrendra Kumar <[email protected]>.
    QA and MTR tests:
        - Vinay Fisrekar <[email protected]>.

进行了一项测试,比较了优化前后不同并发级别下的TPC-C吞吐量。具体的细节显示在以下图表中:

图3:不同并发级别下重做日志优化的影响

图中的结果显示,在低并发级别下,吞吐量有显著提高。

通过锁定系统优化锁

第二个主要的改进是锁定系统优化 [5]。

C++

 

commit 1d259b87a63defa814e19a7534380cb43ee23c48
Author: Jakub Łopuszański <[email protected]>
Date:   Wed Feb 5 14:12:22 2020 +0100
    
    WL#10314 - InnoDB: Lock-sys optimization: sharded lock_sys mutex

    The Lock-sys orchestrates access to tables and rows. Each table, and each row,
    can be thought of as a resource, and a transaction may request access right for
    a resource. As two transactions operating on a single resource can lead to
    problems if the two operations conflict with each other, Lock-sys remembers
    lists of already GRANTED lock requests and checks new requests for conflicts in
    which case they have to start WAITING for their turn.
    
    Lock-sys stores both GRANTED and WAITING lock requests in lists known as queues.
    To allow concurrent operations on these queues, we need a mechanism to latch
    these queues in safe and quick fashion.
    
    In the past a single latch protected access to all of these queues.
    This scaled poorly, and the managment of queues become a bottleneck.
    In this WL, we introduce a more granular approach to latching.
    
    Reviewed-by: Pawel Olchawa <[email protected]>
    Reviewed-by: Debarun Banerjee <[email protected]>
      RB:23836

基于优化锁定系统前后的程序,使用BenchmarkSQL比较不同并发级别下的TPC-C吞吐量,具体结果如下所示:

图4:不同并发级别下锁定系统优化的影响

从图中可以看出,在高并发条件下,优化锁定系统显著提高了吞吐量,而在低并发条件下,由于冲突较少,效果不那么明显。

为trx-sys实现闩锁分片

第三个主要的改进是为trx-sys实现闩锁分片。

C++

 

commit bc95476c0156070fd5cedcfd354fa68ce3c95bdb
Author: Paweł Olchawa <[email protected]>
Date:   Tue May 25 18:12:20 2021 +0200
    
    BUG#32832196 SINGLE RW_TRX_SET LEADS TO CONTENTION ON TRX_SYS MUTEX
    
    1. Introduced shards, each with rw_trx_set and dedicated mutex.
    2. Extracted modifications to rw_trx_set outside its original critical sections
       (removal had to be extracted outside trx_erase_lists).
    3. Eliminated allocation on heap inside TrxUndoRsegs.
    4. [BUG-FIX] The trx->state and trx->start_time became converted to std::atomic<>
       fields to avoid risk of torn reads on egzotic platforms.
    5. Added assertions which ensure that thread operating on transaction has rights
       to do so (to show there is no possible race condition).
    
    RB: 26314
    Reviewed-by: Jakub Łopuszański [email protected]

基于这些优化前后的测试,使用BenchmarkSQL比较不同并发级别下的TPC-C吞吐量,具体结果如下所示:

图5:不同并发级别下trx-sys中闩锁分片的影响

从图中可以看出,这一改进显著提高了TPC-C吞吐量,在200个并发时达到峰值。值得注意的是,在300个并发时,影响减弱,这主要是由于与MVCC ReadView相关的trx-sys子系统存在的可扩展性问题。

精炼MySQL 8.0

其他的改进是我们的独立增强。

MVCC读视图改进

第一个主要的改进是对MVCC读视图数据结构[1]的增强。

进行了性能比较测试,以评估MVCC读视图优化的有效性。下图显示了在修改MVCC读视图数据结构前后,不同并发级别下的TPC-C吞吐量比较。

图6:采用新的混合数据结构前后NUMA性能比较

从图中可以看出,这种变换主要优化了可扩展性,并在NUMA环境中提高了MySQL的峰值吞吐量。

避免双重锁问题

我们做的第二个主要改进是解决双重锁问题,这里的“双重锁”是指view_openview_close both requiring the global trx-sys latch [1]。

使用优化后的MVCC读视图,比较修改前后的TPC-C吞吐量。具体细节如下图中显示:

图7:消除双重锁瓶颈后的性能提升

从图中可以看出,在高并发条件下,这些修改显著提高了可扩展性。

事务节流机制

最后的改进是实现了一个事务节流机制,以防止在极端并发下性能崩溃[1] [2] [4]。

以下图示了在实施事务限流后进行的TPC-C可扩展性压力测试。测试在禁用NUMA BIOS的场景下进行,限制最多512个用户线程进入事务系统。

图8:实施事务限流机制后的BenchmarkSQL最大TPC-C吞吐量

从图中可以看出,实施事务限流机制显著提高了MySQL的可扩展性。

总结

总体而言,在低冲突的BenchmarkSQL TPC-C测试场景中,MySQL完全能够在数万个并发连接下保持性能不崩溃。

参考文献

  1. 王斌(2024年)。软件工程中的问题解决艺术:如何使MySQL更佳。
  2. 新的MySQL线程池
  3. Paweł Olchawa。2018年。《MySQL 8.0:新的无锁、可扩展的WAL设计》。MySQL博客存档
  4. 于祥曜。一千个核心的并发控制评估。博士论文,麻省理工学院,2015年。
  5. MySQL 8.0参考手册

Source:
https://dzone.com/articles/mysql-scalability-improvement-for-benchmarksql