Skip to content
  • Davidlohr Bueso's avatar
    rwsem: check counter to avoid cmpxchg calls · 9607a85b
    Davidlohr Bueso authored
    This patch tries to reduce the amount of cmpxchg calls in the writer
    failed path by checking the counter value first before issuing the
    instruction.  If ->count is not set to RWSEM_WAITING_BIAS then there is
    no point wasting a cmpxchg call.
    
    Furthermore, Michel states "I suppose it helps due to the case where
    someone else steals the lock while we're trying to acquire
    sem->wait_lock."
    
    Two very different workloads and machines were used to see how this
    patch improves throughput: pgbench on a quad-core laptop and aim7 on a
    large 8 socket box with 80 cores.
    
    Some results comparing Michel's fast-path write lock stealing
    (tps-rwsem) on a quad-core laptop running pgbench:
    
      | db_size | clients  |  tps-rwsem     |   tps-patch  |
      +---------+----------+----------------+--------------+
      | 160 MB   |       1 |           6906 |         9153 | + 32.5
      | 160 MB   |       2 |          15931 |        22487 | + 41.1%
      | 160 MB   |       4 |          33021 |        32503 |
      | 160 MB   |       8 |          34626 |        34695 |
      | 160 MB   |      16 |          33098 |        34003 |
      | 160 MB   |      20 |          31343 |        31440 |
      | 160 MB   |      30 |          28961 |        28987 |
      | 160 MB   |      40 |          26902 |        26970 |
      | 160 MB   |      50 |          25760 |        25810 |
      ------------------------------------------------------
      | 1.6 GB   |       1 |           7729 |         7537 |
      | 1.6 GB   |       2 |          19009 |        23508 | + 23.7%
      | 1.6 GB   |       4 |          33185 |        32666 |
      | 1.6 GB   |       8 |          34550 |        34318 |
      | 1.6 GB   |      16 |          33079 |        32689 |
      | 1.6 GB   |      20 |          31494 |        31702 |
      | 1.6 GB   |      30 |          28535 |        28755 |
      | 1.6 GB   |      40 |          27054 |        27017 |
      | 1.6 GB   |      50 |          25591 |        25560 |
      ------------------------------------------------------
      | 7.6 GB   |       1 |           6224 |         7469 | + 20.0%
      | 7.6 GB   |       2 |          13611 |        12778 |
      | 7.6 GB   |       4 |          33108 |        32927 |
      | 7.6 GB   |       8 |          34712 |        34878 |
      | 7.6 GB   |      16 |          32895 |        33003 |
      | 7.6 GB   |      20 |          31689 |        31974 |
      | 7.6 GB   |      30 |          29003 |        28806 |
      | 7.6 GB   |      40 |          26683 |        26976 |
      | 7.6 GB   |      50 |          25925 |        25652 |
      ------------------------------------------------------
    
    For the aim7 worloads, they overall improved on top of Michel's
    patchset.  For full graphs on how the rwsem series plus this patch
    behaves on a large 8 socket machine against a vanilla kernel:
    
      http://stgolabs.net/rwsem-aim7-results.tar.gz
    
    
    
    Signed-off-by: default avatarDavidlohr Bueso <davidlohr.bueso@hp.com>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    9607a85b