    mm: numa: return the number of base pages altered by protection changes · f99510dc
    Mel Gorman authored
    
    
    commit 72403b4a0fbdf433c1fe0127e49864658f6f6468 upstream.
    
    Commit 0255d4918480 ("mm: Account for a THP NUMA hinting update as one
    PTE update") was added to account for the number of PTE updates when
    marking pages prot_numa.  task_numa_work was using the old return value
    to track how much address space had been updated.  Altering the return
    value causes the scanner to do more work than it is configured or
    documented to in a single unit of work.
    
    This patch reverts that commit and accounts for the number of THP
    updates separately in vmstat.  It is up to the administrator to
    interpret the pair of values correctly.  This is a straightforward
    operation and likely to only be of interest when actively debugging NUMA
    balancing problems.
    
    The impact of this patch is that the NUMA PTE scanner will scan slower
    when THP is enabled and workloads may converge slower as a result.  On
    the flip side, system CPU usage should be lower than recent tests
    reported.  This is an illustrative example of a short single JVM specjbb
    test:
    
    specjbb
                           3.12.0                3.12.0
                          vanilla      acctupdates
    TPut 1      26143.00 (  0.00%)     25747.00 ( -1.51%)
    TPut 7     185257.00 (  0.00%)    183202.00 ( -1.11%)
    TPut 13    329760.00 (  0.00%)    346577.00 (  5.10%)
    TPut 19    442502.00 (  0.00%)    460146.00 (  3.99%)
    TPut 25    540634.00 (  0.00%)    549053.00 (  1.56%)
    TPut 31    512098.00 (  0.00%)    519611.00 (  1.47%)
    TPut 37    461276.00 (  0.00%)    474973.00 (  2.97%)
    TPut 43    403089.00 (  0.00%)    414172.00 (  2.75%)
    
                      3.12.0       3.12.0
                     vanilla  acctupdates
    User             5169.64      5184.14
    System            100.45        80.02
    Elapsed           252.75       251.85
    
    Performance is similar but note the reduction in system CPU time.  While
    this showed a performance gain, it will not be universal but at least
    it'll be behaving as documented.  The vmstats are necessarily different;
    here is an interpretation of them from mmtests.
    
                                     3.12.0       3.12.0
                                    vanilla  acctupdates
    NUMA page range updates         1408326     11043064
    NUMA huge PMD updates                 0        21040
    NUMA PTE updates                1408326       291624
    
    "NUMA page range updates" == nr_pte_updates and is the value returned to
    the NUMA pte scanner.  "NUMA huge PMD updates" is the number of THP
    updates.  In combination, the two values can be used from userspace to
    calculate how many ptes were updated.
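
    As a rough illustration of how the pair of values combines, the sketch
    below reproduces the figures in the table above.  The function name and
    the per-THP factor are assumptions for illustration, not part of the
    patch or of mmtests.  A 2MB THP spans 512 base pages with 4K pages, but
    the numbers in the table line up with a factor of 511, which would be
    consistent with each huge update also being counted once in the PTE
    figure.

```python
# Hypothetical helper: combine the reported "NUMA PTE updates" and
# "NUMA huge PMD updates" into a "NUMA page range updates" figure.
# The default factor of 511 is an assumption inferred from the table,
# not a value taken from the kernel or mmtests sources.
def page_range_updates(pte_updates, huge_pmd_updates, pages_per_huge=511):
    return pte_updates + huge_pmd_updates * pages_per_huge


# Figures from the table: (PTE updates, huge PMD updates).
vanilla = page_range_updates(1408326, 0)
acctupdates = page_range_updates(291624, 21040)

print("vanilla:    ", vanilla)
print("acctupdates:", acctupdates)
```

    With the vanilla kernel no huge PMD updates are reported, so the two
    PTE-related counters are identical.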
    
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Reported-by: Alex Thorlton <athorlton@sgi.com>
    Reviewed-by: Rik van Riel <riel@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>