    jffs2: Fix lock acquisition order bug in gc path · 226bb7df
    Josh Cartwright authored
    The locking policy is that the erase_completion_lock spinlock nests
    inside the alloc_sem mutex, so alloc_sem must be taken first.  This
    fixes a case in the garbage collection path where the acquisition
    order was erroneously reversed.  The issue was caught by the
    following lockdep splat:
    
       =======================================================
       [ INFO: possible circular locking dependency detected ]
       3.0.5 #1
       -------------------------------------------------------
       jffs2_gcd_mtd6/299 is trying to acquire lock:
        (&c->alloc_sem){+.+.+.}, at: [<c01f7714>] jffs2_garbage_collect_pass+0x314/0x890
    
       but task is already holding lock:
        (&(&c->erase_completion_lock)->rlock){+.+...}, at: [<c01f7708>] jffs2_garbage_collect_pass+0x308/0x890
    
       which lock already depends on the new lock.
    
       the existing dependency chain (in reverse order) is:
    
       -> #1 (&(&c->erase_completion_lock)->rlock){+.+...}:
              [<c008bec4>] validate_chain+0xe6c/0x10bc
              [<c008c660>] __lock_acquire+0x54c/0xba4
              [<c008d240>] lock_acquire+0xa4/0x114
              [<c046780c>] _raw_spin_lock+0x3c/0x4c
              [<c01f744c>] jffs2_garbage_collect_pass+0x4c/0x890
              [<c01f937c>] jffs2_garbage_collect_thread+0x1b4/0x1cc
              [<c0071a68>] kthread+0x98/0xa0
              [<c000f264>] kernel_thread_exit+0x0/0x8
    
       -> #0 (&c->alloc_sem){+.+.+.}:
              [<c008ad2c>] print_circular_bug+0x70/0x2c4
              [<c008c08c>] validate_chain+0x1034/0x10bc
              [<c008c660>] __lock_acquire+0x54c/0xba4
              [<c008d240>] lock_acquire+0xa4/0x114
              [<c0466628>] mutex_lock_nested+0x74/0x33c
              [<c01f7714>] jffs2_garbage_collect_pass+0x314/0x890
              [<c01f937c>] jffs2_garbage_collect_thread+0x1b4/0x1cc
              [<c0071a68>] kthread+0x98/0xa0
              [<c000f264>] kernel_thread_exit+0x0/0x8
    
       other info that might help us debug this:
    
        Possible unsafe locking scenario:
    
              CPU0                    CPU1
              ----                    ----
         lock(&(&c->erase_completion_lock)->rlock);
                                      lock(&c->alloc_sem);
                                      lock(&(&c->erase_completion_lock)->rlock);
         lock(&c->alloc_sem);
    
        *** DEADLOCK ***
    
       1 lock held by jffs2_gcd_mtd6/299:
        #0:  (&(&c->erase_completion_lock)->rlock){+.+...}, at: [<c01f7708>] jffs2_garbage_collect_pass+0x308/0x890
    
       stack backtrace:
       [<c00155dc>] (unwind_backtrace+0x0/0x100) from [<c0463dc0>] (dump_stack+0x20/0x24)
       [<c0463dc0>] (dump_stack+0x20/0x24) from [<c008ae84>] (print_circular_bug+0x1c8/0x2c4)
       [<c008ae84>] (print_circular_bug+0x1c8/0x2c4) from [<c008c08c>] (validate_chain+0x1034/0x10bc)
       [<c008c08c>] (validate_chain+0x1034/0x10bc) from [<c008c660>] (__lock_acquire+0x54c/0xba4)
       [<c008c660>] (__lock_acquire+0x54c/0xba4) from [<c008d240>] (lock_acquire+0xa4/0x114)
       [<c008d240>] (lock_acquire+0xa4/0x114) from [<c0466628>] (mutex_lock_nested+0x74/0x33c)
       [<c0466628>] (mutex_lock_nested+0x74/0x33c) from [<c01f7714>] (jffs2_garbage_collect_pass+0x314/0x890)
       [<c01f7714>] (jffs2_garbage_collect_pass+0x314/0x890) from [<c01f937c>] (jffs2_garbage_collect_thread+0x1b4/0x1cc)
       [<c01f937c>] (jffs2_garbage_collect_thread+0x1b4/0x1cc) from [<c0071a68>] (kthread+0x98/0xa0)
       [<c0071a68>] (kthread+0x98/0xa0) from [<c000f264>] (kernel_thread_exit+0x0/0x8)
    
    The reversed ordering was introduced in commit 81cfc9f1
    ('jffs2: Fix serious write stall due to erase').
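
The ordering rule described above can be illustrated with a minimal userspace sketch. This is not the patch itself: `struct sketch_sb`, `sketch_gc_step` and the `pending_erases` counter are hypothetical names invented for illustration, and both locks are modelled as pthread mutexes (in the kernel, erase_completion_lock is a spinlock). The only point it makes is that the outer lock (alloc_sem) is always taken before the inner one and released after it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the jffs2 per-filesystem state.
 * Only the two locks and one piece of protected data are modelled. */
struct sketch_sb {
    pthread_mutex_t alloc_sem;             /* outer lock (a mutex in the kernel) */
    pthread_mutex_t erase_completion_lock; /* inner lock (a spinlock in the kernel) */
    int pending_erases;                    /* illustrative data guarded by the inner lock */
};

/* One illustrative "GC step": takes the locks in the documented order
 * (alloc_sem first, then erase_completion_lock) and releases them in
 * the reverse order. Returns 1 if it consumed a pending erase, else 0. */
int sketch_gc_step(struct sketch_sb *c)
{
    int did_work = 0;

    pthread_mutex_lock(&c->alloc_sem);             /* outer lock first */
    pthread_mutex_lock(&c->erase_completion_lock); /* inner lock second */

    if (c->pending_erases > 0) {
        c->pending_erases--;
        did_work = 1;
    }

    pthread_mutex_unlock(&c->erase_completion_lock); /* release inner first */
    pthread_mutex_unlock(&c->alloc_sem);             /* then the outer lock */

    return did_work;
}
```

If every path follows this order, the CPU0/CPU1 interleaving shown in the splat cannot occur, because no task ever waits for alloc_sem while holding erase_completion_lock.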
    
    Cc: stable@kernel.org [2.6.37+]
    Signed-off-by: Josh Cartwright <joshc@linux.com>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>