Skip to content
  • David Teigland's avatar
    dlm: fix deadlock between dlm_send and dlm_controld · 36b71a8b
    David Teigland authored
    
    
    A deadlock sometimes occurs between dlm_controld closing
    a lowcomms connection through configfs and dlm_send looking
    up the address for a new connection in configfs.
    
    dlm_controld does a configfs rmdir which calls
    dlm_lowcomms_close which waits for dlm_send to
    cancel work on the workqueues.
    
    The dlm_send workqueue thread has called
    tcp_connect_to_sock which calls dlm_nodeid_to_addr
    which does a configfs lookup and blocks on a lock
    held by dlm_controld in the rmdir path.
    
    The solution here is to save the node addresses within
    the lowcomms code so that the lowcomms workqueue does
    not need to step through configfs to get a node address.
    
    dlm_controld:
    wait_for_completion+0x1d/0x20
    __cancel_work_timer+0x1b3/0x1e0
    cancel_work_sync+0x10/0x20
    dlm_lowcomms_close+0x4c/0xb0 [dlm]
    drop_comm+0x22/0x60 [dlm]
    client_drop_item+0x26/0x50 [configfs]
    configfs_rmdir+0x180/0x230 [configfs]
    vfs_rmdir+0xbd/0xf0
    do_rmdir+0x103/0x120
    sys_rmdir+0x16/0x20
    
    dlm_send:
    mutex_lock+0x2b/0x50
    get_comm+0x34/0x140 [dlm]
    dlm_nodeid_to_addr+0x18/0xd0 [dlm]
    tcp_connect_to_sock+0xf4/0x2d0 [dlm]
    process_send_sockets+0x1d2/0x260 [dlm]
    worker_thread+0x170/0x2a0
    
    Signed-off-by: default avatarDavid Teigland <teigland@redhat.com>
    36b71a8b