1. 08 Dec, 2006 2 commits
  2. 07 Dec, 2006 2 commits
  3. 02 Oct, 2006 1 commit
  4. 01 Oct, 2006 2 commits
    • Andi Kleen's avatar
      [PATCH] Support piping into commands in /proc/sys/kernel/core_pattern · d025c9db
      Andi Kleen authored
      
      
      Using the infrastructure created in previous patches implement support to
      pipe core dumps into programs.
      
      This is done by overloading the existing core_pattern sysctl
      with a new syntax:
      
      |program
      
      When the first character of the pattern is a '|' the kernel will instead
      threat the rest of the pattern as a command to run.  The core dump will be
      written to the standard input of that program instead of to a file.
      
      This is useful for having automatic core dump analysis without filling up
      disks.  The program can do some simple analysis and save only a summary of
      the core dump.
      
      The core dump proces will run with the privileges and in the name space of
      the process that caused the core dump.
      
      I also increased the core pattern size to 128 bytes so that longer command
      lines fit.
      
      Most of the changes comes from allowing core dumps without seeks.  They are
      fairly straight forward though.
      
      One small incompatibility is that if someone had a core pattern previously
      that started with '|' they will get suddenly new behaviour.  I think that's
      unlikely to be a real problem though.
      
      Additional background:
      
      > Very nice, do you happen to have a program that can accept this kind of
      > input for crash dumps?  I'm guessing that the embedded people will
      > really want this functionality.
      
      I had a cheesy demo/prototype.  Basically it wrote the dump to a file again,
      ran gdb on it to get a backtrace and wrote the summary to a shared directory.
      Then there was a simple CGI script to generate a "top 10" crashes HTML
      listing.
      
      Unfortunately this still had the disadvantage to needing full disk space for a
      dump except for deleting it afterwards (in fact it was worse because over the
      pipe holes didn't work so if you have a holey address map it would require
      more space).
      
      Fortunately gdb seems to be happy to handle /proc/pid/fd/xxx input pipes as
      cores (at least it worked with zsh's =(cat core) syntax), so it would be
      likely possible to do it without temporary space with a simple wrapper that
      calls it in the right way.  I ran out of time before doing that though.
      
      The demo prototype scripts weren't very good.  If there is really interest I
      can dig them out (they are currently on a laptop disk on the desk with the
      laptop itself being in service), but I would recommend to rewrite them for any
      serious application of this and fix the disk space problem.
      
      Also to be really useful it should probably find a way to automatically fetch
      the debuginfos (I cheated and just installed them in advance).  If nobody else
      does it I can probably do the rewrite myself again at some point.
      
      My hope at some point was that desktops would support it in their builtin
      crash reporters, but at least the KDE people I talked too seemed to be happy
      with their user space only solution.
      
      Alan sayeth:
      
        I don't believe that piping as such as neccessarily the right model, but
        the ability to intercept and processes core dumps from user space is asked
        for by many enterprise users as well.  They want to know about, capture,
        analyse and process core dumps, often centrally and in automated form.
      
      [akpm@osdl.org: loff_t != unsigned long]
      Signed-off-by: default avatarAndi Kleen <ak@suse.de>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      d025c9db
    • Jay Lan's avatar
      [PATCH] csa: convert CONFIG tag for extended accounting routines · 8f0ab514
      Jay Lan authored
      
      
      There were a few accounting data/macros that are used in CSA but are #ifdef'ed
      inside CONFIG_BSD_PROCESS_ACCT.  This patch is to change those ifdef's from
      CONFIG_BSD_PROCESS_ACCT to CONFIG_TASK_XACCT.  A few defines are moved from
      kernel/acct.c and include/linux/acct.h to kernel/tsacct.c and
      include/linux/tsacct_kern.h.
      Signed-off-by: default avatarJay Lan <jlan@sgi.com>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Jes Sorensen <jes@sgi.com>
      Cc: Chris Sturtivant <csturtiv@sgi.com>
      Cc: Tony Ernst <tee@sgi.com>
      Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      8f0ab514
  5. 29 Sep, 2006 1 commit
  6. 27 Sep, 2006 2 commits
  7. 27 Aug, 2006 1 commit
    • Dave Jones's avatar
      [PATCH] fix up lockdep trace in fs/exec.c · 513627d7
      Dave Jones authored
      
      
      This fixes the locking error noticed by lockdep:
      
        =============================================
        [ INFO: possible recursive locking detected ]
        ---------------------------------------------
        init/1 is trying to acquire lock:
         (&sighand->siglock){....}, at: [<c047a78a>] flush_old_exec+0x3ae/0x859
      
        but task is already holding lock:
         (&sighand->siglock){....}, at: [<c047a77a>] flush_old_exec+0x39e/0x859
      
        other info that might help us debug this:
        2 locks held by init/1:
         #0:  (tasklist_lock){..--}, at: [<c047a76a>] flush_old_exec+0x38e/0x859
         #1:  (&sighand->siglock){....}, at: [<c047a77a>] flush_old_exec+0x39e/0x859
      
        stack backtrace:
         [<c04051e1>] show_trace_log_lvl+0x54/0xfd
         [<c040579d>] show_trace+0xd/0x10
         [<c04058b6>] dump_stack+0x19/0x1b
         [<c043b33a>] __lock_acquire+0x773/0x997
         [<c043bacf>] lock_acquire+0x4b/0x6c
         [<c060630b>] _spin_lock+0x19/0x28
         [<c047a78a>] flush_old_exec+0x3ae/0x859
         [<c0498053>] load_elf_binary+0x4aa/0x1628
         [<c0479cab>] search_binary_handler+0xa7/0x24e
         [<c047b577>] do_execve+0x15b/0x1f9
         [<c04022b4>] sys_execve+0x29/0x4d
         [<c0403faf>] syscall_call+0x7/0xb
      Signed-off-by: default avatarArjan van de Ven <arjan@infradead.org>
      Signed-off-by: default avatarDave Jones <davej@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      513627d7
  8. 24 Aug, 2006 2 commits
  9. 30 Jun, 2006 1 commit
  10. 26 Jun, 2006 8 commits
  11. 22 Jun, 2006 1 commit
    • Miklos Szeredi's avatar
      [PATCH] remove steal_locks() · c89681ed
      Miklos Szeredi authored
      
      
      This patch removes the steal_locks() function.
      
      steal_locks() doesn't work correctly with any filesystem that does it's own
      lock management, including NFS, CIFS, etc.
      
      In addition it has weird semantics on local filesystems in case tasks
      sharing file-descriptor tables are doing POSIX locking operations in
      parallel to execve().
      
      The steal_locks() function has an effect on applications doing:
      
      clone(CLONE_FILES)
        /* in child */
        lock
        execve
        lock
      
      POSIX locks acquired before execve (by "child", "parent" or any further
      task sharing files_struct) will after the execve be owned exclusively by
      "child".
      
      According to Chris Wright some LSB/LTP kind of suite triggers without the
      stealing behavior, but there's no known real-world application that would
      also fail.
      
      Apps using NPTL are not affected, since all other threads are killed before
      execve.
      
      Apps using LinuxThreads are only affected if they
      
        - have multiple threads during exec (LinuxThreads doesn't kill other
          threads, the app may do it with pthread_kill_other_threads_np())
        - rely on POSIX locks being inherited across exec
      
      Both conditions are documented, but not their interaction.
      
      Apps using clone() natively are affected if they
      
        - use clone(CLONE_FILES)
        - rely on POSIX locks being inherited across exec
      
      The above scenarios are unlikely, but possible.
      
      If the patch is vetoed, there's a plan B, that involves mostly keeping the
      weird stealing semantics, but changing the way lock ownership is handled so
      that network and local filesystems work consistently.
      
      That would add more complexity though, so this solution seems to be
      preferred by most people.
      Signed-off-by: default avatarMiklos Szeredi <miklos@szeredi.hu>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Matthew Wilcox <willy@debian.org>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Steven French <sfrench@us.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      c89681ed
  12. 20 Jun, 2006 1 commit
  13. 19 Apr, 2006 1 commit
  14. 14 Apr, 2006 1 commit
    • Eric W. Biederman's avatar
      [PATCH] de_thread: Don't change our parents and ptrace flags. · c06511d1
      Eric W. Biederman authored
      
      
      This is two distinct changes.
       - Not changing our real parents.
       - Not changing our ptrace parents.
      
      Not changing our real parents is trivially correct because both tasks
      have the same real parents as they are part of a thread group.  Now that
      we demote the leader to a thread there is no longer any reason to change
      it's parentage.
      
      Not changing our ptrace parents is a user visible change if someone
      looks hard enough.  I don't think user space applications will care or
      even notice.
      
      In the practical and I think common case a debugger will have attached
      to all of the threads using the same ptrace flags.  From my quick skim
      of strace and gdb that appears to be the case.  Which if true means
      debuggers will not notice a change.
      
      Before this point we have already generated a ptrace event in do_exit
      that reports the leaders pid has died so de_thread is visible to a
      debugger.  Which means attempting to hide this case by copying flags
      around appears excessive.
      
      By not doing anything it avoids all of the weird locking issues between
      de_thread and ptrace attach, and removes one case from consideration for
      fixing the ptrace locking.
      
      This only addresses Oleg's first concern with ptrace_attach, that of the
      problems caused by reparenting.  Oleg's second concern is essentially a
      race between ptrace_attach and release_task that causes an oops when we
      get to force_sig_specific.  There is nothing special about de_thread
      with respect to that race.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      c06511d1
  15. 11 Apr, 2006 1 commit
  16. 10 Apr, 2006 1 commit
    • Eric W. Biederman's avatar
      [PATCH] de_thread: Don't confuse users do_each_thread. · de12a787
      Eric W. Biederman authored
      
      
      Oleg Nesterov spotted two interesting bugs with the current de_thread
      code.  The simplest is a long standing double decrement of
      __get_cpu_var(process_counts) in __unhash_process.  Caused by
      two processes exiting when only one was created.
      
      The other is that since we no longer detach from the thread_group list
      it is possible for do_each_thread when run under the tasklist_lock to
      see the same task_struct twice.  Once on the task list as a
      thread_group_leader, and once on the thread list of another
      thread.
      
      The double appearance in do_each_thread can cause a double increment
      of mm_core_waiters in zap_threads resulting in problems later on in
      coredump_wait.
      
      To remedy those two problems this patch takes the simple approach
      of changing the old thread group leader into a child thread.
      The only routine in release_task that cares is __unhash_process,
      and it can be trivially seen that we handle cleaning up a
      thread group leader properly.
      
      Since de_thread doesn't change the pid of the exiting leader process
      and instead shares it with the new leader process.  I change
      thread_group_leader to recognize group leadership based on the
      group_leader field and not based on pids.  This should also be
      slightly cheaper then the existing thread_group_leader macro.
      
      I performed a quick audit and I couldn't see any user of
      thread_group_leader that cared about the difference.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      de12a787
  17. 31 Mar, 2006 1 commit
  18. 29 Mar, 2006 5 commits
    • Oleg Nesterov's avatar
      [PATCH] convert sighand_cache to use SLAB_DESTROY_BY_RCU · aa1757f9
      Oleg Nesterov authored
      
      
      This patch borrows a clever Hugh's 'struct anon_vma' trick.
      
      Without tasklist_lock held we can't trust task->sighand until we locked it
      and re-checked that it is still the same.
      
      But this means we don't need to defer 'kmem_cache_free(sighand)'.  We can
      return the memory to slab immediately, all we need is to be sure that
      sighand->siglock can't dissapear inside rcu protected section.
      
      To do so we need to initialize ->siglock inside ctor function,
      SLAB_DESTROY_BY_RCU does the rest.
      Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      aa1757f9
    • Oleg Nesterov's avatar
      [PATCH] remove add_parent()'s parent argument · 8fafabd8
      Oleg Nesterov authored
      
      
      add_parent(p, parent) is always called with parent == p->parent, and it makes
      no sense to do it differently.  This patch removes this argument.
      
      No changes in affected .o files.
      Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      8fafabd8
    • Eric W. Biederman's avatar
      [PATCH] pidhash: kill switch_exec_pids · d73d6529
      Eric W. Biederman authored
      
      
      switch_exec_pids is only called from de_thread by way of exec, and it is
      only called when we are exec'ing from a non thread group leader.
      
      Currently switch_exec_pids gives the leader the pid of the thread and
      unhashes and rehashes all of the process groups.  The leader is already in
      the EXIT_DEAD state so no one cares about it's pids.  The only concern for
      the leader is that __unhash_process called from release_task will function
      correctly.  If we don't touch the leader at all we know that
      __unhash_process will work fine so there is no need to touch the leader.
      
      For the task becomming the thread group leader, we just need to give it the
      pid of the old thread group leader, add it to the task list, and attach it
      to the session and the process group of the thread group.
      
      Currently de_thread is also adding the task to the task list which is just
      silly.
      
      Currently the only leader of __detach_pid besides detach_pid is
      switch_exec_pids because of the ugly extra work that was being
      performed.
      
      So this patch removes switch_exec_pids because it is doing too much, it is
      creating an unnecessary special case in pid.c, duing work duplicated in
      de_thread, and generally obscuring what it is going on.
      
      The necessary work is added to de_thread, and it seems to be a little
      clearer there what is going on.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Kirill Korotaev <dev@sw.ru>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      d73d6529
    • Oleg Nesterov's avatar
      [PATCH] simplify exec from init's subthread · 1434261c
      Oleg Nesterov authored
      
      
      I think it is enough to take tasklist_lock for reading while changing
      child_reaper:
      
      	Reparenting needs write_lock(tasklist_lock)
      
      	Only one thread in a thread group can do exec()
      
      	sighand->siglock garantees that get_signal_to_deliver()
      	will not see a stale value of child_reaper.
      
      This means that we can change child_reaper earlier, without calling
      zap_other_threads() twice.
      
      "child_reaper = current" is a NOOP when init does exec from main thread, we
      don't care.
      Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      1434261c
    • Eric W. Biederman's avatar
      [PATCH] exec: allow init to exec from any thread. · fef23e7f
      Eric W. Biederman authored
      
      
      After looking at the problem of init calling exec some more I figured out
      an easy way to make the code work.
      
      The actual symptom without out this patch is that all threads will die
      except pid == 1, and the thread calling exec.  The thread calling exec will
      wait forever for pid == 1 to die.
      
      Since pid == 1 does not install a handler for SIGKILL it will never die.
      
      This modifies the tests for init from current->pid == 1 to the equivalent
      current == child_reaper.  And then it causes exec in the ugly case to
      modify child_reaper.
      
      The only weird symptom is that you wind up with an init process that
      doesn't have the oldest start time on the box.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      fef23e7f
  19. 26 Mar, 2006 1 commit
  20. 25 Mar, 2006 2 commits
  21. 01 Mar, 2006 1 commit
  22. 15 Feb, 2006 1 commit
  23. 19 Jan, 2006 1 commit
    • Ulrich Drepper's avatar
      [PATCH] vfs: *at functions: core · 5590ff0d
      Ulrich Drepper authored
      
      
      Here is a series of patches which introduce in total 13 new system calls
      which take a file descriptor/filename pair instead of a single file
      name.  These functions, openat etc, have been discussed on numerous
      occasions.  They are needed to implement race-free filesystem traversal,
      they are necessary to implement a virtual per-thread current working
      directory (think multi-threaded backup software), etc.
      
      We have in glibc today implementations of the interfaces which use the
      /proc/self/fd magic.  But this code is rather expensive.  Here are some
      results (similar to what Jim Meyering posted before).
      
      The test creates a deep directory hierarchy on a tmpfs filesystem.  Then
      rm -fr is used to remove all directories.  Without syscall support I get
      this:
      
      real    0m31.921s
      user    0m0.688s
      sys     0m31.234s
      
      With syscall support the results are much better:
      
      real    0m20.699s
      user    0m0.536s
      sys     0m20.149s
      
      The interfaces are for obvious reasons currently not much used.  But they'll
      be used.  coreutils (and Jeff's posixutils) are already using them.
      Furthermore, code like ftw/fts in libc (maybe even glob) will also start using
      them.  I expect a patch to make follow soon.  Every program which is walking
      the filesystem tree will benefit.
      Signed-off-by: default avatarUlrich Drepper <drepper@redhat.com>
      Signed-off-by: default avatarAlexey Dobriyan <adobriyan@gmail.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Cc: Michael Kerrisk <mtk-manpages@gmx.net>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      5590ff0d