1. 19 Apr, 2015 1 commit
  2. 21 Nov, 2014 1 commit
    • ipc: always handle a new value of auto_msgmni · f2f25589
      Andrey Vagin authored
      commit 1195d94e006b23c6292e78857e154872e33b6d7e upstream.
      
      proc_dointvec_minmax() returns zero if a new value has been set.  So we
      don't need to check that all characters have been handled.
      
      Below you can find two examples.  In the first one, the new value has
      not been handled properly.
      
      $ strace ./a.out
      open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
      write(3, "0\n\0", 3)                    = 2
      close(3)                                = 0
      exit_group(0)
      $ cat /sys/kernel/debug/tracing/trace
      
      $ strace ./a.out
      open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
      write(3, "0\n", 2)                      = 2
      close(3)                                = 0
      
      $ cat /sys/kernel/debug/tracing/trace
      a.out-697   [000] ....  3280.998235: unregister_ipcns_notifier <-proc_ipcauto_dointvec_minmax
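      
      The difference between the two traces can be modelled in user space.
      The sketch below is not the kernel code: parse_int_model(),
      handled_old() and handled_new() are hypothetical names, and the
      parser only approximates how an integer sysctl handler stops at the
      first byte it does not understand.  It assumes the old handler acted
      on a write only when every byte had been consumed, while the new one
      relies on the zero return code alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * User-space model only: parse a leading decimal integer from buf and
 * report how many bytes were consumed (the number plus one trailing
 * newline, if present).  Returns 0 on success, -1 if no number is found.
 */
static int parse_int_model(const char *buf, size_t len, int *val, size_t *consumed)
{
	char tmp[32];
	char *end;
	size_t n = len < sizeof(tmp) - 1 ? len : sizeof(tmp) - 1;
	long v;

	assert(buf && val && consumed);
	memcpy(tmp, buf, n);
	tmp[n] = '\0';
	v = strtol(tmp, &end, 10);
	if (end == tmp)
		return -1;
	if (*end == '\n')
		end++;
	*val = (int)v;
	*consumed = (size_t)(end - tmp);
	return 0;
}

/* Old-style check: the write counts only if every byte was consumed. */
static bool handled_old(const char *buf, size_t len)
{
	int val;
	size_t consumed;

	return parse_int_model(buf, len, &val, &consumed) == 0 && consumed == len;
}

/* New-style check: a zero return code alone means the value was set. */
static bool handled_new(const char *buf, size_t len)
{
	int val;
	size_t consumed;

	return parse_int_model(buf, len, &val, &consumed) == 0;
}
```

      With the three-byte buffer "0\n\0" from the first trace, the old-style
      check rejects the write because only two bytes are consumed, while
      the new-style check accepts it.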
      
      Fixes: 9eefe520 ("ipc: do not use a negative value to re-enable msgmni automatic recomputin")
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Cc: Mathias Krause <minipli@googlemail.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 24 Mar, 2014 1 commit
    • ipc: Fix 2 bugs in msgrcv() MSG_COPY implementation · 0f814126
      Michael Kerrisk authored
      commit 4f87dac386cc43d5525da7a939d4b4e7edbea22c upstream.
      
      While testing and documenting the msgrcv() MSG_COPY flag that Stanislav
      Kinsbursky added in commit 4a674f34 ("ipc: introduce message queue
      copy feature" => kernel 3.8), I discovered a couple of bugs in the
      implementation.  The two bugs concern MSG_COPY interactions with other
      msgrcv() flags, namely:
      
       (A) MSG_COPY + MSG_EXCEPT
       (B) MSG_COPY + !IPC_NOWAIT
      
      The bugs are distinct (and the fix for the first one is obvious),
      however my fix for both is a single-line patch, which is why I'm
      combining them in a single mail, rather than writing two mails+patches.
      
       ===== (A) MSG_COPY + MSG_EXCEPT =====
      
      With the addition of the MSG_COPY flag, there are now two msgrcv()
      flags--MSG_COPY and MSG_EXCEPT--that modify the meaning of the 'msgtyp'
      argument in unrelated ways.  Specifying both in the same call is a
      logical error that is currently permitted, with the effect that MSG_COPY
      has priority and MSG_EXCEPT is ignored.  The call should give an error
      if both flags are specified.  The patch below implements that behavior.
      
       ===== (B) MSG_COPY + !IPC_NOWAIT =====
      
      The test code that was submitted in commit 3a665531 ("selftests: IPC
      message queue copy feature test") shows MSG_COPY being used in
      conjunction with IPC_NOWAIT.  In other words, if there is no message at
      the position 'msgtyp', return immediately with the error ENOMSG.
      
      What was not (fully) tested is the behavior if MSG_COPY is specified
      *without* IPC_NOWAIT, and there is an odd behavior.  If the queue
      contains fewer than 'msgtyp' messages, then the call blocks until the
      next message is written to the queue.  At that point, the msgrcv() call
      returns a copy of the newly added message, regardless of whether that
      message is at the ordinal position 'msgtyp'.  This is clearly bogus, and
      problematic for applications that might want to make use of the MSG_COPY
      flag.
      
      I considered the following possible solutions to this problem:
      
       (1) Force the call to block until a message *does* appear at the
           position 'msgtyp'.
      
       (2) If the MSG_COPY flag is specified, the kernel should implicitly add
           IPC_NOWAIT, so that the call fails with ENOMSG for this case.
      
       (3) If the MSG_COPY flag is specified, but IPC_NOWAIT is not, generate
           an error (probably, EINVAL is the right one).
      
      I do not know if any application would really want to have the
      functionality of solution (1), especially since an application can
      determine in advance the number of messages in the queue using msgctl()
      IPC_STAT.  Obviously, this solution would be the most work to implement.
      
      Solution (2) would have the effect of silently fixing any applications
      that tried to employ broken behavior.  However, it would mean that if we
      later decided to implement solution (1), then user-space could not
      easily detect what the kernel supports (but, since I'm somewhat doubtful
      that solution (1) is needed, I'm not sure that this is much of a
      problem).
      
      Solution (3) would have the effect of informing broken applications that
      they are doing something broken.  The downside is that this would cause
      an ABI breakage for any applications that are currently employing the
      broken behavior.  However:
      
      a) Those applications are almost certainly not getting the results they
         expect.
      b) Possibly, those applications don't even exist, because MSG_COPY is
         currently hidden behind CONFIG_CHECKPOINT_RESTORE.
      
      The upside of solution (3) is that if we later decided to implement
      solution (1), user-space could determine what the kernel supports, via
      the error return.
      
      In my view, solution (3) is mildly preferable to solution (2), and
      solution (1) could still be done later if anyone really cares.  The
      patch below implements solution (3).
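      
      The combined rule for (A) and solution (3) can be sketched as a
      user-space flag check.  This is an illustration under stated
      assumptions, not the patch itself: check_msg_copy_flags() is a
      hypothetical helper, and the fallback values for MSG_EXCEPT and
      MSG_COPY are the Linux values, supplied only for C libraries that do
      not expose them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Fallbacks for C libraries that do not expose these Linux-specific flags. */
#ifndef MSG_EXCEPT
#define MSG_EXCEPT 020000
#endif
#ifndef MSG_COPY
#define MSG_COPY 040000
#endif

/*
 * Hypothetical helper: returns 0 if the msgrcv() flag combination is
 * acceptable under the rules described above, -EINVAL otherwise.
 */
static int check_msg_copy_flags(int msgflg)
{
	if (msgflg & MSG_COPY) {
		/* (A): MSG_COPY and MSG_EXCEPT both reinterpret 'msgtyp'. */
		if (msgflg & MSG_EXCEPT)
			return -EINVAL;
		/* (B), solution (3): MSG_COPY requires IPC_NOWAIT. */
		if (!(msgflg & IPC_NOWAIT))
			return -EINVAL;
	}
	return 0;
}
```

      Flag sets that do not include MSG_COPY pass through unchanged, so
      existing MSG_EXCEPT users are unaffected by this rule.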
      
      PS.  For anyone out there still listening, it's the usual story:
      documenting an API (and the thinking about, and the testing of the API,
      that documentation entails) is one of the single best ways of
      finding bugs in the API, as I've learned from a lot of experience.  Best
      to do that documentation before releasing the API.
      
      Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Cc: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. 07 Mar, 2014 1 commit
  5. 04 Dec, 2013 5 commits
    • audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record · 24dccf86
      Jeff Layton authored
      commit 79f6530cb59e2a0af6953742a33cc29e98ca631c upstream.
      
      The old audit PATH records for mq_open looked like this:
      
        type=PATH msg=audit(1366282323.982:869): item=1 name=(null) inode=6777
        dev=00:0c mode=041777 ouid=0 ogid=0 rdev=00:00
        obj=system_u:object_r:tmpfs_t:s15:c0.c1023
        type=PATH msg=audit(1366282323.982:869): item=0 name="test_mq" inode=26732
        dev=00:0c mode=0100700 ouid=0 ogid=0 rdev=00:00
        obj=staff_u:object_r:user_tmpfs_t:s15:c0.c1023
      
      ...with the audit related changes that went into 3.7, they now look like this:
      
        type=PATH msg=audit(1366282236.776:3606): item=2 name=(null) inode=66655
        dev=00:0c mode=0100700 ouid=0 ogid=0 rdev=00:00
        obj=staff_u:object_r:user_tmpfs_t:s15:c0.c1023
        type=PATH msg=audit(1366282236.776:3606): item=1 name=(null) inode=6926
        dev=00:0c mode=041777 ouid=0 ogid=0 rdev=00:00
        obj=system_u:object_r:tmpfs_t:s15:c0.c1023
        type=PATH msg=audit(1366282236.776:3606): item=0 name="test_mq"
      
      Both of these look wrong to me.  As Steve Grubb pointed out:
      
       "What we need is 1 PATH record that identifies the MQ.  The other PATH
        records probably should not be there."
      
      Fix it to record the mq root as a parent, and flag it such that it
      should be hidden from view when the names are logged, since the root of
      the mq filesystem isn't terribly interesting.  With this change, we get
      a single PATH record that looks more like this:
      
        type=PATH msg=audit(1368021604.836:484): item=0 name="test_mq" inode=16914
        dev=00:0c mode=0100644 ouid=0 ogid=0 rdev=00:00
        obj=unconfined_u:object_r:user_tmpfs_t:s0
      
      In order to do this, a new audit_inode_parent_hidden() function is
      added.  If we do it this way, then we avoid having the existing callers
      of audit_inode needing to do any sort of flag conversion if auditing is
      inactive.
      
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reported-by: Jiri Jaburek <jjaburek@redhat.com>
      Cc: Steve Grubb <sgrubb@redhat.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipc/sem.c: synchronize semop and semctl with IPC_RMID · 873be93b
      Manfred Spraul authored
      commit 6e224f94597842c5eb17f1fc2208d20b6f7f7d49 upstream.
      
      After acquiring the semlock spinlock, operations must test that the
      array is still valid.
      
       - semctl() and exit_sem() would walk stale linked lists (ugly, but
         should be ok: all lists are empty)
      
       - semtimedop() would sleep forever - and if woken up due to a signal -
         access memory after free.
      
      The patch also:
       - standardizes the tests for .deleted, so that all tests in one
         function leave the function with the same approach.
       - unconditionally tests for .deleted immediately after every call to
         sem_lock - even if it means that for semctl(GETALL), .deleted will be
         tested twice.
      
      Both changes make the review simpler: After every sem_lock, there must
      be a test of .deleted, followed by a goto to the cleanup code (if the
      function uses "goto cleanup").
      
      The only exception is semctl_down(): If sem_ids().rwsem is locked, then
      the presence in ids->ipcs_idr is equivalent to !.deleted, thus no
      additional test is required.
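      
      The "lock, test .deleted, goto cleanup" shape described above can be
      sketched in user space.  Everything here is a stand-in: struct
      sem_array_model, sem_getval_model() and sem_rmid_model() are
      hypothetical names, and a pthread mutex takes the place of the
      kernel's sem_lock().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a semaphore array; not the kernel structure. */
struct sem_array_model {
	pthread_mutex_t lock;
	bool deleted;	/* set once the array has been removed */
	int value;
};

/* Read the value, following the pattern: lock, test .deleted, goto cleanup. */
static int sem_getval_model(struct sem_array_model *sma, int *out)
{
	int err;

	assert(sma != NULL && out != NULL);
	pthread_mutex_lock(&sma->lock);
	/* Test .deleted immediately after taking the lock. */
	if (sma->deleted) {
		err = -EIDRM;
		goto out_unlock;
	}
	*out = sma->value;
	err = 0;
out_unlock:
	pthread_mutex_unlock(&sma->lock);
	return err;
}

/* Model of removal: mark the array as deleted while holding the lock. */
static void sem_rmid_model(struct sem_array_model *sma)
{
	pthread_mutex_lock(&sma->lock);
	sma->deleted = true;
	pthread_mutex_unlock(&sma->lock);
}
```

      Keeping a single exit label per function is what makes the rule easy
      to review: every lock acquisition is followed by exactly one test and
      one jump.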
      
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Acked-by: Davidlohr Bueso <davidlohr@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipc: update locking scheme comments · 3f47cff8
      Davidlohr Bueso authored
      commit 18ccee263c7e250a57f01c9434658f11f4118a64 upstream.
      
      The initial documentation was a bit incomplete, update accordingly.
      
      [akpm@linux-foundation.org: make it more readable in 80 columns]
      Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
      Acked-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipc, msg: forbid negative values for "msg{max,mnb,mni}" · a12503f5
      Mathias Krause authored
      commit 9bf76ca325d5e9208eb343f7bd4cc666f703ed30 upstream.
      
      Negative message lengths make no sense -- so don't do negative queue
      lengths or identifier counts. Prevent them from getting negative.
      
      Also change the underlying data types to be unsigned to avoid hairy
      surprises with sign extensions in cases where those variables get
      evaluated in unsigned expressions with bigger data types, e.g. size_t.
      
      In case a user still wants to have "unlimited" sizes she could just use
      INT_MAX instead.
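      
      The accepted range can be illustrated with a small user-space model.
      set_msg_limit_model() is a hypothetical helper, not a kernel
      interface; it only shows the rule stated above: values from 0 to
      INT_MAX are stored in an unsigned variable, and anything else is
      refused without changing the current limit.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical setter: accept only values in [0, INT_MAX] and store them
 * in an unsigned variable.  On rejection the old limit is left untouched.
 */
static int set_msg_limit_model(unsigned int *limit, long requested)
{
	assert(limit != NULL);
	if (requested < 0 || requested > INT_MAX)
		return -EINVAL;
	*limit = (unsigned int)requested;
	return 0;
}
```

      A caller that previously wrote -1 to mean "unlimited" would pass
      INT_MAX instead.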
      
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipc, msg: fix message length check for negative values · 4b825b95
      Mathias Krause authored
      commit 4e9b45a19241354daec281d7a785739829b52359 upstream.
      
      On 64 bit systems the test for negative message sizes is bogus as the
      size, which may be positive when evaluated as a long, will get truncated
      to an int when passed to load_msg().  So a long might very well contain a
      positive value but when truncated to an int it would become negative.
      
      That in combination with a small negative value of msg_ctlmax (which will
      be promoted to an unsigned type for the comparison against msgsz, making
      it a big positive value and therefore making it pass the check) will lead to
      two problems: 1/ The kmalloc() call in alloc_msg() will allocate a too
      small buffer as the addition of alen is effectively a subtraction.  2/ The
      copy_from_user() call in load_msg() will first overflow the buffer with
      userland data and then, when the userland access generates an access
      violation, the fixup handler copy_user_handle_tail() will try to fill the
      remainder with zeros -- roughly 4GB.  That almost instantly results in a
      system crash or reset.
      
        ,-[ Reproducer (needs to be run as root) ]--
        | #include <sys/stat.h>
        | #include <sys/msg.h>
        | #include <unistd.h>
        | #include <fcntl.h>
        |
        | int main(void) {
        |     long msg = 1;
        |     int fd;
        |
        |     fd = open("/proc/sys/kernel/msgmax", O_WRONLY);
        |     write(fd, "-1", 2);
        |     close(fd);
        |
        |     msgsnd(0, &msg, 0xfffffff0, IPC_NOWAIT);
        |
        |     return 0;
        | }
        '---
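      
      The two conversions behind the bug can be reproduced safely in user
      space, without touching any sysctl.  The helpers below are
      hypothetical and only mirror the arithmetic described above.
      Narrowing an out-of-range value to int is implementation-defined in
      C; the sketch assumes the usual wrap-around behaviour of GCC and
      Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Old-style limit check: comparing a size_t against an int converts the
 * int to an unsigned type, so a negative limit turns into a huge one.
 */
static bool exceeds_limit_signed(size_t msgsz, int ctlmax)
{
	return msgsz > (size_t)ctlmax;	/* same conversion the compiler applies */
}

/* Narrowing the length to int, as happened when it was passed on. */
static int truncate_to_int(size_t msgsz)
{
	return (int)msgsz;	/* implementation-defined for large values */
}
```

      With a limit of -1 and the length 0xfffffff0 used by the reproducer,
      exceeds_limit_signed() reports no violation, and truncate_to_int()
      yields a negative length on the assumed compilers.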
      
      Fix the issue by preventing msgsz from getting truncated by consistently
      using size_t for the message length.  This way the size checks in
      do_msgsnd() could still be passed with a negative value for msg_ctlmax but
      we would fail on the buffer allocation in that case and error out.
      
      Also change the type of m_ts from int to size_t to avoid similar nastiness
      in other code paths -- it is used in similar constructs, i.e.  signed vs.
      unsigned checks.  It should never become negative under normal
      circumstances, though.
      
      Setting msg_ctlmax to a negative value is an odd configuration and should
      be prevented.  As that might break existing userland, it will be handled
      in a separate commit so it could easily be reverted and reworked without
      reintroducing the above described bug.
      
      Hardening mechanisms for user copy operations would have caught that bug
      early -- e.g.  checking slab object sizes on user copy operations as the
      usercopy feature of the PaX patch does.  Or, for that matter, detect the
      long vs.  int sign change due to truncation, as the size overflow plugin
      of the very same patch does.
      
      [akpm@linux-foundation.org: fix i386 min() warnings]
      Signed-off-by: Mathias Krause <minipli@googlemail.com>
      Cc: Pax Team <pageexec@freemail.hu>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Brad Spengler <spender@grsecurity.net>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  6. 29 Nov, 2013 2 commits
    • ipc,shm: fix shm_file deletion races · b444df2f
      Greg Thelen authored
      commit a399b29dfbaaaf91162b2dc5a5875dd51bbfa2a1 upstream.
      
      When IPC_RMID races with other shm operations there's potential for
      use-after-free of the shm object's associated file (shm_file).
      
      Here's the race before this patch:
      
        TASK 1                     TASK 2
        ------                     ------
        shm_rmid()
          ipc_lock_object()
                                   shmctl()
                                   shp = shm_obtain_object_check()
      
          shm_destroy()
            shm_unlock()
            fput(shp->shm_file)
                                   ipc_lock_object()
                                   shmem_lock(shp->shm_file)
                                   <OOPS>
      
      The oops is caused because shm_destroy() calls fput() after dropping the
      ipc_lock.  fput() clears the file's f_inode, f_path.dentry, and
      f_path.mnt, which causes various NULL pointer references in task 2.  I
      reliably see the oops in task 2 if with shmlock, shmu
      
      This patch fixes the races by:
      1) set shm_file=NULL in shm_destroy() while holding ipc_lock_object().
      2) modify at risk operations to check shm_file while holding
         ipc_lock_object().
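      
      The two steps can be modelled in user space.  The names below
      (struct shm_model, shm_destroy_model(), shm_lock_model()) are
      hypothetical, a pthread mutex stands in for the ipc object lock, and
      the -EIDRM return value is an assumption made for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a shm segment; not the kernel structure. */
struct shm_model {
	pthread_mutex_t lock;
	void *shm_file;	/* NULL once destruction has begun */
};

/*
 * Step 1: detach the file pointer while holding the lock.  The caller
 * releases the returned file after the lock has been dropped.
 */
static void *shm_destroy_model(struct shm_model *shp)
{
	void *file;

	assert(shp != NULL);
	pthread_mutex_lock(&shp->lock);
	file = shp->shm_file;
	shp->shm_file = NULL;
	pthread_mutex_unlock(&shp->lock);
	return file;
}

/* Step 2: an at-risk operation checks shm_file while holding the lock. */
static int shm_lock_model(struct shm_model *shp)
{
	int err = 0;

	assert(shp != NULL);
	pthread_mutex_lock(&shp->lock);
	if (shp->shm_file == NULL)
		err = -EIDRM;	/* segment is being torn down */
	pthread_mutex_unlock(&shp->lock);
	return err;
}
```

      Because both the write and the check happen under the same lock, an
      operation either sees a valid pointer or sees NULL; it never sees a
      pointer to a file that has already been released.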
      
      Example workloads, which each trigger oops...
      
      Workload 1:
        while true; do
          id=$(shmget 1 4096)
          shm_rmid $id &
          shmlock $id &
          wait
        done
      
        The oops stack shows accessing NULL f_inode due to racing fput:
          _raw_spin_lock
          shmem_lock
          SyS_shmctl
      
      Workload 2:
        while true; do
          id=$(shmget 1 4096)
          shmat $id 4096 &
          shm_rmid $id &
          wait
        done
      
        The oops stack is similar to workload 1 due to NULL f_inode:
          touch_atime
          shmem_mmap
          shm_mmap
          mmap_region
          do_mmap_pgoff
          do_shmat
          SyS_shmat
      
      Workload 3:
        while true; do
          id=$(shmget 1 4096)
          shmlock $id
          shm_rmid $id &
          shmunlock $id &
          wait
        done
      
        The oops stack shows the second fput tripping on a NULL f_inode.  The
        first fput() completed via shm_destroy(), but a racing thread did
        a get_file() and queued this fput():
          locks_remove_flock
          __fput
          ____fput
          task_work_run
          do_notify_resume
          int_signal
      
      Fixes: c2c737a0461e ("ipc,shm: shorten critical region for shmat")
      Fixes: 2caacaa82a51 ("ipc,shm: shorten critical region for shmctl")
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipc,shm: correct error return value in shmctl (SHM_UNLOCK) · c6cac65e
      Jesper Nilsson authored
      commit 3a72660b07d86d60457ca32080b1ce8c2b628ee2 upstream.
      
      Commit 2caacaa82a51 ("ipc,shm: shorten critical region for shmctl")
      restructured the ipc shm code to shorten the critical region, but
      introduced a path where the return value could be -EPERM, even if the
      operation actually was performed.
      
      Before the commit, the err return value was reset by the return value
      from security_shm_shmctl() after the if (!ns_capable(...)) statement.
      
      Now, we still exit the if statement with err set to -EPERM, and in the
      case of SHM_UNLOCK, it is not reset at all, and used as the return value
      from shmctl.
      
      To fix this, we only set err when errors occur, leaving the fallthrough
      case alone.
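      
      The control-flow problem can be shown with a reduced user-space
      model.  Both functions are hypothetical: 'capable' stands for the
      ns_capable() result and 'is_owner' for the ownership test.  They
      illustrate the shape of the bug and of the fix, not the actual
      shmctl() code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Buggy shape: err is set to -EPERM up front and never reset on success. */
static int shm_unlock_buggy(bool capable, bool is_owner)
{
	int err = 0;

	if (!capable) {
		err = -EPERM;
		if (!is_owner)
			goto out;
	}
	/* ... the unlock operation is performed here ... */
out:
	return err;
}

/* Fixed shape: err is set only on the path that really fails. */
static int shm_unlock_fixed(bool capable, bool is_owner)
{
	int err = 0;

	if (!capable) {
		if (!is_owner) {
			err = -EPERM;
			goto out;
		}
	}
	/* ... the unlock operation is performed here ... */
out:
	return err;
}
```

      For an owner without the capability, the buggy shape performs the
      operation and still reports -EPERM, while the fixed shape returns 0.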
      
      Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. 18 Oct, 2013 29 commits