  1. 23 Jun, 2006 2 commits
    • [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry · 726c3342
      David Howells authored
      
      
      Give the statfs superblock operation a dentry pointer rather than a superblock
      pointer.
      
      This complements the get_sb() patch.  That reduced the significance of
      sb->s_root, allowing NFS to place a fake root there.  However, NFS does
      require a dentry to use as a target for the statfs operation.  This permits
      the root in the vfsmount to be used instead.
      
      linux/mount.h has been added where necessary to make allyesconfig build
      successfully.
      
      Interest has also been expressed for use with the FUSE and XFS filesystems.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
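The statfs interface change in 726c3342 can be illustrated with a minimal userspace sketch. The structs below are simplified stand-ins, not the kernel definitions; `foo_statfs_old`, `foo_statfs` and the `f_root_name` field are hypothetical names used only to show that a dentry argument identifies which tree was queried while still reaching the superblock through `d_sb`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for kernel types; only the fields used here. */
struct super_block { long s_blocksize; };
struct dentry { struct super_block *d_sb; const char *d_name; };
/* f_root_name is a hypothetical field, added only for illustration. */
struct kstatfs { long f_bsize; const char *f_root_name; };

/* Old style: only the superblock is known to the filesystem. */
int foo_statfs_old(struct super_block *sb, struct kstatfs *buf)
{
    assert(sb && buf);
    buf->f_bsize = sb->s_blocksize;
    buf->f_root_name = NULL;    /* cannot tell which tree was asked about */
    return 0;
}

/* New style: a dentry is passed; the superblock is reachable via d_sb. */
int foo_statfs(struct dentry *dentry, struct kstatfs *buf)
{
    assert(dentry && dentry->d_sb && buf);
    buf->f_bsize = dentry->d_sb->s_blocksize;
    buf->f_root_name = dentry->d_name;  /* the queried tree is now known */
    return 0;
}
```

With a shared superblock, two mounts can pass different dentries to the same operation and get answers specific to their own root.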
    • [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells authored
      
      
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience functions is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees, which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing is currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
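The two ways of instantiating a vfsmount described in 454e2398 can be sketched in userspace as follows. This is a mock under stated assumptions: the structs carry only the fields named in the message (`mnt_sb`, `mnt_root`, `s_root`), `dget()` is reduced to a reference-count increment, and `foo_get_sb` / `foo_get_sb_shared` are hypothetical filesystem functions with a simplified argument list rather than the full get_sb() signature.

```c
#include <assert.h>
#include <stddef.h>

/* Mock types holding only what the sketch needs. */
struct super_block;
struct dentry { int d_count; struct super_block *d_sb; };
struct super_block { struct dentry *s_root; };
struct vfsmount { struct super_block *mnt_sb; struct dentry *mnt_root; };

/* Mock of dget(): take a reference on a dentry. */
struct dentry *dget(struct dentry *d)
{
    assert(d);
    d->d_count++;
    return d;
}

/* Default behaviour: root the mount at the superblock's s_root. */
int simple_set_mnt(struct vfsmount *mnt, struct super_block *sb)
{
    assert(mnt && sb);
    mnt->mnt_sb = sb;
    mnt->mnt_root = dget(sb->s_root);
    return 0;
}

/* Common case (hypothetical filesystem): tail-call simple_set_mnt(). */
int foo_get_sb(struct super_block *sb, struct vfsmount *mnt)
{
    return simple_set_mnt(mnt, sb);
}

/* Shared-superblock case (hypothetical): set mnt_sb and mnt_root directly,
 * so the mount is rooted at a subtree instead of s_root. */
int foo_get_sb_shared(struct super_block *sb, struct dentry *subtree_root,
                      struct vfsmount *mnt)
{
    assert(sb && subtree_root && mnt);
    assert(subtree_root->d_sb == sb);
    mnt->mnt_sb = sb;
    mnt->mnt_root = dget(subtree_root);
    return 0;
}
```

Both variants return an integer status rather than a superblock pointer, which matches the new get_sb() convention.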
  2. 26 Apr, 2006 2 commits
  3. 11 Apr, 2006 6 commits
    • [fuse] fix deadlock between fuse_put_super() and request_end() · 73ce8355
      Miklos Szeredi authored
      
      
      A deadlock was possible, when the last reference to the superblock was
      held due to a background request containing a file reference.
      
      Releasing the file would release the vfsmount which in turn would
      release the superblock.  Since sbput_sem is held during the fput() and
      fuse_put_super() tries to acquire this same semaphore, a deadlock
      results.
      
      The chosen solution is to get rid of sbput_sem, and instead use the
      spinlock to ensure the referenced inodes/file are released only once.
      Since the actual release may sleep, defer these outside the locked
      region, but using local variables instead of the structure members.
      
      This is a much more robust solution.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
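The locking pattern described in 73ce8355 (claim the reference under the lock, release it outside) can be sketched in userspace. A pthread mutex stands in for the kernel spinlock, and `struct request`, `struct file_ref`, `release_file` and `request_release_file` are hypothetical names, not the FUSE code itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for a file reference and a request holding one. */
struct file_ref { int released; };
struct request {
    pthread_mutex_t *lock;   /* stands in for the kernel spinlock */
    struct file_ref *file;   /* reference to drop exactly once */
};

/* In the real case this may sleep, so it must not run under the lock. */
void release_file(struct file_ref *f)
{
    assert(f);
    f->released++;
}

void request_release_file(struct request *req)
{
    struct file_ref *file;

    assert(req && req->lock);

    /* Under the lock: move the pointer into a local and clear the member,
     * so concurrent or repeated callers find nothing left to release. */
    pthread_mutex_lock(req->lock);
    file = req->file;
    req->file = NULL;
    pthread_mutex_unlock(req->lock);

    /* Outside the lock: do the release that may sleep. */
    if (file)
        release_file(file);
}
```

Calling `request_release_file()` twice releases the file only once, because the second call sees a NULL member.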
    • [PATCH] fuse: account background requests · 08a53cdc
      Miklos Szeredi authored
      
      
      The previous patch removed the limit on the number of outstanding requests.
      This patch adds a much simpler limit that is also compatible with file
      locking operations.
      
      A task may have at most one synchronous request allocated.  So these requests
      need not be otherwise limited.
      
      However the number of background requests (release, forget, asynchronous
      reads, interrupted requests) can grow indefinitely.  This can be used by a
      malicious user to cause FUSE to allocate arbitrary amounts of unswappable
      kernel memory, denying service.
      
      For this reason add a limit for the number of background requests, and block
      allocations of new requests until the number goes below the limit.
      
      Also use this mechanism to block all requests until the INIT reply is
      received.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
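The limiting scheme in 08a53cdc can be sketched with a counter and a condition variable. This is a userspace approximation, not the kernel code: `struct conn`, its field names and every function below are hypothetical, and the limit value is left to the caller.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical per-connection state. */
struct conn {
    pthread_mutex_t lock;
    pthread_cond_t unblocked;
    unsigned num_background;   /* background requests in flight */
    unsigned max_background;   /* limit on background requests */
    int initialized;           /* set once the INIT reply has arrived */
};

/* New allocations are blocked before INIT completes and while the
 * background count is at or above the limit. Caller holds c->lock. */
int conn_blocked(const struct conn *c)
{
    return !c->initialized || c->num_background >= c->max_background;
}

/* Called before allocating any new request. */
void conn_wait_unblocked(struct conn *c)
{
    pthread_mutex_lock(&c->lock);
    while (conn_blocked(c))
        pthread_cond_wait(&c->unblocked, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Called when the INIT reply is received. */
void conn_init_done(struct conn *c)
{
    pthread_mutex_lock(&c->lock);
    c->initialized = 1;
    if (!conn_blocked(c))
        pthread_cond_broadcast(&c->unblocked);
    pthread_mutex_unlock(&c->lock);
}

/* A request has become a background request. */
void background_start(struct conn *c)
{
    pthread_mutex_lock(&c->lock);
    c->num_background++;
    pthread_mutex_unlock(&c->lock);
}

/* A background request has finished; wake waiters if below the limit. */
void background_end(struct conn *c)
{
    pthread_mutex_lock(&c->lock);
    assert(c->num_background > 0);
    c->num_background--;
    if (!conn_blocked(c))
        pthread_cond_broadcast(&c->unblocked);
    pthread_mutex_unlock(&c->lock);
}
```

Using one predicate for both conditions mirrors the message's point that the same mechanism also holds back all requests until INIT has been answered.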
    • [PATCH] fuse: clean up request accounting · ce1d5a49
      Miklos Szeredi authored
      
      
      FUSE allocated most requests from a fixed size pool filled at mount time.
      However in some cases (release/forget) non-pool requests were used.  File
      locking operations aren't well served by the request pool, since they may
      block indefinitely, thus exhausting the pool.
      
      This patch removes the request pool and always allocates requests on demand.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fuse: use a per-mount spinlock · d7133114
      Miklos Szeredi authored
      
      
      Remove the global spinlock in favor of a per-mount one.
      
      This patch is basically find & replace.  The difficult part has already been
      done by the previous patch.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fuse: simplify locking · 0720b315
      Miklos Szeredi authored
      
      
      This is in preparation for removing the global spinlock in favor of a
      per-mount one.
      
      The only critical part is the interaction between fuse_dev_release() and
      fuse_fill_super(): fuse_dev_release() must see the assignment to
      file->private_data, otherwise it will leak the reference to fuse_conn.
      
      This is ensured by the fput() operation, which will synchronize the assignment
      with other CPUs that may do a final fput() soon after this.
      
      Also redundant locking is removed from fuse_fill_super(), where exclusion is
      already ensured by the BKL held for this function by the VFS.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fuse: add O_ASYNC support to FUSE device · 385a17bf
      Jeff Dike authored
      
      
      This adds asynchronous notification to FUSE - a FUSE server can request
      O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input
      available.
      
      One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested,
      does no locking, unlike the other methods.  I think it's unnecessary, as the
      fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync,
      which provide their own locking.  It would also be wrong to use the fuse_lock,
      as it's a spin lock and fasync_helper can sleep.  My one concern with this is
      the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a
      reference on the file struct, so this seems not to be a problem.
      Signed-off-by: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  4. 01 Feb, 2006 1 commit
    • [PATCH] fuse: fix async read for legacy filesystems · 9cd68455
      Miklos Szeredi authored
      
      
      While asynchronous reads mean a performance improvement in most cases, if
      the filesystem assumed that reads are synchronous, then async reads may
      degrade performance (filesystem may receive reads out of order, which can
      confuse its own readahead logic).
      
      With sshfs a 1.5 to 4 times slowdown can be measured.
      
      There's also a need for userspace filesystems to know whether asynchronous
      reads are supported by the kernel or not.
      
      To achieve both, negotiate in the INIT request whether async reads will be
      used and the maximum readahead value.  Update interface version to 7.6.
      
      If userspace uses a version earlier than 7.6, then disable async reads, and
      set maximum readahead value to the maximum read size, as done in previous
      versions.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
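The negotiation rule in 9cd68455 can be sketched as a function that applies an INIT reply to connection settings. The struct layouts, the `SKETCH_ASYNC_READ` flag and all function names are assumptions made for illustration and do not reproduce the real protocol structures; only the 7.6 threshold and the fallback behaviour come from the message.

```c
#include <assert.h>

/* Hypothetical flag bit meaning "userspace accepts asynchronous reads". */
#define SKETCH_ASYNC_READ (1u << 0)

/* Hypothetical, simplified view of the reply to INIT. */
struct init_reply {
    unsigned major;
    unsigned minor;
    unsigned max_readahead;
    unsigned flags;
};

/* Settings the kernel side ends up using for this connection. */
struct conn_settings {
    int async_read;
    unsigned max_readahead;
};

static int reply_at_least_7_6(const struct init_reply *r)
{
    return r->major > 7 || (r->major == 7 && r->minor >= 6);
}

/* offered_readahead: the value the kernel proposed in its INIT request.
 * max_read: the maximum read size, used as the legacy readahead value. */
void apply_init_reply(const struct init_reply *r, unsigned offered_readahead,
                      unsigned max_read, struct conn_settings *out)
{
    assert(r && out);
    if (reply_at_least_7_6(r)) {
        /* Negotiated: honour the flag, never exceed what was offered. */
        out->async_read = (r->flags & SKETCH_ASYNC_READ) != 0;
        out->max_readahead = r->max_readahead < offered_readahead
                                 ? r->max_readahead
                                 : offered_readahead;
    } else {
        /* Legacy userspace: no async reads, readahead = max read size. */
        out->async_read = 0;
        out->max_readahead = max_read;
    }
}
```

A filesystem such as sshfs that relies on ordered reads would simply leave the flag clear in its reply.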
  5. 17 Jan, 2006 8 commits
  6. 06 Jan, 2006 4 commits
  7. 09 Sep, 2005 13 commits
    • [PATCH] FUSE: don't allow restarting of system calls · 7c352bdf
      Miklos Szeredi authored
      
      
      This patch removes the ability to interrupt and restart operations while there
      hasn't been any side-effect.
      
      The reason: applications.  It seems that some apps generate
      signals at a fast rate.  This means that if the operation cannot make
      enough progress between two signals, it will be restarted forever.  This
      bug actually manifested itself with 'krusader' trying to open a file for
      writing under sshfs.  Thanks to Eduard Czimbalmos for the report.
      
      The problem can be solved just by making open() uninterruptible, because in
      this case it was the truncate operation that slowed down the progress.  But
      it's better to solve this by simply not allowing interrupts at all (except
      SIGKILL), because applications don't expect file operations to be
      interruptible anyway.  As an added bonus the code is simplified somewhat.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fuse: don't update file times · b36c31ba
      Miklos Szeredi authored
      
      
      Don't change mtime/ctime/atime to local time on read/write.  Rather, invalidate
      file attributes so that the next stat() forces a GETATTR call.  Bug reported by
      Ben Grimm.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fuse: more flexible caching · 45323fb7
      Miklos Szeredi authored
      
      
      Make data caching behavior selectable on a per-open basis instead of
      per-mount.  Compatibility for the old mount options 'kernel_cache' and
      'direct_io' is retained in the userspace library (version 2.4.0-pre1 or
      later).
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] FUSE - direct I/O · 413ef8cb
      Miklos Szeredi authored
      
      
      This patch adds support for the "direct_io" mount option of FUSE.
      
      When this mount option is specified, the page cache is bypassed for
      read and write operations.  This is useful, for example, if the
      filesystem doesn't know the size of files before reading them, or when
      any kind of caching is harmful.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fuse: stricter mount option checking · 5a533682
      Miklos Szeredi authored
      
      
      Check for the presence of all mandatory mount options.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] FUSE: tighten check for processes allowed access · 87729a55
      Miklos Szeredi authored
      
      
      This patch tightens the check for allowing processes to access non-privileged
      mounts.  The rationale is that the filesystem implementation can control the
      behavior of, or get otherwise unavailable information about, the filesystem
      user.  If the filesystem user process has the same uid and gid as the mount
      owner, and is not a suid or sgid application, then access is safe.  Otherwise access is not allowed unless the
      "allow_other" mount option is given (for which policy is controlled by the
      userspace mount utility).
      
      Thanks to everyone on linux-fsdevel, especially Martin Mares, who helped uncover
      problems with the previous approach.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
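The tightened check in 87729a55 can be sketched as a predicate over task credentials. One plausible reading of "same uid, gid, and not suid or sgid" is that the real, effective and saved IDs must all equal the mount owner's; that interpretation, along with `struct task_ids`, `struct mount_info` and `task_may_access`, is an assumption of this sketch rather than the kernel implementation.

```c
#include <assert.h>

/* Hypothetical credential snapshot of the process accessing the mount. */
struct task_ids {
    unsigned uid, euid, suid;   /* real, effective, saved user IDs */
    unsigned gid, egid, sgid;   /* real, effective, saved group IDs */
};

/* Hypothetical per-mount information. */
struct mount_info {
    unsigned user_id;    /* uid of the mount owner */
    unsigned group_id;   /* gid of the mount owner */
    int allow_other;     /* "allow_other" mount option given */
};

/* Returns 1 if access is allowed, 0 otherwise. */
int task_may_access(const struct mount_info *m, const struct task_ids *t)
{
    assert(m && t);

    /* Policy for allow_other is decided by the userspace mount utility. */
    if (m->allow_other)
        return 1;

    /* A suid or sgid program has differing real/effective/saved IDs,
     * so requiring all of them to match the owner excludes it. */
    return t->uid == m->user_id && t->euid == m->user_id &&
           t->suid == m->user_id && t->gid == m->group_id &&
           t->egid == m->group_id && t->sgid == m->group_id;
}
```

For example, a suid-root binary started by the mount owner has a matching real uid but a different effective uid, so the check refuses it.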
    • [PATCH] FUSE - readpages operation · db50b96c
      Miklos Szeredi authored
      
      
      This patch adds readpages support to FUSE.
      
      With the help of the readpages() operation multiple reads are bundled
      together and sent as a single request to userspace.  This can improve
      reading performance.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
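The bundling idea behind db50b96c can be sketched as coalescing runs of consecutive page indices into single requests. This is an illustration only: `struct read_req`, `bundle_reads` and the `max_pages` cap are hypothetical, and the input is assumed to be sorted in ascending order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical read request covering a run of consecutive pages. */
struct read_req {
    unsigned long first_page;
    size_t nr_pages;
};

/* Coalesce sorted page indices into requests of at most max_pages
 * consecutive pages each. Returns the number of requests written to
 * reqs; pages that do not fit in max_reqs requests are left out. */
size_t bundle_reads(const unsigned long *pages, size_t n, size_t max_pages,
                    struct read_req *reqs, size_t max_reqs)
{
    size_t nreq = 0;
    size_t i;

    assert(max_pages > 0);
    for (i = 0; i < n; i++) {
        struct read_req *cur = nreq ? &reqs[nreq - 1] : NULL;

        /* Extend the current request if this page follows directly
         * and the request still has room. */
        if (cur && cur->nr_pages < max_pages &&
            pages[i] == cur->first_page + cur->nr_pages) {
            cur->nr_pages++;
            continue;
        }

        /* Otherwise start a new request. */
        if (nreq == max_reqs)
            break;
        reqs[nreq].first_page = pages[i];
        reqs[nreq].nr_pages = 1;
        nreq++;
    }
    return nreq;
}
```

Six page reads that would otherwise be six round trips to userspace can collapse into three requests or fewer, depending on contiguity and the cap.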
    • [PATCH] FUSE - mount options · 1e9a4ed9
      Miklos Szeredi authored
      
      
      This patch adds miscellaneous mount options to the FUSE filesystem.
      
      The following mount options are added:
      
       o default_permissions:  check permissions with generic_permission()
       o allow_other:          allow other users to access files
       o allow_root:           allow root to access files
       o kernel_cache:         don't invalidate page cache on open
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] FUSE - file operations · b6aeaded
      Miklos Szeredi authored
      
      
      This patch adds the file operations of FUSE.
      
      The following operations are added:
      
       o open
       o flush
       o release
       o fsync
       o readpage
       o commit_write
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] FUSE - read-write operations · 9e6268db
      Miklos Szeredi authored
      
      
      This patch adds the write filesystem operations of FUSE.
      
      The following operations are added:
      
       o setattr
       o symlink
       o mknod
       o mkdir
       o create
       o unlink
       o rmdir
       o rename
       o link
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] FUSE - read-only operations · e5e5558e
      Miklos Szeredi authored
      
      
      This patch adds the read-only filesystem operations of FUSE.
      
      This contains the following files:
      
       o dir.c
          - directory, symlink and file-inode operations
      
      The following operations are added:
      
       o lookup
       o getattr
       o readlink
       o follow_link
       o directory open
       o readdir
       o directory release
       o permission
       o dentry revalidate
       o statfs
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] FUSE - device functions · 334f485d
      Miklos Szeredi authored
      
      
      This adds the FUSE device handling functions.
      
      This contains the following files:
      
       o dev.c
          - fuse device operations (read, write, release, poll)
          - registers misc device
          - support for sending requests to userspace
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] FUSE - core · d8a5ba45
      Miklos Szeredi authored
      
      
      This patch adds FUSE core.
      
      This contains the following files:
      
       o inode.c
          - superblock operations (alloc_inode, destroy_inode, read_inode,
            clear_inode, put_super, show_options)
          - registers FUSE filesystem
      
       o fuse_i.h
          - private header file
      
      Requirements
      ============
      
       The most important difference between ordinary filesystems and FUSE is
       the fact that the filesystem data/metadata is provided by a userspace
       process run with the privileges of the mount "owner" instead of the
       kernel, or some remote entity usually running with elevated
       privileges.
      
       The security implication of this is that a non-privileged user must
       not be able to use this capability to compromise the system.  Obvious
       requirements arising from this are:
      
        - mount owner should not be able to get elevated privileges with the
          help of the mounted filesystem
      
        - mount owner should not be able to induce undesired behavior in
          other users' or the super user's processes
      
        - mount owner should not get illegitimate access to information from
          other users' and the super user's processes
      
       These are currently ensured with the following constraints:
      
        1) mount is only allowed on a directory or file which the mount owner
           can modify without limitation (write access + no sticky bit for
           directories)
      
        2) nosuid,nodev mount options are forced
      
        3) any process running with fsuid different from the owner is denied
           all access to the filesystem
      
       1) and 2) are ensured by the "fusermount" mount utility which is a
          setuid root application doing the actual mount operation.
      
       3) is ensured by a check in the permission() method in the kernel.
      
       I started thinking about doing 3) in a different way because Christoph
       H. made a big deal out of it, saying that FUSE is unacceptable for
       mainline in this form.
      
       The suggested use of private namespaces would be OK, but in their
       current form have many limitations that make their use impractical (as
       discussed in this thread).
      
       Suggested improvements that would address these limitations:
      
         - implement shared subtrees
      
         - allow a process to join an existing namespace (make namespaces
           first-class objects)
      
         - implement the namespace creation/joining in a PAM module
      
       With all that in place the check of owner against current->fsuid may
       be removed from the FUSE kernel module, without compromising the
       security requirements.
      
       Suid programs still raise interesting questions, since they get access even
       to the private namespace causing some information leak (exact
       order/timing of filesystem operations performed), giving some
       ptrace-like capabilities to unprivileged users.  BTW this problem is
       not strictly limited to the namespace approach, since suid programs
       setting fsuid and accessing users' files will succeed with the current
       approach too.
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>