<<< Date Index >>>     <<< Thread Index >>>

Re: Changes to the filesystem while find is running - comments?



James Youngman wrote:
> On Mon, Nov 22, 2004 at 10:05:04AM -0800, Martin Buchholz wrote:
> 
> 
>>Here's an idea to make this more robust in the face of
>>symlinks and automounters.
>>
>>Before a chdir to "foo", take stock:
>>- record stat(".");  DOTFD = open("."); (get a fd to ".")
>>- record stat("foo"); (make sure foo is a regular directory)
>>Then
>>- chdir "foo"
>>- stat("."); compare dev, inode with recorded stat("foo")
>>- if different, we suspect either symlinks or automounter.
> 
> 
> With the exception of having DOTFD, this is what GNU find currently
> does, and has done for some time.

DOTFD allows us to fchdir back to where we were more reliably.
Except that we use up an extra file descriptor.

>>In this case, go back to original directory.
>>- if we have fchdir, then
>>  fchdir(DOTFD); and try again.  
> 
> 
> Do you mean, just reissue chdir("foo"), or begin again with the stat()
> call?  If the latter, haven't we failed to spot an attempt to
> decieve find?
> 
> I'd like to complete this line of enquiry, because it's an answer to
> this that I'm really seeking.

I mean,
- first go back to the parent directory
- then lstat("foo"); check if it's a symlink or a real directory
- if a symlink, then this is fishy, but it could happen non-maliciously.
  I would issue a warning, then continue, without chdir'ing into foo.
- if a directory, then probably we've hit the automounter problem.
  Don't issue a warning; chdir("foo") again; this time it should work.
  If not, hmmmm... Perhaps we got a SIGSTOP at the wrong time and
  got restarted 10 minutes later....Try a third time; if that doesn't
  work, issue a warning, and continue without chdir'ing into foo.

Unlike replacing directories with symlinks, where the malicious
possibilities are evident, I don't see any malicious possibilities
arising out of mounted filesystems replaced by other filesystems.
If bad guys can mount filesystems in arbitrary locations, you're in
trouble anyways.

>>If we have fchdir, I see find as maintaining a stack of
>>file descriptors to directories that have been chdir'ed into.
> 
> 
> I can see that that would be useful but it would fail to comply with
> the POSIX standard, which specifies:
> 
>           The find utility shall be able to descend to arbitrary
>           depths in a file hierarchy and shall not fail due to path
>           length limitations (unless a path operand specified by the
>           application exceeds {PATH_MAX} requirements)

The above does not make it completely clear that find must be completely
free of non-path-length resource constraints.  Nevertheless, your point
that filedescriptors are depressingly, still a scarce resource, is
well taken.  I suppose you could play games with the file descriptor
limits, and on a system with either infinite or large limits,
use the stack of fd idea.  Or on a system where file descriptor limits
are per-process, you could use a stack of fds until you hit the resource
limit, then fall back to doing things the other way (i.e. chdir ("..")).

But that would be a lot of work to get right.

> 
>>Another idea:
>>
>>If we *always* use fchdir in place of chdir, we should
>>never risk chdir'ing into a symlink, since we always
>>check that the fd we get from open is a dir and not
>>a symlink.
> 
> 
> Of course, open(2) will follow a symlink, if the directory we
> originally stat()ed is replaced by a symlink just before we issue the
> open() call.  We of course can guard against that by issuing an
> lstat() on the fd once we have opened it.  

Hmmm... You're right. I guess you'd have to:
FD=open("foo");
ST1=lstat("foo");
ST2=fstat(FD);
compare(ST1,ST2);
fchdir(FD);

This might have the same performance characteristics as the
current implementation, since we save the stat(".") after
we chdir.

> Thanks for your thoughts, 
> James.

On a related note,
Solaris has some interesting non-standard functions:

     int openat(int fildes,  const  char  *path,  int  oflag,  /*
     mode_t mode */...);

     The openat() function is identical to  the  open()  function
     except that the path argument is interpreted relative to the
     starting point implied by the fd argument. If the  fd  argu-
     ment  has  the special value AT_FDCWD, a relative path argu-
     ment will be resolved relative to the current working direc-
     tory.  If  the path argument is absolute, the fd argument is
     ignored.


     int fstatat(int fildes, const char *path, struct stat  *buf,
     int flag);

     The fstatat() function obtains file  attributes  similar  to
     the  stat(),  lstat(),  and  fstat() functions.  If the path
     argument is a relative path, it is resolved relative to  the
     fildes  argument  rather than the current working directory.
     If path is absolute, the fildes argument is unused.  If  the
     fildes  argument  has the special value AT_FDCWD, defined in
     <fcntl.h>, relative paths  are  resolved  from  the  current
     working    directory.     If    the    flag    argument   is
     AT_SYMLNK_NOFOLLOW,   defined  in  <fcntl.h>,  the  function
     behaves  like lstat() and does not automatically follow sym-
     bolic links. See fsattr(5).


Thanks for your hard work maintaining this very important tool.

Martin