Re: Changes to the filesystem while find is running - comments?
James Youngman wrote:
> On Mon, Nov 22, 2004 at 10:05:04AM -0800, Martin Buchholz wrote:
>
>
>>Here's an idea to make this more robust in the face of
>>symlinks and automounters.
>>
>>Before a chdir to "foo", take stock:
>>- record stat("."); DOTFD = open("."); (get a fd to ".")
>>- record stat("foo"); (make sure foo is a regular directory)
>>Then
>>- chdir "foo"
>>- stat("."); compare dev, inode with recorded stat("foo")
>>- if different, we suspect either symlinks or automounter.
>
>
> With the exception of having DOTFD, this is what GNU find currently
> does, and has done for some time.
DOTFD allows us to fchdir back to where we were more reliably,
at the cost of one extra file descriptor.
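To make the sequence concrete, here is a minimal sketch in C of the take-stock, chdir, and compare steps, including the DOTFD way back. The name checked_chdir and its return codes are made up for illustration; this is not GNU find's code.

```c
#define _XOPEN_SOURCE 700   /* for fchdir() under strict -std=c99 */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not GNU find's code.
 * Returns 0 if the cwd is now `name`, 1 if the post-chdir check failed
 * and we are back in the parent, -1 on any other error (including a
 * failed fchdir, after which the cwd is unknown). */
int checked_chdir(const char *name)
{
    struct stat before, after;
    int dotfd, result = -1;

    assert(name != NULL);

    dotfd = open(".", O_RDONLY);            /* DOTFD: our way back */
    if (dotfd < 0)
        return -1;

    /* Take stock: `name` must currently resolve to a directory. */
    if (stat(name, &before) == 0 && S_ISDIR(before.st_mode)
        && chdir(name) == 0) {
        if (stat(".", &after) == 0
            && after.st_dev == before.st_dev
            && after.st_ino == before.st_ino) {
            result = 0;                     /* landed where we expected */
        } else if (fchdir(dotfd) == 0) {
            result = 1;                     /* suspicious: back in the parent */
        }
    }

    close(dotfd);
    return result;
}
```

A caller would treat a return of 1 as "suspect symlinks or the automounter" and decide what to do from the parent directory.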
>>In this case, go back to original directory.
>>- if we have fchdir, then
>> fchdir(DOTFD); and try again.
>
>
> Do you mean, just reissue chdir("foo"), or begin again with the stat()
> call? If the latter, haven't we failed to spot an attempt to
> deceive find?
>
> I'd like to complete this line of enquiry, because it's an answer to
> this that I'm really seeking.
I mean,
- first go back to the parent directory
- then lstat("foo"); check if it's a symlink or a real directory
- if a symlink, then this is fishy, but it could happen non-maliciously.
I would issue a warning, then continue, without chdir'ing into foo.
- if a directory, then probably we've hit the automounter problem.
Don't issue a warning; chdir("foo") again; this time it should work.
If not, hmmmm... Perhaps we got a SIGSTOP at the wrong time and
got restarted 10 minutes later... Try a third time; if that doesn't
work, issue a warning, and continue without chdir'ing into foo.
Unlike replacing directories with symlinks, where the malicious
possibilities are evident, I don't see any malicious possibilities
arising out of mounted filesystems replaced by other filesystems.
If bad guys can mount filesystems in arbitrary locations, you're in
trouble anyways.
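The retry policy above could be sketched roughly as follows. This is only an illustration under my own assumptions: enter_directory, MAX_ATTEMPTS, the return codes and the warning texts are all hypothetical. It does the lstat() before every attempt, which amounts to the same decisions as going back and checking after a failed first try.

```c
#define _XOPEN_SOURCE 700   /* for lstat(), fchdir(), S_ISLNK */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

enum { MAX_ATTEMPTS = 3 };  /* "try a third time", then give up */

/* Hypothetical helper. Returns 0 if the cwd is now `name`, 1 if `name`
 * was skipped (cwd is still the parent), -1 on error. */
int enter_directory(const char *name)
{
    struct stat expected, actual;
    int dotfd, attempt;

    assert(name != NULL);

    dotfd = open(".", O_RDONLY);
    if (dotfd < 0)
        return -1;

    for (attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
        if (lstat(name, &expected) != 0) {
            close(dotfd);
            return -1;
        }
        if (S_ISLNK(expected.st_mode)) {
            /* Fishy, but possibly innocent: warn and do not descend. */
            fprintf(stderr, "warning: %s is a symbolic link; skipping\n", name);
            close(dotfd);
            return 1;
        }
        if (!S_ISDIR(expected.st_mode)) {
            close(dotfd);
            return -1;
        }
        if (chdir(name) == 0 && stat(".", &actual) == 0
            && actual.st_dev == expected.st_dev
            && actual.st_ino == expected.st_ino) {
            close(dotfd);
            return 0;
        }
        /* Probably the automounter: go back quietly and try again. */
        if (fchdir(dotfd) != 0) {
            close(dotfd);
            return -1;
        }
    }

    fprintf(stderr, "warning: cannot safely enter %s; skipping\n", name);
    close(dotfd);
    return 1;
}
```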
>>If we have fchdir, I see find as maintaining a stack of
>>file descriptors to directories that have been chdir'ed into.
>
>
> I can see that that would be useful but it would fail to comply with
> the POSIX standard, which specifies:
>
> The find utility shall be able to descend to arbitrary
> depths in a file hierarchy and shall not fail due to path
> length limitations (unless a path operand specified by the
> application exceeds {PATH_MAX} requirements)
The above does not make it clear that find must be entirely free of
resource constraints other than path length. Nevertheless, your point
that file descriptors are, depressingly, still a scarce resource is
well taken. I suppose you could play games with the file descriptor
limits, and on a system with either infinite or large limits,
use the stack of fds idea. Or on a system where file descriptor limits
are per-process, you could use a stack of fds until you hit the resource
limit, then fall back to doing things the other way (i.e. chdir ("..")).
But that would be a lot of work to get right.
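For what it's worth, a rough sketch of the "stack of fds with fallback" idea might look like this. Everything here is an assumption for illustration (the dir_stack type, the function names, the NO_FD marker). Each level remembers the parent's dev/inode so that the chdir("..") fallback can at least be verified afterwards.

```c
#define _XOPEN_SOURCE 700   /* for fchdir() under strict -std=c99 */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define NO_FD (-1)          /* marker: no descriptor could be opened */

struct dir_entry {
    int   fd;               /* fd of the parent directory, or NO_FD */
    dev_t dev;              /* identity of the parent, used to verify */
    ino_t ino;              /* the chdir("..") fallback               */
};

struct dir_stack {
    struct dir_entry *entries;
    size_t depth, capacity;
};

/* Remember the current directory, then chdir into `name`. */
int dir_stack_descend(struct dir_stack *s, const char *name)
{
    struct stat parent;
    struct dir_entry *e;

    assert(s != NULL && name != NULL);

    if (s->depth == s->capacity) {          /* grow the stack on demand */
        size_t cap = s->capacity ? s->capacity * 2 : 16;
        struct dir_entry *p = realloc(s->entries, cap * sizeof *p);
        if (p == NULL)
            return -1;
        s->entries = p;
        s->capacity = cap;
    }
    if (stat(".", &parent) != 0)
        return -1;

    e = &s->entries[s->depth];
    e->dev = parent.st_dev;
    e->ino = parent.st_ino;
    /* Any failure here (typically EMFILE or ENFILE) leaves fd == NO_FD
       and selects the chdir("..") fallback for this level. */
    e->fd = open(".", O_RDONLY);

    if (chdir(name) != 0) {
        if (e->fd != NO_FD)
            close(e->fd);
        return -1;
    }
    s->depth++;
    return 0;
}

/* Return to the directory we were in before the matching descend. */
int dir_stack_ascend(struct dir_stack *s)
{
    struct dir_entry e;
    struct stat now;
    int rc;

    assert(s != NULL && s->depth > 0);

    e = s->entries[--s->depth];
    if (e.fd != NO_FD) {
        rc = fchdir(e.fd);
        close(e.fd);
        return rc;
    }
    /* Fallback: no fd was available, so go up by name and verify. */
    if (chdir("..") != 0 || stat(".", &now) != 0)
        return -1;
    return (now.st_dev == e.dev && now.st_ino == e.ino) ? 0 : -1;
}
```

A zero-initialized struct dir_stack is an empty stack; the caller frees entries when done. This does not address what to do when the fallback check fails, which is part of the work that would be hard to get right.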
>
>>Another idea:
>>
>>If we *always* use fchdir in place of chdir, we should
>>never risk chdir'ing into a symlink, since we always
>>check that the fd we get from open is a dir and not
>>a symlink.
>
>
> Of course, open(2) will follow a symlink, if the directory we
> originally stat()ed is replaced by a symlink just before we issue the
> open() call. We of course can guard against that by issuing an
> lstat() on the fd once we have opened it.
Hmmm... You're right. I guess you'd have to:
FD=open("foo");
ST1=lstat("foo");
ST2=fstat(FD);
compare(ST1,ST2);
fchdir(FD);
This might have the same performance characteristics as the
current implementation, since we no longer need the stat(".") after
we chdir.
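Spelled out in C, the open/lstat/fstat/compare/fchdir steps could look something like the sketch below. The function name fchdir_checked is hypothetical, and the S_ISDIR test on the lstat() result is my reading of "check that it is a dir and not a symlink".

```c
#define _XOPEN_SOURCE 700   /* for lstat() and fchdir() */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper. Returns 0 if the cwd is now `name`, -1 otherwise
 * (in which case the cwd has not been changed). */
int fchdir_checked(const char *name)
{
    struct stat by_name, by_fd;
    int fd, rc = -1;

    assert(name != NULL);

    fd = open(name, O_RDONLY);              /* FD = open("foo")  */
    if (fd < 0)
        return -1;

    if (lstat(name, &by_name) == 0          /* ST1 = lstat("foo") */
        && fstat(fd, &by_fd) == 0           /* ST2 = fstat(FD)    */
        && S_ISDIR(by_name.st_mode)         /* a symlink fails this test */
        && by_name.st_dev == by_fd.st_dev   /* compare(ST1, ST2)  */
        && by_name.st_ino == by_fd.st_ino)
        rc = fchdir(fd);                    /* fchdir(FD)         */

    close(fd);
    return rc;
}
```

Because fchdir() acts on the descriptor that was just compared, a later swap of "foo" cannot redirect us.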
> Thanks for your thoughts,
> James.
On a related note,
Solaris has some interesting non-standard functions:
int openat(int fildes, const char *path, int oflag,
/* mode_t mode */ ...);
The openat() function is identical to the open() function
except that the path argument is interpreted relative to the
starting point implied by the fd argument. If the fd argument
has the special value AT_FDCWD, a relative path argument
will be resolved relative to the current working directory.
If the path argument is absolute, the fd argument is ignored.
int fstatat(int fildes, const char *path, struct stat *buf,
int flag);
The fstatat() function obtains file attributes similar to
the stat(), lstat(), and fstat() functions. If the path
argument is a relative path, it is resolved relative to the
fildes argument rather than the current working directory.
If path is absolute, the fildes argument is unused. If the
fildes argument has the special value AT_FDCWD, defined in
<fcntl.h>, relative paths are resolved from the current
working directory. If the flag argument is
AT_SYMLINK_NOFOLLOW, defined in <fcntl.h>, the function
behaves like lstat() and does not automatically follow
symbolic links. See fsattr(5).
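As an illustration of how these two calls might help find, here is a small sketch that opens a subdirectory relative to an already-open parent directory without ever changing the cwd. The helper name open_subdir_at is made up; the flag is spelled AT_SYMLINK_NOFOLLOW in <fcntl.h>. It assumes a system that provides openat() and fstatat().

```c
#define _XOPEN_SOURCE 700   /* for openat(), fstatat() */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper. Returns an open fd for the directory `name`
 * inside `parent_fd`, or -1 if it is missing, is not a real directory,
 * or changed between the two checks. Never changes the cwd. */
int open_subdir_at(int parent_fd, const char *name)
{
    struct stat by_name, by_fd;
    int fd;

    assert(name != NULL);

    /* Behaves like lstat(): a symlink at `name` is not followed. */
    if (fstatat(parent_fd, name, &by_name, AT_SYMLINK_NOFOLLOW) != 0
        || !S_ISDIR(by_name.st_mode))
        return -1;

    fd = openat(parent_fd, name, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Make sure we opened the same object we just examined. */
    if (fstat(fd, &by_fd) != 0
        || by_fd.st_dev != by_name.st_dev
        || by_fd.st_ino != by_name.st_ino) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Passing AT_FDCWD as parent_fd resolves name against the current working directory, as the man page describes.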
Thanks for your hard work maintaining this very important tool.
Martin