Re: Changes to the filesystem while find is running - comments?
On Mon, Nov 22, 2004 at 10:05:04AM -0800, Martin Buchholz wrote:
> Here's an idea to make this more robust in the face of
> symlinks and automounters.
>
> Before a chdir to "foo", take stock:
> - record stat("."); DOTFD = open("."); (get a fd to ".")
> - record stat("foo"); (make sure foo is a regular directory)
> Then
> - chdir "foo"
> - stat("."); compare dev, inode with recorded stat("foo")
> - if different, we suspect either symlinks or automounter.
With the exception of having DOTFD, this is what GNU find currently
does, and has done for some time.
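
For concreteness, the check described above might be sketched as below. This is only an illustration, not GNU find's actual code: the name checked_chdir() and its return codes are made up for this message, and I have assumed lstat() rather than stat() for the first call so that a symlink already sitting at "foo" is rejected before the chdir().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical return codes for this sketch. */
enum checked_chdir_result {
    CHDIR_OK = 0,
    CHDIR_ERROR = -1,    /* a system call failed; errno is set */
    CHDIR_MISMATCH = -2  /* we arrived somewhere other than what lstat() saw */
};

/* Two stat results name the same file if device and inode both match. */
static int same_file(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Record what "name" is, chdir() into it, then verify where we landed. */
int checked_chdir(const char *name)
{
    struct stat before, after;

    assert(name != NULL);

    if (lstat(name, &before) != 0)
        return CHDIR_ERROR;
    if (!S_ISDIR(before.st_mode)) {   /* symlink or other non-directory */
        errno = ENOTDIR;
        return CHDIR_ERROR;
    }
    if (chdir(name) != 0)
        return CHDIR_ERROR;
    if (stat(".", &after) != 0)
        return CHDIR_ERROR;

    /* On a mismatch we are already inside the unexpected directory;
     * the caller has to decide how to get back out. */
    return same_file(&before, &after) ? CHDIR_OK : CHDIR_MISMATCH;
}
```

Note that CHDIR_MISMATCH cannot by itself distinguish an automounter (where a changed st_dev is legitimate) from a directory swapped for a symlink, which is exactly the ambiguity under discussion.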
> In this case, go back to original directory.
> - if we have fchdir, then
> fchdir(DOTFD); and try again.
Do you mean, just reissue chdir("foo"), or begin again with the stat()
call? If the latter, haven't we failed to spot an attempt to
deceive find?
I'd like to complete this line of enquiry, because it's an answer to
this that I'm really seeking.
> - If we don't have fchdir, getting back to the parent might be
> tougher. In the case of the automounter, we can do chdir(".."),
> then stat(".") and check that we're back in original directory.
> If that doesn't work, we chdir("/absolute/real/path/to/parent"),
> again stat(".") and compare dev/inode with saved stat of parent
> directory.
In the worst possible case we can fchdir() back to the directory from
which find was invoked, which we need to keep in order to support
-exec.
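
A rough sketch of the non-fchdir fallback quoted above, assuming we saved the parent's stat() result before descending. The function return_to_parent() and its parameters are hypothetical names for illustration only; parent_path stands for the absolute path Martin mentions and may be NULL if we do not have one.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if the current directory matches the saved stat data. */
static int cwd_matches(const struct stat *expected)
{
    struct stat now;

    if (stat(".", &now) != 0)
        return 0;
    return now.st_dev == expected->st_dev && now.st_ino == expected->st_ino;
}

/* Try to get back to the parent directory without fchdir().
 * Returns 0 if we are verifiably back in the parent, -1 otherwise. */
int return_to_parent(const struct stat *parent, const char *parent_path)
{
    assert(parent != NULL);

    /* First attempt: go up one level and verify dev/inode. */
    if (chdir("..") == 0 && cwd_matches(parent))
        return 0;

    /* Second attempt: use the absolute path, and verify again. */
    if (parent_path != NULL && chdir(parent_path) == 0 && cwd_matches(parent))
        return 0;

    /* Both failed; the current directory is now unknown to the caller. */
    return -1;
}
```

If this returns -1, the caller would fall back to fchdir() on the descriptor for the directory find was started from, as described above.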
> If we have fchdir, I see find as maintaining a stack of
> file descriptors to directories that have been chdir'ed into.
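I can see that that would be useful, but the number of file descriptors
a process can hold open is limited, so a stack with one descriptor per
directory level would limit how deep find could descend. That would
fail to comply with the POSIX standard, which specifies:

    The find utility shall be able to descend to arbitrary
    depths in a file hierarchy and shall not fail due to path
    length limitations (unless a path operand specified by the
    application exceeds {PATH_MAX} requirements)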
> Another idea:
>
> If we *always* use fchdir in place of chdir, we should
> never risk chdir'ing into a symlink, since we always
> check that the fd we get from open is a dir and not
> a symlink.
Of course, open(2) will follow a symlink, if the directory we
originally stat()ed is replaced by a symlink just before we issue the
open() call. We can guard against that by issuing an fstat() on the
fd once we have opened it.
However, we should then compare the results of that fstat() against
the result we expected when we did stat("foo"). We therefore STILL
have to decide what to do if the st_dev value has changed. In other
words, fchdir() might be useful but doesn't help us resolve our
problem, I think.
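
To make the open()/fstat()/fchdir() sequence concrete, here is a minimal sketch. The name checked_fchdir() and the -1/-2 return codes are invented for this message, and I have assumed lstat() for the initial check. It deliberately stops at reporting the mismatch, since what to do next is the open question.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 on success, -1 if a system call failed (errno set), and
 * -2 if the object we opened is not the one lstat() saw beforehand.
 * On any failure the current directory is left unchanged. */
int checked_fchdir(const char *name)
{
    struct stat expected, opened;
    int fd, rc, saved_errno;

    assert(name != NULL);

    if (lstat(name, &expected) != 0)
        return -1;
    if (!S_ISDIR(expected.st_mode)) {
        errno = ENOTDIR;
        return -1;
    }

    /* open() follows symlinks, so "name" may have been swapped by now. */
    fd = open(name, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fstat(fd, &opened) != 0)
        rc = -1;
    else if (opened.st_dev != expected.st_dev ||
             opened.st_ino != expected.st_ino)
        rc = -2;    /* swapped for a symlink, or an automounter stepped in */
    else
        rc = (fchdir(fd) == 0) ? 0 : -1;

    /* Preserve errno from the failing call across close(). */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return rc;
}
```

The advantage over chdir() is that the comparison happens before we move, so a mismatch leaves us where we were. The -2 case still lumps together the automounter and the attacker.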
> The functional programming people might suggest,
> instead of a stack of open fd, forking and chdiring
> whenever a subdir is explored.
That would also fail to comply with POSIX, I think.
Thanks for your thoughts,
James.