[OT] Re: Changing Headers in the Compose screen



* On 2004.01.21, in <20040121074254.GM26679@xxxxxxx>,
*       "David Yitzchak Cohen" <lists+mutt_users@xxxxxxxxxxxxxx> wrote:
> > > 
> > > a small little program:
> > > filter <file | 
> > > read_everything_into_memory_and_spit_back_everything_on_eof >file
> > 
> > Kids, don't try this at home. This depends totally on the shell in
> > use, not on cat or r_e_i_m_a_s_b_e_o_e.
> 
> What's that?

It's read_everything_into_memory_and_spit_back_everything_on_eof....


> That's interesting ... I use plain old bash, and have done the above
> successfully ... weird. . .

I'm pretty sure I have too, on occasion. But more often it's overwritten
my input before anything happens.


> > This means that pretty much every shell is going to
> > overwrite your input file before anyone gets to read it. If anything
> > else happens, it's a scheduler fluke, and certainly not reliable.
> ...
> 
> For some reason, having only two cat(1)s doesn't do the trick nearly
> as often, but you'll notice that with three cat(1)s, we get the filter
> executed "in-place" more than 85% of the time - not too bad, eh?

Your second processor might explain that, somewhat.

For the command "cat <file | cat | cat >file", here's roughly what'll
happen system-wise:

shell   pipe()  = A                     # to connect stage3 to stage2
shell   pipe()  = B                     # to connect stage2 to stage1
shell   fork()  = shell2                # to create stage 3
shell   fork()  = shell3                # to create stage 2
shell   fork()  = shell4                # to create stage 1
shell2  open()                          # to open "file" for write (truncating it)
shell2  dup2()                          # to redirect stdout into "file"
shell2  dup2()                          # to connect pipe A to stdin
shell2  exec()                          # to execute stage 3 "cat"
shell3  dup2()                          # to connect stdout to pipe A
shell3  dup2()                          # to connect stdin to pipe B
shell3  exec()                          # to execute stage 2 "cat"
shell4  open()                          # to open "file" for read
shell4  dup2()                          # to connect stdout to pipe B
shell4  dup2()                          # to connect stdin to "file"
shell4  exec()                          # to execute stage 1 "cat"

Once those fork()s occur, the task execution can happen in any order.
(In theory, the shell's children (shell2-4) could signal themselves to
stop as soon as fork() completes, and the controlling shell could signal
them to resume at determined points, in order to control execution
order; but this might mess with i/o in unexpected ways, too, and would
be somewhat insane and needlessly complex, I imagine.)
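
The one part of that sequence that doesn't depend on scheduling is that
the open() for write happens in the shell's child before exec(), so the
file is truncated whether or not the program ever touches it. A minimal
sketch of that, assuming a POSIX-ish sh and a scratch file from
mktemp (the variable names are just for illustration):

```shell
# Scratch file for the demonstration; any writable path would do.
f=$(mktemp)
printf 'one\ntwo\n' >"$f"

# 'true' never reads or writes anything, yet the redirection alone
# truncates the file before 'true' is exec()ed.
true >"$f"
size_after_true=$(wc -c <"$f" | tr -d '[:space:]')

# Refill, then use a single command with both redirections. The shell
# sets up both before running cat, so cat starts with an empty file.
# (Some cat implementations may also complain here, hence '|| true'.)
printf 'one\ntwo\n' >"$f"
cat <"$f" >"$f" || true
size_after_cat=$(wc -c <"$f" | tr -d '[:space:]')

echo "after true: $size_after_true bytes; after cat: $size_after_cat bytes"
rm -f "$f"
```

With a single command there is no race at all: the data is gone every
time. The pipeline case only differs in that the truncating open() and
the reading open() happen in different processes.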

So having multiple CPUs could mean that some of those shell children
get scheduled on the other CPU, so each shell contends for CPU time
with only some of the others. Shell4, the one that reads "file", might
actually open() and read "file" before shell2 gets around to open()ing
it for write, especially if shell2 is queued behind shell4 while shell3
is running or waiting. And if the scheduler gives priority to new
processes over old ones (e.g., it's optimized for task concurrency
rather than thread speed), you'll see this kind of command work more
often than otherwise.

I wonder whether your 85% success rate is inflated because of that
second CPU. If you turned it off, I expect your success rate would
drop.
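
If you want to measure it rather than guess, a rough sketch along these
lines should do. The trial count, the TRIALS variable, and the test
string are arbitrary choices of mine, not anything from your setup:

```shell
# Run the three-cat pipeline repeatedly and count how often the
# input file survives. TRIALS is a hypothetical knob; default 100.
trials=${TRIALS:-100}
f=$(mktemp)
ok=0
i=0
while [ "$i" -lt "$trials" ]; do
    printf 'some test data\n' >"$f"
    cat <"$f" | cat | cat >"$f"
    # The run "worked" only if the data made the round trip intact.
    if [ "$(cat "$f")" = 'some test data' ]; then
        ok=$((ok + 1))
    fi
    i=$((i + 1))
done
echo "$ok of $trials runs kept the data"
rm -f "$f"
```

The number will vary from run to run and machine to machine, which is
rather the point. On Linux, running the script under "taskset -c 0"
should confine it to one CPU for comparison.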

Anyway. This is rather off-topic.... :/ Mainly I wanted to warn against
depending on that kind of command; a lot of data gets lost by people
who think that should work. If you do find yourself wanting to do this
kind of thing often, with programs other than perl, you need a different
kind of tool. As it happens, I've written something that can address
this, though it was mainly designed to solve a different problem:
see http://home.uchicago.edu/~dgc/sw/pipeline . But you can call it
read_everything_into_memory_and_spit_back_everything_on_eof if you like.
:)
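
Failing that, the usual shell-only workaround is to write to a
temporary file and rename it over the original only if the filter
succeeded. A minimal sketch; "inplace" is a name I made up for
illustration, and it assumes a mktemp that accepts a template argument:

```shell
# inplace FILE COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND with FILE on stdin, and replaces FILE with the output
# only when COMMAND exits successfully.
inplace() {
    target=$1
    shift
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if "$@" <"$target" >"$tmp"; then
        mv "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example: sort a scratch file "in place".
demo=$(mktemp)
printf 'b\na\n' >"$demo"
inplace "$demo" sort
result=$(cat "$demo")
echo "$result"
rm -f "$demo"
```

Note that mv replaces the file with a new one, so hard links and the
original permissions are not preserved; copying the temp file's
contents back over the original would keep them, at the cost of a
window in which the file is incomplete.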

-- 
 -D.    dgc@xxxxxxxxxxxx   **   Enterprise Network Servers and Such
                           **   University of Chicago
 We are the robots.        **   North America's southernmost seasonal glacier