Zsh Mailing List Archive

Re: argv subscript range uses too many memory



[C code discussion proceeds below, so those zsh-users who don't care about
the internals can skip this message.  Once again, we should move the rest
of this thread to zsh-workers, thanks.]

Han, thanks for the diagnosis.

On Nov 20,  9:04pm, Han Pingtian wrote:
} Subject: Re: argv subscript range uses too many memory
}
} On Sat, Nov 10, 2012 at 06:57:09AM -0800, Bart Schaefer wrote:
} > In a loop, the heap allocations are not popped until the loop is done,
} > IIRC, so you'll end up with a large number of copies of the original
} > array in the heap with slice results pointing into different parts of
} > each copy.  Maybe there's a narrower scope in which a pushheap/popheap
} > could be inserted.
} 
} Looks like I have found the reason of this problem. If I revert this commit:
} 
} commit 61505654942cb9895a9811fde1dcbb662fd7d66a
} Author: Bart Schaefer <barts@xxxxxxxxxxxxxxxxxxxxx>
} Date:   Sat May 7 19:32:57 2011 +0000
} 
}     29175: optimize freeheap

Aha; this jibes with both the excerpted text from me above and also with
what PWS said in workers/30791:

: What's puzzling me is that loops, including the "while" involved here,
: execute freeheap() at the end of each iteration.  That should restore
: the pristine state of the loop

According to the comment in workers/29175:

+     * However, there doesn't seem to be any reason to reset fheap before
+     * beginning this loop.  Either it's already correct, or it has never
+     * been set and this loop will do it, or it'll be reset from scratch
+     * on the next popheap().  So all that's needed here is to pick up
+     * the scan wherever the last pass [or the last popheap()] left off.

The consequence of this optimization is that, in the name of speed, we
don't do a full-fledged garbage collection upon freeheap(), only upon
popheap().  Because the scan resumes at fheap, heaps that filled up
earlier in the list are not revisited, so the freeheap() on each loop
iteration does not "restore the pristine state".  That is why "a
narrower scope [of] pushheap/popheap" would be one potential solution.

Unfortunately, as far as I can tell, these two issues (the speed problem
in last year's "the source of slow large for loops" thread and the space
problem in this thread) are directly in conflict with one another.  The
speed problem requires that the heap not be fully garbage collected on
every loop pass, but the space problem requires that it be collected at
some point before the loop is done.

Maybe there's a hybrid where freeheap() can examine the difference in
position (fheap - heaps) and do a full garbage collect only when the
heap has become "too full".  The question then is, what difference in
position is large enough to trigger a collection?
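
To make the hybrid idea concrete, here is a rough toy model.  It is not
the code in mem.c: it assumes the heaps live in a fixed array so that
"difference in position" is a plain index, and every name in it
(toy_alloc, toy_freeheap_*, ARENA_SIZE, MAX_ARENAS, FULL_GC_THRESHOLD)
is made up for illustration.  The cheap free resets only from fheap_idx
onward, the full free resets everything, and the hybrid picks between
them based on how far fheap_idx has drifted from the start.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE        64  /* hypothetical: bytes per arena */
#define MAX_ARENAS        32  /* hypothetical: hard cap for the toy */
#define FULL_GC_THRESHOLD  4  /* hypothetical: the "too full" trigger */

static size_t used[MAX_ARENAS]; /* bytes in use; slot 0 stands in for heaps */
static int narenas = 1;         /* arenas currently in the chain */
static int fheap_idx = 0;       /* first arena with free space, like fheap */

/* Take size bytes from the first arena at or after fheap_idx with room.
 * Returns the arena index, or -1 if the toy has run out of arenas. */
static int toy_alloc(size_t size)
{
    if (size > ARENA_SIZE)
        return -1;
    for (int i = fheap_idx; i < MAX_ARENAS; i++) {
        if (i >= narenas)
            narenas = i + 1;    /* extend the chain with an empty arena */
        if (ARENA_SIZE - used[i] >= size) {
            used[i] += size;
            /* Move fheap_idx past arenas that are now completely full. */
            while (fheap_idx < narenas && used[fheap_idx] == ARENA_SIZE)
                fheap_idx++;
            return i;
        }
    }
    return -1;
}

/* Cheap free: pick up at fheap_idx; full arenas before it stay full. */
static void toy_freeheap_cheap(void)
{
    assert(fheap_idx <= narenas);
    for (int i = fheap_idx; i < narenas; i++)
        used[i] = 0;
}

/* Full free: reset every arena and start over from the first one. */
static void toy_freeheap_full(void)
{
    for (int i = 0; i < narenas; i++)
        used[i] = 0;
    narenas = 1;
    fheap_idx = 0;
}

/* Hybrid: pay for the full pass only once fheap_idx has drifted far. */
static void toy_freeheap_hybrid(void)
{
    if (fheap_idx >= FULL_GC_THRESHOLD)
        toy_freeheap_full();
    else
        toy_freeheap_cheap();
}
```

In this toy, a loop that allocates ARENA_SIZE bytes per pass and calls
only toy_freeheap_cheap() uses up all MAX_ARENAS arenas, whereas the
same loop with toy_freeheap_hybrid() never holds more than
FULL_GC_THRESHOLD arenas.  It says nothing about what threshold would
be right for the real heap, which is still the open question.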


