Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Nested function definition question



On Wed, Jul 17, 2019 at 11:35 AM Nick Cross <zsh@xxxxxxxxx> wrote:
>
> On 17/07/2019 06:49, Roman Perepelitsa wrote:
> > Inlining functions makes a big difference because function calls are
> > very expensive in ZSH. Calling a function to do something trivial
> > takes ~10 times longer than doing the same thing inline.
>
> Really? That's interesting. Are there any benchmarks available showing
> this?

Here's a quick one:

    local -i c=0

    function inc() { ((++c)) }

    function outofline() {
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
     inc; inc; inc; inc;
    }

    function inline() {
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
     ((++c)); ((++c)); ((++c)); ((++c));
    }

    time ( repeat 10000 outofline )
    time ( repeat 10000 inline )

I got:

  ( repeat 10000; do; outofline; done; )  cpu 4.184 total
  ( repeat 10000; do; inline; done; )  cpu 0.171 total

Apparently, ((++c)) is about 24 times faster than a call to a function
that does the same thing. You can adapt this benchmark to measure
something closer to what you care about.
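
One way to adapt it is sketched below. This is a minimal variant, not the benchmark I ran above: the iteration count and the names via_call and via_inline are arbitrary choices made for illustration. It drives both forms from an ordinary loop, so the loop overhead is included in both timings and the measured ratio will come out smaller than with the fully written-out version. It is written so that it should also run under bash.

```shell
# Sketch: time one function call against the same work done inline.
# The iteration count is arbitrary; adjust it to your machine.
typeset -i c=0 i=0 n=100000

inc() { (( ++c )); }

# Variant 1: the increment goes through a function call.
time ( for (( i = 0; i < n; ++i )); do inc; done )

# Variant 2: the same increment written inline.
time ( for (( i = 0; i < n; ++i )); do (( ++c )); done )

# Sanity check in the current shell: both forms have the same effect.
c=0; inc; inc
typeset -i via_call=$c
c=0; (( ++c )); (( ++c ))
typeset -i via_inline=$c
```

Swap the body of inc and the inline statement for whatever operation you actually care about, keeping the two variants otherwise identical.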

> So is it actually possible to do this?

It's definitely possible to inline functions by hand the same way I
did in the benchmark.

Another thing that can make a difference is manual loop unrolling,
although the overhead of looping isn't as big as that of function
calls. That is, you can replace this:

    function normal() {
     local -i i c
     for ((i = 0; i != 6400000; ++i)); do
       ((++c))
     done
    }

with this:

    function unrolled() {
     local -i i c
     for ((i = 0; i != 6400000; i+=64)); do
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
       ((++c)); ((++c)); ((++c)); ((++c));
     done
    }

I got about a 3x speedup on my machine with this transformation.
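
The unrolled loop above relies on the iteration count being a multiple of the unroll factor. A minimal sketch of the general case follows, assuming an unroll factor of 4 for brevity and a hypothetical function name count_unrolled. It runs the unrolled body over the largest multiple of 4 and finishes the leftover iterations in a plain loop. The result is handed back in REPLY rather than printed, so the caller doesn't need a command substitution (and the fork that comes with it).

```shell
# Sketch (hypothetical name): unroll by 4 and handle the remainder,
# so n need not be a multiple of the unroll factor.
count_unrolled() {
  typeset -i n=$1 i=0 c=0
  # Largest multiple of 4 that does not exceed n.
  typeset -i limit=$(( n - n % 4 ))

  # Unrolled part: four increments per loop iteration.
  for (( i = 0; i < limit; i += 4 )); do
    (( ++c )); (( ++c )); (( ++c )); (( ++c ))
  done

  # Remainder: at most three leftover iterations.
  for (( i = limit; i < n; ++i )); do
    (( ++c ))
  done

  REPLY=$c
}

count_unrolled 10
print_result=$REPLY   # number of increments performed for n=10
```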

In my experience it's quite rare to reach the point where these
low-level optimizations make sense. You can usually get decent
performance by getting rid of all forks (external commands, subshells
and command substitutions) and replacing loops with expansions. Array
expansions are especially powerful when it comes to making your
scripts faster.
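
To illustrate, here is a minimal sketch with made-up sample paths. The first version forks once per element to run basename; the second gets the same result from a single array expansion with no forks and no loop. The variable names and the data are hypothetical.

```shell
# Hypothetical sample data, only for illustration.
paths=(/usr/local/bin/zsh /etc/zshrc /tmp/bench.zsh)

# Slow: one command substitution (fork) plus one external command
# per element.
slow=()
for p in "${paths[@]}"; do
  slow+=("$(basename "$p")")
done

# Fast: one expansion strips everything up to the last slash from
# every element. In zsh this can also be spelled ${paths:t}.
fast=("${paths[@]##*/}")
```

Both arrays end up with the same elements; the difference shows up once the array is large or the code runs often, for example in a prompt.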

Roman.


