Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Re: Array appends are quadratic



Bart Schaefer wrote on Fri, Nov 27, 2015 at 02:59:58 -0800:
> On Nov 27,  9:26am, Mikael Magnusson wrote:
> } Subject: Re: Array appends are quadratic
> }
> } According to your debug prints, the array is reallocated every time it
> } is assigned to, even when the required element is already allocated.
> } Doubling the allocation size would do nothing to fix that, presumably.
> 
> Right, this is generic code to handle array splicing, so starting from
> an array with 200 elements,
> 
>     a[100]=foo
> 
> and
> 
>     a[50,150]=(foo)
> 
> use the same loop even though the former does not change the size of
> the resulting array and the latter will waste 100 slots at the top.

(The reason it'll waste 100 slots is that it will allocate 200 elements
even though only 200-(150-50+1) are needed.)

> I vaguely recall a conversation in which we decided that fewer code
> branches was better than attempting to efficiently re-use array slots.

I'm not looking to implement sparse arrays.   I just wish it didn't
reallocate unless it needed to.  That _would_ mean some additional code
complexity, but it seems it would be manageable: having the array
remember the number of allocated slots, using memmove() to shift the
past-the-slice portion, copying the slice directly into the slots.

That should make this part of the code O(slice) rather than O(array
length).

The effort involved in implementing the redoubling allocation (for the
other problem indicated in 37236) would be similar: mainly having the
array remember the number of allocated slots.  I'll have to track done
all places in the code that allocate/reallocate arrays, hopefully there
aren't too many.

There may be other parts in the code that do O(n) processing instead
of O(1) — Mikael points out arrlen() as an example — but those can be
improved separately.

> } % a[100]=foo
> }  params.c:2585: setarrvalue: Reallocating 'a' to size 100
> } % a[5]=bar
> }  params.c:2585: setarrvalue: Reallocating 'a' to size 101
> } % a[10]=bar
> }  params.c:2585: setarrvalue: Reallocating 'a' to size 101
> } 
> } Do we have an... off-by-two bug somewhere?
> 
> No, it just always allocates 1 more than the largest occupied array
> index position if the array isn't empty to begin with.
> 

Cheers,

Daniel

P.S. It seems that bin_compquote is calling ztrdup() on every element of
an array parameter passed as an argument, so it would reallocate all array
members even if none of them need quoting.  I'm not planning to look
into it further, though, since I haven't run into it in practice yet.



Messages sorted by: Reverse Date, Date, Thread, Author