Zsh Mailing List Archive

Re: Idea for optimization (use case: iterate string with index parameter)



On Fri, Jan 5, 2018 at 5:38 AM, Sebastian Gniazdowski
<psprint@xxxxxxxxxxx> wrote:
> Iterating over a string with an index parameter is quite slow, because Unicode characters are skipped and counted using mbrtowc().

I can't remember the last time I needed to do that kind of iteration.

> For example, I saw z-sy-h uses such loops, my projects sometimes use them too. The point is that iterating a string and doing something with letters, e.g. counting brackets, is a very common use case, and the optimization would be triggered often.

Hmm.  Whether this is worthwhile depends on the size of the typical
processed string.  I can see this affecting z-sy-h when e.g. running
zed on a big function, but probably not when editing a typical command
line.

Maybe it would be reasonable to do something in shell code, e.g.:

typeset -a iter=(${(s//)string})
for ((i=1; i <= $#iter; i++)); do something with $iter[i]; done
string=${(j//)iter} # if needed

That is more memory-intensive, of course, but it also assists with
cases of unordered access into the array of characters.

> In general, the array would hold the last N (5-10 or so) string-index requests. If a new request targeted the same string with an index greater by 1, getarg() would call mbrtowc() once (via the MB_METACHARLEN macro), reusing the previous in-string pointer.

Why only when greater by 1?  If the new index is greater by any
amount, scan forward from the cached position to the requested one and
record that.  Over a whole forward pass the total number of mbrtowc()
conversions is the same.


