Zsh Mailing List Archive

Re: Sorting file names randomly

    Hi Bart :)

 * Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> dixit:
> On Jul 24,  9:39am, DervishD wrote:
> } >       for ((i=1; i <= $#; ++i)) { reply[i*RANDOM/32768+1]+=($argv[i]) }
> }     Why is it better than my function?
> It's shorter (which is one of the things you asked for), and it only
> does array processing rather than building up and tearing down strings.

    Which is much slower.
> }     Anyway, the ordering of elements in an associative array is not
> } very random if $RANDOM is not included in the key, and I don't
> } understand it :?? How are associative arrays elements sorted?
> Are you familiar with the concept of hash tables?  That's how nearly
> all languages that have associative arrays implement them, and in
> many cases (e.g. Perl) they're even called "hashes" by the language.

    I haven't taken a look at the zsh sources (well, I have at some
points, but never a general look), so I didn't assume you were using
hash tables for associative arrays. Thanks for the explanation :)

> } function shuffle () {
> } 
> }     setopt nullglob globdots rcexpandparam
> }     
> }     reply=()
> }     reply=($*)
> Don't you mean $~* there?  Otherwise you have the problem with
> multiple directories that you alluded to once before.

    No, I've got rid of the noglob thing, thanks to your idea :)
> }     reply=($reply(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))
> This is wasteful in a number of ways.

    OK, let's see :)
> First, the (l.5..0.) is just left-zero-padding $RANDOM, so rather than
> force the shell to parse that and work out what to do once for every file
> name, it would be better to declare "local +h -Z 5 RANDOM" as I did.
> (Just remember to seed RANDOM when making it local.)

    Mmm, I didn't know you could make a predefined shell parameter
(like RANDOM) 'local', so I didn't use the -Z thing. But thanks for
illustrating this, because it is VERY useful :)))

    Just to make sure: you can do whatever you want with 'typeset'
on a predefined shell parameter, just like you would with your own
parameters, right? Any important limitations?

> Second, by using a glob qualifier, you're forcing the shell to stat()
> every file name a second time, after it has already been done once when
> reply=($~*) is assigned [assuming $~* is what you meant].

    Oh, crap, I didn't think about this either. Obviously a glob
qualifier HAS to stat the file name to see if it is a regular file, a
directory, has N links, or whatever other test you want to carry out :(

> Third, you're doing string concatenation, adding six bytes for each
> file name.  If you're worried about exceeding argument limits, you
> ought to be worried about how much extra memory that eats.

    Exceeding argument limits is one thing, because no matter how
many resources you have, if the command line size limit is 256k,
that's all you're going to get. OTOH, memory usage is not an issue:
the script does not run at arbitrary times, and if the memory in the
machine is stressed, I would probably not run the script (or shell
function, or whatever).

    That's the reason I was using memory freely inside the shell
function: I was not bothered by resource usage.

> Fourth, you've eventually got to do this ...
> }     reply=(${reply/#????? /})
> ... which has to copy every string in order to pattern-match it and
> chop it up before assigning it back again, so you're roughly
> doubling the memory needed right there, possibly as much as
> tripling it if I recall correctly how array assignments are
> performed.

    Here I assumed that the array was processed one element at a time
so I didn't consider that the memory usage doubled. Cool :)))
> My hash solution isn't very much less memory intensive (if you skipped
> the final assignment to the reply array and just printed the values it
> would be better); but the += version is about as small a footprint as
> you're going to get, because inserting array slices only copies the new
> elements being inserted (everything else is moving of pointers to the
> existing elements).

    Cool! I'm going to use your += solution, thanks a lot :)
> }     print -l $reply
> } 
> }     return 0
> Unless you expect "print" to fail, the "return 0" is redundant.

    I know, but I use a template for shell functions and shell
scripts, and it always does an 'emulate -L zsh' at the beginning and
a 'return 0' at the end O:)
> } >     reply=($*)
> } >     reply=($reply(e:'REPLY="${(l.5..0.)RANDOM} $REPLY"':))
> } >     reply=(${(o)reply})
> }     How could I avoid doing this? I cannot put the 'o' in the
> } assignment above this one because it doesn't work (it seems to sort
> } *before* applying the 'e' glob qualifier).
> Obviously the glob applies after any sorting in that second assignment.

    That wasn't obvious to me. I probably assumed left-to-right
processing, unconsciously.

    Bart, thanks a lot for your examples, but LOTS of thanks for
your explanations. Really, you've taught me a lot about shell
scripting, not only in this message, but over almost four years on
this mailing list. My 'mobs' project wouldn't have been possible
without your help and your kindness when explaining things. I really
owe you a lot.

    Raúl Núñez de Arenas Coronado

Linux Registered User 88736 | http://www.dervishd.net
http://www.pleyades.net & http://www.gotesdelluna.net
It's my PC and I'll cry if I want to...
