Zsh Mailing List Archive

Re: copying a million small files between disks?



I ran into the same problem and used find. I had to replace all
instances of ls in my shell scripts:

ls *qj*par 2>|/dev/null|wc -l

with

find . -maxdepth 1 -name '*qj*par' 2>|/dev/null|wc -l
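
A zsh-only way to get the same count, as a rough sketch: the glob still
has to read the whole directory into memory, but the oN qualifier skips
sorting and no external ls ever gets exec'd:

files=( *qj*par(NoN) )   # N: no error if nothing matches; oN: leave unsorted
print $#files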

Also, completion will be very slow, so you might want to disable it if
you're in the habit of pressing TAB :)
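
One quick (if blunt) way to do that, just a sketch: unbind TAB in the
current shell so pressing it no longer triggers completion:

bindkey -r '^I'   # completion is back the next time you start a shell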

On 11/1/07, Christopher Browne <cbbrowne@xxxxxxxxx> wrote:
> On 10/31/07, sam reckoner <sam.reckoner@xxxxxxxxx> wrote:
> > I'm not exaggerating. I have over one million small files that I'd like
> > to move between disks. The problem is that even getting a directory
> > listing takes forever.
> >
> > Is there a best practice for this?
> >
> > I don't really need the directory listing, I just need to move all the
> > files. I have been using rsync, but that takes a very long time to
> > generate the list of files to be moved.
> >
> > Any alternatives?
>
> Yeah, I'd use find.
>
> The fundamental problem with ls, which you're clearly running into, is
> that when there are a million files:
>
> a) all the directory entries have to be read, and
>
> b) they all have to be held in memory (in some form of array), and
>
> c) they then get sorted (presumably generating a *second* array,
> though possibly not).
>
> You're getting your lunch eaten by b) and c).
>
> You might try:
>    "find /path/where/all/the/files/are | xargs -I {} cp {}
> /path/that/is/destination"
>
> That will skip steps b and c.
> --
> http://linuxfinances.info/info/linuxdistributions.html
> "...  memory leaks  are  quite acceptable  in  many applications  ..."
> (Bjarne Stroustrup, The Design and Evolution of C++, page 220)
>
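
A slightly more defensive spelling of the find | xargs idea above, as a
sketch that assumes GNU find, xargs and cp: -print0/-0 cope with odd
characters in file names, and cp -t lets xargs batch many files into
each cp invocation instead of running one cp per file:

find /path/where/all/the/files/are -type f -print0 |
  xargs -0 cp -t /path/that/is/destination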


-- 
Regards,


