Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

Compare two (or more) filenames and return what is common between them



What I am trying to do:

Given a folder/directory full of files (and, possibly, some existing
folders/directories), I want to create folders which will group files
with similar file names, but which will leave existing folders alone.

For example, here’s a list of files from a directory:

@TUAW Adding Low Power Bluetooth to your older Mac.md
all-tuaw-posts-with-titles.txt
Fluid_1.8.zip
FSB- Yellow King.md
fsb-followup.md
iCXItGrjqrw --- ATP_Ending_Theme_Song_A_Day_1546 18_-_640x360 18.mp4
iCXItGrjqrw --- ATP_Ending_Theme_Song_A_Day_1546 18_-_640x360 18.mp4.description.txt
Narrative Lectionary 2013-2014 Readings for Year 4 (Gospel of John).pdf
Narrative Lectionary Summer 2014.pdf
qq Set a Mac's Hostname in Terminal.txt
rss-audio-template-index.xml
rsync-skip-compress.txt
Tumblr Private Posts.txt

Afterwards, I would like all of the above files to have been sorted
into these folders:

@TUAW Adding Low Power Bluetooth to your older Mac
all-tuaw-posts-with-titles
Fluid_1.8
FSB- Yellow King
fsb-followup
iCXItGrjqrw --- ATP_Ending_Theme_Song_A_Day_1546 18_-_640x360 18
Narrative Lectionary
qq Set a Mac's Hostname in Terminal
rss-audio-template-index
rsync-skip-compress
Tumblr Private Posts

The only really tricky part here is handling the few files which share
part or all of their filename:

iCXItGrjqrw --- ATP_Ending_Theme_Song_A_Day_1546 18_-_640x360 18.mp4
iCXItGrjqrw --- ATP_Ending_Theme_Song_A_Day_1546 18_-_640x360 18.mp4.description.txt

(which should both go into a folder “iCXItGrjqrw ---
ATP_Ending_Theme_Song_A_Day_1546 18_-_640x360 18”)

(Let’s call this “Case #1”)

and

Narrative Lectionary 2013-2014 Readings for Year 4 (Gospel of John).pdf
Narrative Lectionary Summer 2014.pdf

(which should both go into a folder “Narrative Lectionary”)

(Let’s call this “Case #2”)

Also notice that “rss-audio-template-index.xml” and
“rsync-skip-compress.txt” should _not_ go into a folder called “rs”.

(Let’s call this “Case #3”)

Case #1 seems like it should be pretty easy, because all I would have
to do is take off the extension(s), and then both of the files have the
same “root”. I guess I could match that somehow, but I’m not exactly
sure how, since one file has “.mp4.description.txt” and one file has
“.mp4”.
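
One possible starting point for Case #1, sketched in Python rather than
zsh purely for illustration (in zsh itself, the `:r` modifier removes a
single extension per application). The function name `strip_extensions`
and the rule for what counts as an extension (a dot followed by a
letter, then letters or digits) are assumptions, chosen so that
“Fluid_1.8.zip” keeps its “.8”:

```python
import re

# Assumption: an "extension" is a trailing dot followed by a letter and
# then letters/digits. The lookbehind keeps dotfiles such as ".bashrc" intact.
_EXT = re.compile(r"(?<=.)\.[A-Za-z][A-Za-z0-9]*$")


def strip_extensions(name):
    """Remove trailing extensions one at a time until none is left."""
    while True:
        stripped = _EXT.sub("", name)
        if stripped == name:
            return name
        name = stripped


base = "iCXItGrjqrw --- ATP_Ending_Theme_Song_A_Day_1546 18_-_640x360 18"
print(strip_extensions(base + ".mp4") == strip_extensions(base + ".mp4.description.txt"))
print(strip_extensions("Fluid_1.8.zip"))  # Fluid_1.8
```

With a rule like this, both Case #1 files reduce to the same root, so
they can be matched by simple equality.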

Case #2 - I am not sure how to efficiently match those two… I guess I
could start comparing letters of each filename and then stop when they
don’t match, but I’m not even sure how to do that. (And I just
realized that I would not want a trailing space in the folder name
either, so it would have to be smart enough to deal with that somehow
too, so that I end up with a folder named “Narrative Lectionary” and
not “Narrative Lectionary ”!)
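
The letter-by-letter comparison described for Case #2 could look
roughly like this. Again this is a Python sketch, not a zsh feature; it
leans on the standard library’s `os.path.commonprefix`, which compares
character by character. The helper name `common_stem` and the set of
trailing characters it trims (space, hyphen, underscore, dot) are
assumptions:

```python
import os.path


def common_stem(a, b):
    """Return the shared leading characters of a and b, trimmed of trailing separators."""
    prefix = os.path.commonprefix([a, b])
    # Assumption: a folder name should not end in a space or separator.
    return prefix.rstrip(" -_.")


first = "Narrative Lectionary 2013-2014 Readings for Year 4 (Gospel of John)"
second = "Narrative Lectionary Summer 2014"
print(repr(common_stem(first, second)))  # 'Narrative Lectionary'
```

The comparison is case-sensitive, so “FSB- Yellow King” and
“fsb-followup” share nothing and stay apart. It can also stop in the
middle of a word, which this sketch does not try to correct.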

For “Case #3” I guess I need to set some sort of minimum number of
characters to be matched. Maybe… 5? I don’t really know how to deal
with that case very well.
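
A minimum-length rule for Case #3 might be sketched as follows, again
in Python for illustration only. It assumes the names have already had
their extensions removed, walks them in sorted order, and only merges a
name into the previous group when the shared prefix reaches the
threshold. The name `group_stems` and the default of 5 (taken from the
guess above) are assumptions, not a tested recommendation:

```python
import os.path

MIN_SHARED = 5  # assumed threshold, from the "Maybe… 5?" guess


def group_stems(stems, min_shared=MIN_SHARED):
    """Map each proposed folder name to the extension-less names it would hold."""
    groups = []  # each entry is [folder_name, [member stems]]
    for stem in sorted(stems):
        if groups:
            folder, members = groups[-1]
            shared = os.path.commonprefix([folder, stem]).rstrip(" -_.")
            # Merge on a long enough shared prefix, or on identical names.
            if shared and (len(shared) >= min_shared or folder == stem):
                groups[-1] = [shared, members + [stem]]
                continue
        groups.append([stem, [stem]])
    return {folder: members for folder, members in groups}


stems = [
    "rss-audio-template-index",
    "rsync-skip-compress",
    "Narrative Lectionary 2013-2014 Readings for Year 4 (Gospel of John)",
    "Narrative Lectionary Summer 2014",
]
for folder in group_stems(stems):
    print(folder)
```

Here the two “rs…” names share only two characters, which is below the
threshold, so each keeps its own folder, while the two lectionary files
end up under “Narrative Lectionary”.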

Has anyone already invented this?

If not, can anyone suggest how I might go about doing this? I’ve been
trying to come up with something and I’m just at a complete loss to
know where to start, and I get the strong suspicion that there might
be a zsh feature that would help that I just don’t know about.

Thanks for any help you can offer.

TjL


