
I went through a phase when I really enjoyed writing shell scripts like

  ls *.jpg | awk '{print "resize 200x200 " $1 " thumbnails/" $1}' | bash
because I never got to the point where I could remember the strange punctuation that the shell requires for loops without looking up the bash info pages, whereas I've thoroughly internalized awk syntax.

Word is you should never write something like that, because you'll never get the escaping right and somebody could craft inputs that cause arbitrary code execution. I mean, they try to scare you into using xargs, but I find xargs so foreign that I have to read the whole man page every time I want to do something with it.



Better is something like

  find . -maxdepth 1 -name "*.jpg" -exec resize 200x200 "{}" "thumbnails/{}" \;
which works for spaces (and probably quotes) in filenames; I'm not sure about other special characters.
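It should handle them all: -exec passes each name as a single argv entry, so no shell re-parsing ever happens. A quick sketch to check, with cp standing in for the hypothetical resize command (GNU find substitutes {} even when it's embedded in a larger argument):

```shell
# Sketch: filenames with spaces and quotes survive find -exec untouched.
mkdir -p /tmp/findexec-demo/thumbnails && cd /tmp/findexec-demo
touch 'has space.jpg' "has'quote.jpg"
# Each match becomes one argv entry; the shell never re-parses the names:
find . -maxdepth 1 -name '*.jpg' -exec cp {} thumbnails/{} \;
ls thumbnails/
```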


It's tough to be portable and have a one-liner. See https://stackoverflow.com/questions/45181115/portable-way-to...

I switched the command to a GraphicsMagick-based resize, since that's the tool these days. The default quality is 75% (for JPEG), but I've included it as a commonly desired customization. The ,, placeholder is from a different comment in this thread; it seems more self-documenting than the single , I'd traditionally use.

  find . -maxdepth 1 -name "*.jpg" -print0 |\
  xargs -0P $(nproc --all) -I,, gm convert -resize '200x200^>' -quality 75 ,, "thumbnails/,,"
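The -print0/-0 pair is what makes this robust: names are delimited by NUL bytes, which cannot occur in a pathname, so even newlines in filenames survive. A sketch of the difference (bash syntax assumed for the $'...' filename):

```shell
# Hypothetical demo: one file whose name contains an embedded newline.
mkdir -p /tmp/nul-demo && cd /tmp/nul-demo
touch $'new\nline.jpg'
# Line-based counting miscounts the single file as two:
find . -maxdepth 1 -name '*.jpg' | wc -l                       # → 2
# NUL-delimited counting sees exactly one argument:
find . -maxdepth 1 -name '*.jpg' -print0 | xargs -0 sh -c 'echo "$#"' sh   # → 1
```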


I encourage you to give it a try again. Almost every use of xargs that I ever did looked like this:

  ls *.jpg | xargs -I,, resize 200x200 ,, thumbnails/,,

I just always define the placeholder as ,, (you can pick something else, but ,, is nice and unique) and write commands the way you do.


I'm more likely to write that like:

  for i in *.jpg; resize 200x200 "$i" "thumbnails/$i"; end
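That looks like fish syntax; in bash or POSIX sh the same loop would read roughly as below (a sketch, with cp standing in for the hypothetical resize command):

```shell
# bash/POSIX sh version of the loop above; cp stands in for resize.
mkdir -p /tmp/loop-demo/thumbnails && cd /tmp/loop-demo
touch a.jpg 'b c.jpg'
for i in *.jpg; do
  cp "$i" "thumbnails/$i"   # quoting "$i" keeps spaces in names intact
done
ls thumbnails/
```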


Does that not fail when you hit the maximum command line length? Doesn't the entirety of the directory get splatted? Isn't this the whole reason xargs exists?


No, it does not fail. The maximum command line length is enforced by the operating system, not the shell: you can't launch a program with too many arguments, and you can't launch one whose argument strings together are too long.

But when you execute a for loop in bash/sh, 'for' is not a program that gets launched; it's a keyword that's interpreted, and the glob is likewise expanded by the shell itself.

Thus, no, that does not fail when you hit the maximum command line length (ARG_MAX; POSIX guarantees only 4096 bytes, and most modern *nix systems allow far more). It'll fail at other limits, but those limits exist in bash and are much larger. If you want to move to a stream-processing approach to avoid any limits, that is possible, though it's probably also a sign you shouldn't be using the shell.
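The actual limit varies by system and can be inspected directly; GNU xargs can also report how it plans to batch against it:

```shell
# ARG_MAX is the OS limit on the combined size of argv plus environment:
getconf ARG_MAX          # e.g. 2097152 on a typical modern Linux
# GNU xargs (findutils) reports the limits it will honor:
xargs --show-limits < /dev/null 2>&1 | head -n 5
```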


That's right. I tested this just now in a directory with 1,000,000 files:

  $ for i in *; do echo $i; done | wc -l
  1000000
I'm a little bummed that it failed in fish shell, but wouldn't begrudge the author if they replied "don't do that".


The for loop only runs resize once per file, so no, the entire directory does not get splatted onto one command line. It is unlikely you'd hit the maximum command length.

At least on macOS, the max command length is 1,048,576 bytes, while the maximum path length is 1,024 bytes. There might be some Unix variant where the max path length is close enough to the max command length to cause an overflow, but I doubt that's the case for common ones.

xargs exists to build command lines out of the output of other commands. You could, for instance, have awk emit file names for xargs to pack into command invocations, built from arbitrary records awk has read. Note that xargs still has to obey the command line length limit, since the command line needs to be passed to the program; it handles this by splitting the input across multiple invocations rather than overflowing a single one. Still, I would always use globbing when I have the choice.
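That batching behavior is easy to see: without options, GNU xargs packs as many arguments as fit into one invocation, and -n caps the batch size, forcing more invocations. A small sketch:

```shell
# Ten inputs fit in one echo invocation, producing one output line:
seq 10 | xargs echo | wc -l        # → 1
# Capped at three arguments per invocation, echo runs four times:
seq 10 | xargs -n3 echo | wc -l    # → 4
```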

EDIT: If you mean that the directory is splatted inside the for loop, then in a theoretical sense it is. However, since "for" is a shell builtin, it does not have to care about command line length limits, to my knowledge.


Yes, this is an issue, absolutely.

I've seen some image directories with more than a million files in them.


This shouldn't overrun the command line length for resize, since resize only gets fed one filename at a time. I do think the for loop would need to hold all the filenames in memory in a naive shell implementation (and I would assume most shells are naive in this respect). The for loop's practical limit is probably the amount of RAM available, though, and I find it improbable that one could overflow RAM with pathnames alone on a PC: a million files at 100 characters per name is still well under a gigabyte. If that were an issue, one would indeed have to use "find" with "-exec" instead, to ensure all the file names are never held in memory at the same time.
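That streaming find variant could look something like this (a sketch, with cp standing in for the hypothetical resize command; the {} + form hands find's matches to the inner shell in batches, so nothing ever holds the full listing):

```shell
# Hypothetical streaming version: find feeds batches of names to an inner sh.
mkdir -p /tmp/stream-demo/thumbnails && cd /tmp/stream-demo
touch a.jpg b.jpg
find . -maxdepth 1 -name '*.jpg' -exec sh -c '
  for f do cp "$f" "thumbnails/$f"; done   # loop over this batch only
' sh {} +
ls thumbnails/
```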


Exactly. There are so many limits in the shell that I don’t want to be bothered to think about. When I get serious, I just write Python.



