Saturday, May 10, 2008

Bash Shell Hack: Picking a random set of files from a directory

I needed to test out some work just completed, and to do so, I needed to run some files against a process. Problem is, there are 1000 files to choose from, and I only want to test a handful.

But which handful? Ideally, it would be random.

So here's my quick shell hack to give me 15 random files from a given directory:

 ls | while read x; do echo "`expr $RANDOM % 1000`:$x"; done \
     | sort -n| sed 's/[0-9]*://' | head -15

This works by:

  • Generating a list of every file in the directory
  • Reading each file name into the variable x
  • Prepending a random number (thanks to the magic $RANDOM variable in bash), which expr reduces to the range 0 to 999
  • Sorting that listing numerically, which means sorting on the random number
  • Stripping off the random number
  • Showing only the first 15 items from the stream
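
The steps above can also be wrapped in a small function. This is a minimal sketch rather than the one-liner itself: the name pick_random and its two arguments are made up for illustration, it uses printf in place of echo and expr, and it assumes bash and file names without embedded newlines.

```shell
#!/usr/bin/env bash
# pick_random is a hypothetical helper: print COUNT random entries from DIR.
# Assumes bash (for $RANDOM) and file names without embedded newlines.
pick_random() {
    local dir=${1:-.} count=${2:-15}
    ls "$dir" |                            # list every entry in the directory
    while IFS= read -r x; do               # read each name into x, unmodified
        printf '%s:%s\n' "$RANDOM" "$x"    # prepend a random number and a colon
    done |
    sort -n |                              # numeric sort, i.e. by the random key
    sed 's/^[0-9]*://' |                   # strip the random key again
    head -n "$count"                       # keep only the first COUNT lines
}
```

Something like pick_random /some/dir 5 would then print five randomly chosen names from that directory.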

Of course, this isn't efficient - but it doesn't need to be. Heck, it took longer to write up the blog post describing it than it did to actually write the code.
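
Much of the cost probably comes from expr, an external program that the loop starts once per file (and starting processes tends to be slow under cygwin). Here is a hedged sketch of the same prefixing step using bash's built-in arithmetic expansion, which stays inside the shell. The name tag_lines is invented here, not part of the original hack.

```shell
#!/usr/bin/env bash
# tag_lines is a hypothetical helper: prefix each input line with "N:",
# where N is a random number from 0 to 999, without starting any process.
tag_lines() {
    while IFS= read -r x; do
        # $(( ... )) is evaluated by bash itself, unlike `expr`.
        printf '%s:%s\n' "$((RANDOM % 1000))" "$x"
    done
}
```

It would slot into the pipeline as: ls | tag_lines | sort -n | sed 's/^[0-9]*://' | head -15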

I so love Unix. And I'm running this under cygwin on a Windows box. See, there's no excuse for not using Unix.

15 comments:

  1. Richard 10:10 AM

    Creative script, and a nice photo of you and your girlfriend, by the way. In my own script I used a trick I read somewhere else: the -R option of GNU sort does a random sort. That option is not in the POSIX specification of sort on unix.org, though.

  2. Anonymous 4:09 PM

    It works. Thanks!

  3. Anonymous 1:42 PM

    sed definitely works in this instance, but perhaps using cut would be more elegant?

    eg:
    cut -d':' -f2-

    Just my two cents; in the end, as long as it works, it's all good.

    Thanks for this; it's definitely helping!

  4. I always forget about cut -- that's a good point.

    -Ben

  5. Anonymous 6:42 PM

    That's pretty elegant. Thanks

  6. Anonymous 7:49 PM

    Just found out about this (does same thing):

    ls |sort -R | head -n5

  7. When using this in something like a 'find' command, it helps to use 'expr' like so:

    find /home/ -name "*.mp3" -exec \
    bash -c 'echo "`expr $RANDOM` $1"' _ {} ";" \
    | sort -n \
    | cut -d' ' -f2-

  8. ls | sort -R | head -1

  9. ls | sort -R | head -15

  10. How would one then have the script open the output? Thanks for your help.

    Evan

    Replies
    1. Anonymous 7:46 PM

      You can do what you like with a while read loop, e.g.:
      ls | sort -R | head -15 | while read -r file
      do
      cat "$file"
      done

  11. Raamkum 5:11 PM

    Great. This works like a gem. Is it possible to also copy each file named in the $x variable to a different folder? If so, how?

    Replies
    1. Anonymous 7:48 PM

      yes:
      ls | sort -R | head -15 | while read -r file
      do
      cp "$file" /tmp/
      done

      Just replace /tmp/ with whichever folder you want to copy to.

  12. Just do it more efficiently by omitting `expr $RANDOM % 1000`, which gives no advantage:

    ls | while read x; do echo "$RANDOM:$x"; done | sort -n | cut -d':' -f2- | head -15

    time results with just "$RANDOM:$x"
    real 0m0.04s
    user 0m0.04s
    sys 0m0.05s

    time results with "`expr $RANDOM % 1000`:$x":
    real 0m1.85s
    user 0m0.50s
    sys 0m1.32s

  13. ladiko 12:42 PM

    What about the easy way?

    find -type f | shuf -n1
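
Note that find -type f also descends into subdirectories and returns a single file. A minimal sketch of the same idea restricted to one directory and returning several files follows. The name random_files and its arguments are hypothetical, and it assumes GNU find and GNU shuf are installed and that file names contain no newlines.

```shell
#!/usr/bin/env bash
# random_files is a hypothetical helper: print COUNT random regular files
# found directly inside DIR. Assumes GNU find and GNU shuf are available.
random_files() {
    local dir=${1:-.} count=${2:-15}
    # -maxdepth 1 keeps find from descending into subdirectories;
    # shuf -n prints at most COUNT of the lines it reads, in random order.
    find "$dir" -maxdepth 1 -type f | shuf -n "$count"
}
```

For example, random_files . 15 would print up to 15 paths from the current directory.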
