
Script to distribute a large number of files into smaller groups

I have folders containing large numbers of files (e.g. 1000+) of various sizes, which I want to move into smaller groups of, say, 100 files per folder.

I wrote an AppleScript that counted the files, created a numbered subfolder, and then moved 100 files (the number could be specified) into the new folder. It looped until fewer than the specified number of files remained, and moved those into the last folder it created.

The problem was that it ran horrendously slowly. I'm looking for either an AppleScript or a shell script I can run on my MacBook and/or Linux box that will efficiently move the files into smaller groups.

How the files are grouped is not particularly significant; I just want fewer files in each folder.


This should get you started:

#!/bin/sh
# Usage: script DIR BATCH_SIZE SUBFOLDER_NAME
DIR=$1
BATCH_SIZE=$2
SUBFOLDER_NAME=$3
COUNTER=1

# Keep going while any regular files remain directly inside DIR.
while [ -n "$(find "$DIR" -maxdepth 1 -type f | head -n 1)" ] ; do
  NEW_DIR=$DIR/${SUBFOLDER_NAME}${COUNTER}
  mkdir "$NEW_DIR" || exit 1
  # Move up to BATCH_SIZE files; the final pass takes whatever is left.
  find "$DIR" -maxdepth 1 -type f | head -n "$BATCH_SIZE" | xargs -I {} mv {} "$NEW_DIR"
  COUNTER=$((COUNTER + 1))
done

The final pass through the loop moves whatever files are left, even if there are fewer than BATCH_SIZE of them. File names containing quotes or newlines will still confuse xargs, so add further checks as needed once you have adapted it for your use.


This is a tremendous kludge, but it shouldn't be too terribly slow:

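# Start from a clean counter file; -f avoids an error if none exists yet.
rm -f /tmp/counter*
touch /tmp/counter1
# Each sh invocation receives up to 100 file names, reads the current
# batch number from the counter file's name, and bumps it for the next batch.
find /source/dir -type f -print0 |
    xargs -0 -n 100 \
        sh -c 'n=$(echo /tmp/counter*); \
               n=${n#/tmp/counter}; \
               counter="/tmp/counter$n"; \
               mv "$counter" "/tmp/counter$((n+1))"; \
               mkdir -p "/dest/dir/$n"; \
               mv "$@" "/dest/dir/$n"' _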

It's completely indiscriminate as to which files go where.
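
The same single-pass batching can also be sketched without the temporary counter file, by collecting file names in the shell's positional parameters and issuing one mv per batch. This is only a minimal sketch under some assumptions: the function name split_into_batches and its arguments are made up for illustration, it only looks at files directly inside the source folder (it does not recurse the way find does), it skips hidden files, and the destination should be a separate folder.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answers).
# Usage: split_into_batches SRC DEST [SIZE]
# Moves the regular files directly inside SRC into DEST/1, DEST/2, ...
# with at most SIZE files per folder (default 100).
split_into_batches() {
    src=$1 dest=$2 size=${3:-100}
    n=1
    set --                          # positional parameters hold the pending batch
    for f in "$src"/*; do
        [ -f "$f" ] || continue     # skip subfolders and an unmatched glob
        set -- "$@" "$f"
        if [ "$#" -ge "$size" ]; then
            # One mkdir and one mv per full batch.
            mkdir -p "$dest/$n" && mv -- "$@" "$dest/$n/"
            set --
            n=$((n + 1))
        fi
    done
    # Move the leftover files, if any, into the last folder.
    if [ "$#" -gt 0 ]; then
        mkdir -p "$dest/$n" && mv -- "$@" "$dest/$n/"
    fi
}
```

For example, split_into_batches /source/dir /dest/dir 100 would run mv once per hundred files rather than once per file, and the directory is only listed once.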


The most common way to solve the problem of directories with too many files in them is to subdivide by the first couple of characters of the name. For example:

Before:

aardvark
apple
architect
...
zebra
zork

After:

a/aardvark
a/apple
a/architect
b/...
...
z/zebra
z/zork

If that isn't subdividing well enough, then go one step further:

a/aa/aardvark
a/ap/apple
a/ar/architect
...
z/ze/zebra
z/zo/zork

This should work quite quickly, because the move command that your script executes can use simple glob expansion to select all the files to move, à la mv aa* a/aa, instead of running a separate move command for each file (which would be my first guess as to why the original script was slow).
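
The prefix scheme above could be sketched roughly as follows. This is an illustration rather than a definitive implementation: the function name split_by_prefix and its arguments are hypothetical, it assumes the destination is a separate folder, and it ignores hidden files. Instead of hard-coding globs such as aa*, it walks the sorted file list once and issues a single mv for each run of names sharing a prefix, which has the same effect as one glob-based mv per group.

```shell
#!/bin/sh
# Hypothetical helper (not from the original answer).
# Usage: split_by_prefix SRC DEST [LEN]
# Moves each regular file directly inside SRC into DEST/<prefix>/, where
# <prefix> is the first LEN characters of the file name (default 1).
split_by_prefix() {
    src=$1 dest=$2 len=${3:-1}
    # Build a pattern of LEN question marks, e.g. "??" for LEN=2.
    pat=
    i=0
    while [ "$i" -lt "$len" ]; do pat="$pat?"; i=$((i + 1)); done

    current=
    set --                          # positional parameters hold the pending group
    for f in "$src"/*; do
        [ -f "$f" ] || continue     # skip subfolders and an unmatched glob
        name=${f##*/}
        rest=${name#$pat}           # name with its first LEN characters removed
        prefix=${name%"$rest"}      # what was removed, i.e. the prefix
        [ -n "$prefix" ] || prefix=$name   # names shorter than LEN
        # When the prefix changes, move the finished group with one mv.
        if [ "$prefix" != "$current" ] && [ "$#" -gt 0 ]; then
            mkdir -p "$dest/$current" && mv -- "$@" "$dest/$current/"
            set --
        fi
        current=$prefix
        set -- "$@" "$f"
    done
    if [ "$#" -gt 0 ]; then
        mkdir -p "$dest/$current" && mv -- "$@" "$dest/$current/"
    fi
}
```

For example, split_by_prefix /source/dir /dest/dir 2 would place aardvark in /dest/dir/aa/. Note that this gives a flat aa folder rather than the nested a/aa layout; running it again inside each first-level folder would be one way to get the nested form.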
