Shell: list directories ordered by file count (including in subdirectories)

I've nearly reached my limit for the permitted number of files in my Linux home directory, and I'm curious about where all the files are.

In any directory I can use, for example, find . -type f | wc -l to show a count of how many files are in that directory and in its subdirectories, but what I'd like is to be able to generate a complete list of all subdirectories (and sub-subdirectories etc.), each with a count of all files contained in it and its subdirectories, if possible ranked by count, descending.

Eg if my file structure looks like this:

Home/
  file1.txt
  file2.txt
  Docs/
    file3.txt
    Notes/
      file4.txt
      file5.txt
    Queries/
      file6.txt
  Photos/
    file7.jpg

The output would be something like this:

7  Home
4  Home/Docs
2  Home/Docs/Notes
1  Home/Docs/Queries
1  Home/Photos

Any suggestions greatly appreciated. (Also a quick explanation of the answer, so I can learn from this!). Thanks.


I use the following command

find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n

Which produces something like:

[root@ip-***-***-***-*** /]# find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
      1 .autofsck
      1 stat-nginx-access
      1 stat-nginx-error
      2 tmp
     14 boot
     88 bin
    163 sbin
    291 lib64
    597 etc
    841 opt
   1169 root
   2900 lib
   7634 home
  42479 usr
  80964 var
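
To unpack the pipeline: find prints every file path starting with "./", cut keeps the second "/"-separated field (the top-level directory name), and sort | uniq -c counts how often each name occurs. One quirk is that files sitting directly in the starting directory are listed under their own names (.autofsck above is presumably one such file). A possible variant, as a sketch only, lumps those under "." instead; the function name top_level_counts is made up for illustration, and paths containing newlines are assumed not to occur:

```shell
# top_level_counts [DIR]: files per top-level entry of DIR (hypothetical helper).
top_level_counts() {
    # Run find from inside DIR so every path looks like ./name/...
    ( cd "${1:-.}" && find . -xdev -type f -print ) |
        awk -F/ '
            # NF == 2 means "./file": a file directly in DIR, grouped under "."
            { key = (NF > 2) ? $2 : "."; count[key]++ }
            END { for (k in count) printf "%7d %s\n", count[k], k }
        ' |
        LC_ALL=C sort -n
}

# Usage: top_level_counts /
```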


This should work:

find ~ -type d -exec sh -c 'printf "%s\t%s\n" "$(find "$1" -type f | wc -l)" "$1"' sh {} \; | sort -nr

Explanation: the command above runs "find ~ -type d" to find every sub-directory of the home directory (the home directory itself included). For each one, it runs a short shell script that counts the files beneath that directory (using the "find $dir -type f | wc -l" command that you already know) and prints the count followed by the directory name. The sort command then orders the lines by file count, descending.

This is not the most efficient solution (you end up scanning the same directory many times), but I am not sure you can do much better with a one liner :-)
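
For what it's worth, the repeated scanning can be avoided by walking the tree once and crediting each file to every directory above it. The following is only a sketch under some assumptions: the name count_tree is made up, the starting directory is not "/", no path contains a newline, and directories with no files anywhere beneath them are simply not listed.

```shell
# count_tree ROOT: cumulative file counts from a single find pass (hypothetical helper).
count_tree() {
    root=${1%/}    # drop one trailing slash so paths compare equal to ROOT
    find "$root" -type f -print | awk -v root="$root" '
    {
        dir = $0
        # Strip the last path component repeatedly: the first pass removes the
        # file name, later passes climb one directory at a time up to ROOT.
        while (sub("/[^/]*$", "", dir)) {
            count[dir]++
            if (dir == root) break
        }
    }
    END { for (d in count) printf "%d\t%s\n", count[d], d }
    ' | LC_ALL=C sort -k1,1nr -k2    # largest count first, ties ordered by path
}

# Usage: count_tree ~ | head -n 20
```

On the Home example from the question this should give the same five lines the question asks for, with a tab between count and path.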


countFiles () {
    # call the recursive function, throw away stdout and send stderr to stdout
    # then sort numerically
    countFiles_rec "$1" 2>&1 >/dev/null | sort -nr
}

countFiles_rec () {
    local -i nfiles 
    dir="$1"

    # count the number of files in this directory only
    nfiles=$(find "$dir" -mindepth 1 -maxdepth 1 -type f -print | wc -l)

    # loop over the subdirectories of this directory
    while IFS= read -r subdir; do

        # invoke the recursive function for each one 
        # save the output in the positional parameters
        set -- $(countFiles_rec "$subdir")

        # accumulate the number of files found under the subdirectory
        (( nfiles += $1 ))

    done < <(find "$dir" -mindepth 1 -maxdepth 1 -type d -print)

    # print the number of files here, to both stdout and stderr
    printf "%d %s\n" $nfiles "$dir" | tee /dev/stderr
}


countFiles Home

produces

7 Home
4 Home/Docs
2 Home/Docs/Notes
1 Home/Photos
1 Home/Docs/Queries


Simpler and more efficient, though it counts only the files directly inside each directory, not those in its subdirectories:

find ~ -type f -exec dirname {} \; | sort | uniq -c | sort -nr


find . -type d -exec sh -c '(echo -n "{} "; ls {} | wc -l)' \; | sort -n -k 2

This is pretty efficient.

It will display the counts in ascending order (i.e. largest at the end). To get them in descending order, add the "-r" option to "sort".

If you run this command in the "/" directory, it will scan the entire filesystem and show which directories directly contain the most files and sub-directories. It's a good way to see where all your inodes are being used.

Note: this will not work for directories that contain spaces, but you could modify it to work in that case, if it's a problem for you.
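
One way to make it cope with spaces, sketched here rather than offered as the definitive fix, is to pass each directory to the inner shell as an argument instead of pasting its name into the script text. The function name entry_counts is made up; this version also prints the count first (so a plain numeric sort works) and uses ls -A, which includes hidden entries that the original omits. Names containing newlines would still inflate the count.

```shell
# entry_counts [DIR]: number of direct entries in each directory under DIR
# (hypothetical helper; counts files and sub-directories, hidden ones included).
entry_counts() {
    find "${1:-.}" -type d -exec sh -c '
        for d do
            # $(( ... )) also strips any padding some wc implementations add
            n=$(( $(ls -A "$d" | wc -l) ))
            printf "%d\t%s\n" "$n" "$d"
        done' sh {} + | LC_ALL=C sort -k1,1n -k2
}

# Usage: entry_counts / | tail -n 20
```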


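See the following example, which sorts by column 2 in reverse order using sort -k 2 -r. -k 2 makes the sort key start at column 2 (columns are separated by whitespace), and -r reverses the order. Note that the key is compared as text rather than as a number unless you also pass -n.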

# ls -lF /mnt/sda1/var/lib/docker/165536.165536/aufs/mnt/ | sort -k 2 -r
total 972
drwxr-xr-x   65 165536   165536        4096 Jun  5 12:23 ad45ea3c6a03aa958adaa4d5ad6fc25d31778961266972a69291d3664e3f4d37/
drwxr-xr-x   19 165536   165536        4096 Jun  6 06:46 7fa7f957669da82a8750e432f034be6f0a9a7f5afc0a242bb00eb8024f77d683/
drwxr-xr-x    2 165536   165536        4096 May  8 02:20 49e067ffea226cfebc8b95410e90c4bad6a0e9bc711562dd5f98b7d755fe6efb/
drwxr-xr-x    2 165536   165536        4096 May  8 01:19 45ec026dd49c188c68b55dcf98fda27d1f9dd32f825035d94849b91c433b6dd3/
drwxr-xr-x    2 165536   165536        4096 Mar 13 06:08 0d6e95d4605ab34d1454de99e38af59a267960999f408f720d0299ef8d90046e/
drwxr-xr-x    2 165536   165536        4096 Mar 13 02:25 e9b252980cd573c78065e8bfe1d22f01b7ba761cc63d3dbad284f5d31379865a/
drwxr-xr-x    2 165536   165536        4096 Mar 13 02:24 f4aa333b9c208b18faf00b00da150b242a7a601693197c1f1ca78b9ab2403409/
drwxr-xr-x    2 165536   165536        4096 Mar 13 02:24 3946669d530695da2837b2b5ed43afa11addc25232b29cc085a19c769425b36b/
drwxr-xr-x    2 165536   165536        4096 Mar 11 11:11 44293f77f63806a58d9b97c3c9f7f1397b6f0935e236250e24c9af4a73b3e35b/
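
If the column really is numeric, a key restricted to that one field with numeric comparison may be more robust. A small sketch, with a made-up function name, that reads lines on standard input:

```shell
# by_second_column_desc: sort stdin by field 2 only, numerically, largest first
# (hypothetical helper). -k2,2 stops the key at the end of field 2; without
# the ",2" the key would run to the end of the line.
by_second_column_desc() {
    LC_ALL=C sort -k2,2nr
}

# Usage: ls -l /some/dir | by_second_column_desc
```

With the three input lines "a 9", "b 10" and "c 2", this puts "b 10" first, whereas a purely textual comparison would rank "9" above "10".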


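If, however, you are fine with the non-cumulative solution using dirname (see answer of wjb), then the following is far more efficient, because xargs hands many paths to each dirname call instead of starting one process per file. It relies on a dirname that accepts several arguments, as the GNU coreutils version does: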

find ~ -type f -print0 | xargs -0 dirname | sort | uniq -c | sort -n

Note that this does not display empty dirs. For that you may do find ~ -type d -empty if your version of find supports it.
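
With GNU find specifically, dirname can be skipped altogether, since the -printf '%h\n' format prints each file's parent directory directly. This is a sketch that assumes GNU find (other implementations may not have -printf), and direct_counts is a made-up name; like the commands above it is non-cumulative and assumes no newlines in paths.

```shell
# direct_counts [DIR]: files directly inside each directory under DIR
# (hypothetical helper, requires GNU find for -printf).
direct_counts() {
    find "${1:-.}" -type f -printf '%h\n' |
        LC_ALL=C sort |
        uniq -c |
        LC_ALL=C sort -k1,1nr -k2    # largest count first, ties ordered by path
}

# Usage: direct_counts ~ | head -n 20
```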
