How to count the number of files in a directory using Python
How do I count only the files in a directory? This counts the directory itself as a file:
len(glob.glob('*'))
os.listdir() will be slightly more efficient than using glob.glob. To test if a filename is an ordinary file (and not a directory or other entity), use os.path.isfile():
import os, os.path
# simple version for working with CWD
print(len([name for name in os.listdir('.') if os.path.isfile(name)]))
# path joining version for other paths
DIR = '/tmp'
print(len([name for name in os.listdir(DIR) if os.path.isfile(os.path.join(DIR, name))]))
import os
_, _, files = next(os.walk("/usr/lib"))
file_count = len(files)
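For readers unfamiliar with os.walk: each item it yields is a (dirpath, dirnames, filenames) tuple, and the first one describes the top-level directory, which is why a single next() is enough here. A minimal sketch of that idea, run against a throwaway temporary directory (the function name and the file names are made up for illustration):

```python
import os
import tempfile

def count_top_level_files(directory):
    """Count the files directly inside `directory` (no recursion)."""
    # The first tuple yielded by os.walk describes `directory` itself.
    _, _, files = next(os.walk(directory))
    return len(files)

# Demo on a temporary directory; all names here are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    open(os.path.join(tmp, "b.txt"), "w").close()
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "c.txt"), "w").close()
    print(count_top_level_files(tmp))  # 2: the subfolder and its file are not counted
```

One thing to watch for: if the directory does not exist, os.walk yields nothing by default, so next() raises StopIteration rather than a file-not-found error.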
For all kinds of files, subdirectories included:
import os
lst = os.listdir(directory)  # your directory path
number_files = len(lst)
print(number_files)
Only files (avoiding subdirectories):
import os
onlyfiles = next(os.walk(directory))[2]  # directory is your directory path as a string
print(len(onlyfiles))
This is where fnmatch comes in very handy:
import fnmatch
import os
print(len(fnmatch.filter(os.listdir(dirpath), '*.txt')))
More details: http://docs.python.org/2/library/fnmatch.html
If you want to count all files in the directory - including files in subdirectories, the most pythonic way is:
import os
file_count = sum(len(files) for _, _, files in os.walk(r'C:\Dropbox'))
print(file_count)
We use sum, which is faster than explicitly adding up the file counts (timings pending).
An answer with pathlib and without loading the whole list to memory:
from pathlib import Path
path = Path('.')
print(sum(1 for _ in path.glob('*'))) # Files and folders, not recursive
print(sum(1 for _ in path.glob('**/*'))) # Files and folders, recursive
print(sum(1 for x in path.glob('*') if x.is_file())) # Only files, not recursive
print(sum(1 for x in path.glob('**/*') if x.is_file())) # Only files, recursive
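As a possible variation on the pathlib approach: Path.iterdir() lists a directory's immediate entries, and Path.rglob('*') is shorthand for glob('**/*'). The sketch below wraps both in one function. The function name and the `recursive` flag are my own, not part of the answer above.

```python
from pathlib import Path

def count_files(directory, recursive=False):
    """Count regular files under `directory` without building a list."""
    root = Path(directory)
    # rglob("*") walks all subdirectories; iterdir() stays at the top level.
    entries = root.rglob("*") if recursive else root.iterdir()
    return sum(1 for entry in entries if entry.is_file())
```

Calling count_files('.') gives the non-recursive count and count_files('.', recursive=True) the recursive one.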
Short and simple
import os
directory_path = '/home/xyz/'
No_of_files = len(os.listdir(directory_path))
I am surprised that nobody mentioned os.scandir:
import os
def count_files(dir):
    return len([1 for x in os.scandir(dir) if x.is_file()])
import os
print(len(os.listdir(os.getcwd())))
import os
def directory(path, extension):
    list_dir = os.listdir(path)
    count = 0
    for file in list_dir:
        if file.endswith(extension):  # eg: '.txt'
            count += 1
    return count
This uses os.listdir and works for any directory:
import os
directory = 'mydirpath'
number_of_files = len([item for item in os.listdir(directory) if os.path.isfile(os.path.join(directory, item))])
This can be simplified with a generator and made a little bit faster with:
import os
isfile = os.path.isfile
join = os.path.join
directory = 'mydirpath'
number_of_files = sum(1 for item in os.listdir(directory) if isfile(join(directory, item)))
While I agree with the answer provided by @DanielStutzbach: os.listdir() will be slightly more efficient than using glob.glob.
However, one extra point: if you want to count only specific files in a folder, you want to use len(glob.glob()). For instance, to count all the PDFs in a folder:
import glob, os
# glob.glob1 is an undocumented helper; the documented glob.glob does the same job
pdfCounter = len(glob.glob(os.path.join(myPath, "*.pdf")))
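Building on the pattern-matching idea above, here is a small sketch that also skips directories whose names happen to match the pattern. The function name and its default pattern are hypothetical; glob.escape is used so that special characters such as '[' in the directory path are treated literally.

```python
import glob
import os

def count_matching_files(directory, pattern="*.pdf"):
    """Count regular files in `directory` whose names match `pattern`."""
    # Escape the directory part only; the pattern keeps its wildcards.
    full_pattern = os.path.join(glob.escape(directory), pattern)
    # iglob yields paths lazily, so no intermediate list is built.
    return sum(1 for path in glob.iglob(full_pattern) if os.path.isfile(path))
```

Like any glob pattern, "*.pdf" does not match names that start with a dot.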
This is an easy solution that counts the number of files in each sub-folder of a directory. It may come in handy:
import os
from pathlib import Path
def count_files(rootdir):
    '''counts the number of files in each subfolder in a directory'''
    for path in Path(rootdir).iterdir():
        if path.is_dir():
            print("There are " + str(len([name for name in os.listdir(path) \
            if os.path.isfile(os.path.join(path, name))])) + " files in " + \
            str(path.name))
count_files(data_dir)  # data_dir is the directory you want files counted.
You should get an output similar to this (with the placeholders changed, of course):
There are {number of files} files in {name of sub-folder1}
There are {number of files} files in {name of sub-folder2}
import os
def count_em(valid_path):
    x = 0
    for root, dirs, files in os.walk(valid_path):
        for f in files:
            x = x+1
    print("There are", x, "files in this directory.")
    return x
Taken from this post
import os
def count_files(in_directory):
    joiner = (in_directory + os.path.sep).__add__
    return sum(
        os.path.isfile(filename)
        for filename
        in map(joiner, os.listdir(in_directory))
    )
>>> count_files("/usr/lib")
1797
>>> len(os.listdir("/usr/lib"))
2049
Luke's code reformat.
import os
print(len(next(os.walk('/usr/lib'))[2]))
Here is a simple one-line command that I found useful:
print(int(os.popen("ls | wc -l").read()))
one liner and recursive:
def count_files(path):
    return sum([len(files) for _, _, files in os.walk(path)])
count_files('path/to/dir')
I used glob.iglob for a directory structure similar to
data
└───train
│ └───subfolder1
│ | │ file111.png
│ | │ file112.png
│ | │ ...
│ |
│ └───subfolder2
│ │ file121.png
│ │ file122.png
│ │ ...
└───test
│ file221.png
│ file222.png
Both of the following options return 4 (as expected, i.e. they do not count the subfolders themselves):
len(list(glob.iglob("data/train/*/*.png", recursive=True)))
sum(1 for i in glob.iglob("data/train/*/*.png"))
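A note on recursive=True: it only changes anything when the pattern contains `**`, which then matches zero or more directory levels. The sketch below rebuilds a small version of the tree above in a temporary directory to show the difference. The helper name count_pngs is my own, and the tree is recreated purely for illustration.

```python
import glob
import os
import tempfile

def count_pngs(root, pattern):
    """Count paths under `root` matching `pattern` (with ** support)."""
    full_pattern = os.path.join(glob.escape(root), pattern)
    return sum(1 for _ in glob.iglob(full_pattern, recursive=True))

# Rebuild a small version of the tree in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    for rel in ("train/subfolder1/file111.png", "train/subfolder1/file112.png",
                "train/subfolder2/file121.png", "train/subfolder2/file122.png",
                "test/file221.png", "test/file222.png"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    print(count_pngs(root, "train/*/*.png"))   # 4: exactly one level below train
    print(count_pngs(root, "train/**/*.png"))  # 4: any depth below train
    print(count_pngs(root, "**/*.png"))        # 6: train and test together
```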
It is simple:
print(len([iq for iq in os.scandir('PATH')]))
This counts the entries in the directory. I used a list comprehension to iterate over the directory and collect every entry that os.scandir returns, and len() of that list gives the count. Note that os.scandir yields subdirectories as well as files, so they are included in the total.
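import os
directory = '<directory path>'
total_con = os.listdir(directory)
files = []
for f_n in total_con:
    # join with the directory so the check also works outside the CWD
    if os.path.isfile(os.path.join(directory, f_n)):
        files.append(f_n)
print(len(files))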
If you use the operating system's standard shell, you can get the result much faster than with a pure Python approach.
Example for Windows:
import os
import subprocess
def get_num_files(path):
    cmd = 'DIR \"%s\" /A-D /B /S | FIND /C /V ""' % path
    return int(subprocess.check_output(cmd, shell=True))
I found another answer which may be as correct as the accepted one.
import os
datafiles = []
for root, dirs, files in os.walk(input_path):
    for name in files:
        if os.path.splitext(name)[1] == '.TXT' or os.path.splitext(name)[1] == '.txt':
            datafiles.append(os.path.join(root, name))
# count the collected .txt/.TXT paths, not just the last directory's files
print(len(datafiles))
A simple utility function I wrote that makes use of os.scandir() instead of os.listdir().
import os
def count_files_in_dir(path: str) -> int:
    file_entries = [entry for entry in os.scandir(path) if entry.is_file()]
    return len(file_entries)
The main benefit is that the need for os.path.isfile() is eliminated and replaced with the os.DirEntry instance's is_file(), which also removes the need for os.path.join(DIR, file_name) as shown in other answers.
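A possible refinement of the os.scandir approach, sketched here under the assumption of Python 3.6 or later, where the scandir iterator can be used as a context manager: count with a generator instead of a list, and expose DirEntry.is_file's follow_symlinks option. The function name is hypothetical.

```python
import os

def count_files_scandir(path, follow_symlinks=True):
    """Count regular files directly inside `path` using os.scandir."""
    # The with-block closes the scandir iterator as soon as counting is done.
    with os.scandir(path) as entries:
        return sum(
            1 for entry in entries
            if entry.is_file(follow_symlinks=follow_symlinks)
        )
```

Passing follow_symlinks=False leaves out symbolic links that point at files, which may or may not be what you want.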
Simpler one:
import os
number_of_files = len(os.listdir(directory))
print(number_of_files)
I did this, and it returned the number of files in the folder (Attack_Data). It works fine.
import os
def fcount(path):
    # Counts the number of files in a directory
    count = 0
    for f in os.listdir(path):
        if os.path.isfile(os.path.join(path, f)):
            count += 1
    return count
path = r"C:\Users\EE EKORO\Desktop\Attack_Data"  # Read files in folder
print(fcount(path))
I solved this problem while counting the files in a Google Drive directory from Google Colab by first changing into the directory:
import os
%cd /content/drive/My Drive/
print(len([x for x in os.listdir('folder_name/')]))
A normal user can try:
import os
os.chdir('Desktop/Maheep/')
print(len([x for x in os.listdir('folder_name/')]))
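Convert the result to a list, and then you can call len() on it (glob.glob already returns a list, so the list() call is only needed for iterators such as glob.iglob):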
len(list(glob.glob('*')))
I find that sometimes I don't know if I will receive filenames or the path to the file. So I printed the os.walk solution output:
import os
from pathlib import Path
from typing import Union
def count_number_of_raw_data_point_files(path: Union[str, Path], with_file_prefix: str) -> int:
    # the original called a helper, force_expanduser, that is not shown;
    # Path.expanduser() is assumed here to be equivalent
    path = Path(path).expanduser()
    _, _, files = next(os.walk(path))
    # file_count = len(files)
    count: int = 0
    for filename in files:
        print(f'-->{filename=}')  # e.g. print -->filename='data_point_99.json'
        if with_file_prefix in filename:
            count += 1
    return count
out:
-->filename='data_point_780.json'
-->filename='data_point_781.json'
-->filename='data_point_782.json'
-->filename='data_point_783.json'
-->filename='data_point_784.json'
-->filename='data_point_785.json'
-->filename='data_point_786.json'
-->filename='data_point_787.json'
-->filename='data_point_788.json'
-->filename='data_point_789.json'
-->filename='data_point_79.json'
-->filename='data_point_790.json'
-->filename='data_point_791.json'
-->filename='data_point_792.json'
-->filename='data_point_793.json'
-->filename='data_point_794.json'
-->filename='data_point_795.json'
-->filename='data_point_796.json'
-->filename='data_point_797.json'
-->filename='data_point_798.json'
-->filename='data_point_799.json'
-->filename='data_point_8.json'
-->filename='data_point_80.json'
-->filename='data_point_800.json'
-->filename='data_point_801.json'
-->filename='data_point_802.json'
-->filename='data_point_803.json'
-->filename='data_point_804.json'
-->filename='data_point_805.json'
-->filename='data_point_806.json'
-->filename='data_point_807.json'
-->filename='data_point_808.json'
-->filename='data_point_809.json'
-->filename='data_point_81.json'
-->filename='data_point_810.json'
-->filename='data_point_811.json'
-->filename='data_point_812.json'
-->filename='data_point_813.json'
-->filename='data_point_814.json'
-->filename='data_point_815.json'
-->filename='data_point_816.json'
-->filename='data_point_817.json'
-->filename='data_point_818.json'
-->filename='data_point_819.json'
-->filename='data_point_82.json'
-->filename='data_point_820.json'
-->filename='data_point_821.json'
-->filename='data_point_822.json'
-->filename='data_point_823.json'
-->filename='data_point_824.json'
-->filename='data_point_825.json'
-->filename='data_point_826.json'
-->filename='data_point_827.json'
-->filename='data_point_828.json'
-->filename='data_point_829.json'
-->filename='data_point_83.json'
-->filename='data_point_830.json'
-->filename='data_point_831.json'
-->filename='data_point_832.json'
-->filename='data_point_833.json'
-->filename='data_point_834.json'
-->filename='data_point_835.json'
-->filename='data_point_836.json'
-->filename='data_point_837.json'
-->filename='data_point_838.json'
-->filename='data_point_839.json'
-->filename='data_point_84.json'
-->filename='data_point_840.json'
-->filename='data_point_841.json'
-->filename='data_point_842.json'
-->filename='data_point_843.json'
-->filename='data_point_844.json'
-->filename='data_point_845.json'
-->filename='data_point_846.json'
-->filename='data_point_847.json'
-->filename='data_point_848.json'
-->filename='data_point_849.json'
-->filename='data_point_85.json'
-->filename='data_point_850.json'
-->filename='data_point_851.json'
-->filename='data_point_852.json'
-->filename='data_point_853.json'
-->filename='data_point_86.json'
-->filename='data_point_87.json'
-->filename='data_point_88.json'
-->filename='data_point_89.json'
-->filename='data_point_9.json'
-->filename='data_point_90.json'
-->filename='data_point_91.json'
-->filename='data_point_92.json'
-->filename='data_point_93.json'
-->filename='data_point_94.json'
-->filename='data_point_95.json'
-->filename='data_point_96.json'
-->filename='data_point_97.json'
-->filename='data_point_98.json'
-->filename='data_point_99.json'
854
Note that you might have to sort the filenames yourself: os.walk does not guarantee any particular order.
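If the goal is to get names such as data_point_8.json before data_point_79.json, a plain sorted() is not enough, because strings compare character by character. One way to do it, sketched under the assumption that each name contains a single number, is to sort on that number. The helper name numeric_key is my own.

```python
import re

def numeric_key(filename):
    """Sort key: the first run of digits in the name, as an int (-1 if none)."""
    match = re.search(r"\d+", filename)
    return int(match.group()) if match else -1

names = ["data_point_780.json", "data_point_79.json", "data_point_8.json"]
print(sorted(names, key=numeric_key))
# ['data_point_8.json', 'data_point_79.json', 'data_point_780.json']
```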
I would like to extend the reply from @Mr_and_Mrs_D:
import os
folder = 'C:/Dropbox'
file_count = sum(len(files) for _, _, files in os.walk(folder))
print(file_count)
This counts all the files in the folder and its subfolders. However, if you want to do some filtering, like only counting the files ending in .svg, you can do:
import os
file_count = sum(len([f for f in files if f.endswith('.svg')]) for _, _, files in os.walk(folder))
print(file_count)
You basically replace:
len(files)
with:
len([f for f in files if f.endswith('.svg')])
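The same filtering idea can be stretched a little further. str.endswith accepts a tuple, so several extensions can be checked at once, and lower-casing the name makes '.SVG' count as well. This is only a sketch; the function name and its parameters are hypothetical.

```python
import os

def count_files_with_extensions(folder, extensions):
    """Recursively count files whose names end with any of `extensions`."""
    # Lower-case both sides so the comparison ignores case.
    wanted = tuple(ext.lower() for ext in extensions)
    return sum(
        1
        for _, _, files in os.walk(folder)
        for name in files
        if name.lower().endswith(wanted)
    )
```

For example, count_files_with_extensions(folder, ['.svg', '.png']) would count both kinds of image in one pass over the tree.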