Order a list of files by size via Python
Example dump from the listing of a directory:
hello:3.1 GB
world:1.2 MB
foo:956.2 KB
The above list is in the format of FILE:VALUE UNIT. How would one go about ordering each line above according to file size?
I thought perhaps to parse each line for the unit via the pattern ":VALUE UNIT" (or somehow use the delimiter), then run it through the ConvertAll engine, receive the size of each value in bytes, hash it with the rest of the line (filenames), then order the resulting dictionary pairs by size.
The trouble is, I have no idea about pattern matching. But I see that you can sort a dictionary.
If there is a better direction in which to solve this problem, please let me know.
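The parsing idea described above could be sketched with a regular expression roughly as follows. This is only an illustration, assuming every line has exactly the form FILE:VALUE UNIT and that the units are binary multiples; the names LINE_RE, UNIT_BYTES and size_in_bytes are made up for this example.

```python
import re

# Hypothetical pattern for "FILE:VALUE UNIT". The greedy (.*) pushes the
# split to the last colon, so a filename containing ':' still parses.
LINE_RE = re.compile(r'^(.*):([0-9.]+) ([A-Za-z]+)$')

# Assumed unit table (binary multiples).
UNIT_BYTES = {'KB': 2**10, 'MB': 2**20, 'GB': 2**30}

def size_in_bytes(line):
    """Return the size encoded in one 'FILE:VALUE UNIT' line, in bytes."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError('unrecognised line: %r' % line)
    name, value, unit = match.groups()
    return float(value) * UNIT_BYTES[unit]

lines = ['hello:3.1 GB', 'world:1.2 MB', 'foo:956.2 KB']

# Map each line to its size, then order the keys by that size.
sizes = dict((line, size_in_bytes(line)) for line in lines)
ordered = sorted(sizes, key=sizes.get)
print(ordered)
```

The dictionary is not strictly needed, since sorted(lines, key=size_in_bytes) gives the same order; it is kept here only to mirror the "hash, then sort" plan.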
EDIT:
The list that I had was actually in a file. Taking inspiration from the answer of the (awesome) Alex Martelli, I've written up the following code that extracts the lines from one file, orders them and writes them to another.
#!/usr/bin/env python
sourceFile = open("SOURCE_FILE_HERE", "r")
allLines = sourceFile.readlines()
sourceFile.close()
print("Reading the entire file into a list.")

cleanLines = []
for line in allLines:
    cleanLines.append(line.rstrip())

mult = dict(KB=2**10, MB=2**20, GB=2**30)

def getsize(aline):
    fn, size = aline.split(':', 1)
    value, unit = size.split(' ')
    multiplier = mult[unit]
    return float(value) * multiplier

print("Writing sorted list to file.")
cleanLines.sort(key=getsize)
writeLines = open("WRITE_OUT_FILE_HERE", 'a')
for line in cleanLines:
    writeLines.write(line + "\n")
writeLines.close()
thelines = ['hello:3.1 GB', 'world:1.2 MB', 'foo:956.2 KB']

mult = dict(KB=2**10, MB=2**20, GB=2**30)

def getsize(aline):
    fn, size = aline.split(':', 1)
    value, unit = size.split(' ')
    multiplier = mult[unit]
    return float(value) * multiplier

thelines.sort(key=getsize)
print(thelines)
emits ['foo:956.2 KB', 'world:1.2 MB', 'hello:3.1 GB'], as desired. You may have to add some entries to mult if KB, MB and GB don't exhaust your set of units of interest, of course.
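Extending mult might look something like the sketch below. It assumes binary multiples throughout, that a plain byte count is labelled B, and that unit case may vary in the input; the extra entries and the name getsize_ext are illustrative assumptions, not part of the code above.

```python
# Assumed extended unit table: binary multiples, 'B' for plain bytes.
mult = dict(B=1, KB=2**10, MB=2**20, GB=2**30, TB=2**40)

def getsize_ext(aline):
    # rsplit at the last ':' so a colon inside the filename is harmless
    fn, size = aline.rsplit(':', 1)
    value, unit = size.split()
    try:
        multiplier = mult[unit.upper()]
    except KeyError:
        # Fail loudly on a unit missing from the table
        raise ValueError('unknown unit %r in line %r' % (unit, aline))
    return float(value) * multiplier

thelines = ['big:1.5 TB', 'hello:3.1 GB', 'tiny:512 B', 'world:1.2 mb']
thelines.sort(key=getsize_ext)
print(thelines)
```

Raising on an unknown unit is a deliberate choice here: a silently mis-sorted line is harder to notice than an error naming the offending line.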