How do I sort a list by the nth element in Python 2.3?
This is a simple script I wrote:
#!/usr/bin/env python
file = open('readFile.txt', 'r')
lines = file.readlines()
file.close()
del file
sortedList = sorted(lines, key=lambda line: line.split('\t')[-2])
file = open('outfile.txt', 'w')
for line in sortedList:
    file.write(line)
file.close()
del file
to rewrite a file like this:
161788 group_monitor.sgmops 4530 1293840320 1293840152
161789 group_atlas.atlas053 22350 1293840262 1293840152
161790 group_alice.alice017 210 1293840254 1293840159
161791 group_lhcb.pltlhc15 108277 1293949235 1293840159
161792 group_atlas.sgmatlas 35349 1293840251 1293840160
(where the last two fields are epoch time) ordered by the next to last field to this:
161792 group_atlas.sgmatlas 35349 1293840251 1293840160
161790 group_alice.alice017 210 1293840254 1293840159
161789 group_atlas.atlas053 22350 1293840262 1293840152
161788 group_monitor.sgmops 4530 1293840320 1293840152
161791 group_lhcb.pltlhc15 108277 1293949235 1293840159
As you can see, I used sorted(), which was introduced in v2.4. How can I rewrite the script for v2.3 so that it does the same thing?
In addition, I want to convert the epoch time to the human-readable format, so the resultant file looks like this:
161792 group_atlas.sgmatlas 35349 01/01/11 00:04:11 01/01/11 00:02:40
161790 group_alice.alice017 210 01/01/11 00:04:14 01/01/11 00:02:39
161789 group_atlas.atlas053 22350 01/01/11 00:04:22 01/01/11 00:02:32
I know that strftime("%d/%m/%y %H:%M:%S", gmtime()) can be used to convert the epoch time, but I can't figure out how to apply it in the script to rewrite the file in that format.
Comments? Advice treasured!
@Mark: Update
In some cases, the epoch time comes as 3600, which indicates unfinished business. I wanted to print aborted instead of 01/01/1970 for such a line, so I changed format_seconds_since_epoch() like this:
def format_seconds_since_epoch(t):
    if t == 3600:
        return "aborted"
    else:
        return strftime("%d/%m/%y %H:%M:%S",datetime.fromtimestamp(t).timetuple())
which solved the problem. Is it the best that can be done in this regard? Cheers!!
file = open('readFile.txt', 'r')
lines = file.readlines()
file.close()
del file
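lines = [line.split('\t') for line in lines]
# Compare the next-to-last field numerically; list.sort() accepts a comparison function in 2.3.
lines.sort(lambda x, y: cmp(int(x[-2]), int(y[-2])))
lines = ['\t'.join(line) for line in lines]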
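An alternative to passing a comparison function is the decorate-sort-undecorate idiom that was common before key= existed. The following is only a minimal sketch, assuming tab-separated fields as in the question's script; the function name sort_lines_by_field and its index and sep parameters are made up for illustration. It avoids sorted() and key=, so it should also be usable on 2.3.

```python
def sort_lines_by_field(lines, index=-2, sep="\t"):
    """Return the lines ordered by the integer value of one field.

    Hypothetical helper: `index` and `sep` are illustrative parameters,
    defaulting to the next-to-last tab-separated field.
    """
    # Decorate: pair every line with its numeric key and original position.
    # The position breaks ties, so equal keys keep their input order.
    decorated = [(int(line.split(sep)[index]), pos, line)
                 for pos, line in enumerate(lines)]
    # Sort: tuples compare element by element, so the key decides first.
    decorated.sort()
    # Undecorate: keep only the original line text.
    return [line for (key, pos, line) in decorated]
```

Because each key is computed once per line rather than once per comparison, this usually does less work than a comparison function on large files.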
In reply to your final query, you can create a datetime object from a time_t-like "seconds since the epoch" value using datetime.fromtimestamp, e.g.
from datetime import datetime
from time import strftime
def format_seconds_since_epoch(t):
    return strftime("%d/%m/%y %H:%M:%S",datetime.fromtimestamp(t).timetuple())
print format_seconds_since_epoch(1293840160)
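One thing to be aware of: datetime.fromtimestamp converts to local time, whereas the sample output in the question looks like UTC and the question itself mentions gmtime(). If UTC is what you want, a sketch along these lines may be closer; the name format_epoch_utc is just an illustrative choice.

```python
from time import gmtime, strftime

def format_epoch_utc(t):
    """Format seconds since the epoch as dd/mm/yy HH:MM:SS in UTC.

    Hypothetical variant of format_seconds_since_epoch that does not
    depend on the machine's time zone.
    """
    # gmtime() turns the epoch value into a UTC time tuple for strftime().
    return strftime("%d/%m/%y %H:%M:%S", gmtime(t))
```

With this version, format_epoch_utc(1293840160) gives "01/01/11 00:02:40", which matches the last column of the first line in the desired output.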
So, putting that together with a slightly modified version of pynator's answer, your script might look like:
#!/usr/bin/env python
from datetime import datetime
from time import strftime
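def format_seconds_since_epoch(t):
    return strftime("%d/%m/%y %H:%M:%S",datetime.fromtimestamp(t).timetuple())
fin = open('readFile.txt', 'r')
lines = fin.readlines()
fin.close()
del fin
# Split each line into its tab-separated fields.
split_lines = [ line.split("\t") for line in lines ]
# Sort numerically on the next-to-last field.
split_lines.sort( lambda a, b: cmp(int(a[-2]),int(b[-2])) )
fout = open('outfile.txt', 'w')
for split_line in split_lines:
    # Replace the last two fields (epoch seconds) with formatted dates.
    for i in (-2,-1):
        split_line[i] = format_seconds_since_epoch(int(split_line[i]))
    # The last field lost its trailing newline when it was converted, so add one back.
    # A file opened in text mode translates "\n" to the platform's line ending.
    fout.write("\t".join(split_line)+"\n")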
fout.close()
del fout
Note that using file as a variable name is a bad idea, since it shadows the built-in file type, so I changed them to fin and fout. (Even though you are del-ing the variables afterwards, it's still good style to avoid the name file, I think.)
In reply to your further question about the special "3600" value, your solution is fine. Personally, I would probably keep the format_seconds_since_epoch function as it is, so that it doesn't have a surprising special case and is more generally useful. You could create an additional wrapper function with the special case, or just change the split_line[i] = format_seconds_since_epoch(int(split_line[i])) line to:
entry = int(split_line[i])
if entry == 3600:
    split_line[i] = "aborted"
else:
    split_line[i] = format_seconds_since_epoch(entry)
... however, I don't think there's much difference between the two approaches.
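For completeness, the wrapper-function option mentioned above might look roughly like this. It is only a sketch: the names format_entry and ABORTED_MARKER are invented here, and 3600 is taken from the question as the value that marks an unfinished entry.

```python
from datetime import datetime
from time import strftime

# Sentinel described in the question; the constant name is hypothetical.
ABORTED_MARKER = 3600

def format_seconds_since_epoch(t):
    """General-purpose formatter, kept free of special cases."""
    return strftime("%d/%m/%y %H:%M:%S", datetime.fromtimestamp(t).timetuple())

def format_entry(t):
    """Hypothetical wrapper: report the sentinel as text, format the rest."""
    if t == ABORTED_MARKER:
        return "aborted"
    return format_seconds_since_epoch(t)
```

The loop body would then become split_line[i] = format_entry(int(split_line[i])), and format_seconds_since_epoch stays reusable elsewhere.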
Incidentally, if this is more than a one-off task, I would see if you can use a later version of Python in the 2 series than 2.3, which is rather old now - they have lots of nice features that help one to write cleaner scripts.