
How to sort a list by the nth element in Python 2.3?

This is a simple script I wrote:

#!/usr/bin/env python

file = open('readFile.txt', 'r')
lines = file.readlines()
file.close()
del file

sortedList = sorted(lines, key=lambda lines: lines.split('\t')[-2])

file = open('outfile.txt', 'w')
for line in sortedList:
    file.write(line)

file.close()
del file

to rewrite a file like this:

161788  group_monitor.sgmops    4530    1293840320  1293840152
161789  group_atlas.atlas053    22350   1293840262  1293840152
161790  group_alice.alice017    210     1293840254  1293840159
161791  group_lhcb.pltlhc15     108277  1293949235  1293840159
161792  group_atlas.sgmatlas    35349   1293840251  1293840160

(where the last two fields are epoch times) so that it is ordered by the next-to-last field, like this:

161792  group_atlas.sgmatlas    35349   1293840251  1293840160
161790  group_alice.alice017    210     1293840254  1293840159
161789  group_atlas.atlas053    22350   1293840262  1293840152
161788  group_monitor.sgmops    4530    1293840320  1293840152
161791  group_lhcb.pltlhc15     108277  1293949235  1293840159

As you can see, I used sorted(), which was introduced in v2.4. How can I rewrite the script for v2.3 so that it does the same thing? In addition, I want to convert the epoch times to a human-readable format, so that the resulting file looks like this:

161792  group_atlas.sgmatlas    35349   01/01/11 00:04:11   01/01/11 00:02:40
161790  group_alice.alice017    210     01/01/11 00:04:14   01/01/11 00:02:39
161789  group_atlas.atlas053    22350   01/01/11 00:04:22   01/01/11 00:02:32

I know that strftime("%d/%m/%y %H:%M:%S", gmtime()) can be used to convert the epoch time, but I can't figure out how to apply it in the script so that the file is rewritten in that format.

Comments? Advice treasured!


@Mark: Update

In some cases the epoch time comes as 3600, which indicates unfinished business. I want to print aborted instead of 01/01/1970 for such a line, so I changed format_seconds_since_epoch() like this:

def format_seconds_since_epoch(t):
    if t == 3600:
        return "aborted"
    else:
        return strftime("%d/%m/%y %H:%M:%S",datetime.fromtimestamp(t).timetuple())

which solved the problem. Is it the best that can be done in this regard? Cheers!!


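file = open('readFile.txt', 'r')
lines = file.readlines()
file.close()
del file

# Split each line into its tab-separated fields.
lines = [line.split('\t') for line in lines]
# Python 2.3 has no sorted() and no key= argument, so pass a comparison
# function to list.sort() and compare the next-to-last field numerically.
lines.sort(lambda x, y: cmp(int(x[-2]), int(y[-2])))
lines = ['\t'.join(line) for line in lines]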
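A comparison function is not the only option in 2.3. Another common idiom is decorate-sort-undecorate: build tuples whose first element is the sort key, sort those with a plain list.sort(), then strip the key off again. The sketch below is only an illustration of that idiom, not part of the answer above; the helper name sort_by_field and the in-memory sample list are made up for the example, and the data is assumed to be tab-separated as in the question.

```python
# Decorate-sort-undecorate: needs neither sorted() nor key=.
# sort_by_field is a hypothetical helper name used for illustration.

def sort_by_field(lines, index):
    decorated = []
    for position in range(len(lines)):
        fields = lines[position].split("\t")
        # The original position breaks ties, so equal keys keep their order.
        decorated.append((int(fields[index]), position, lines[position]))
    decorated.sort()
    # Undecorate: keep only the original line.
    return [item[2] for item in decorated]

# Sample data copied from the question (assumed to be tab-separated).
sample = [
    "161788\tgroup_monitor.sgmops\t4530\t1293840320\t1293840152\n",
    "161789\tgroup_atlas.atlas053\t22350\t1293840262\t1293840152\n",
    "161790\tgroup_alice.alice017\t210\t1293840254\t1293840159\n",
    "161791\tgroup_lhcb.pltlhc15\t108277\t1293949235\t1293840159\n",
    "161792\tgroup_atlas.sgmatlas\t35349\t1293840251\t1293840160\n",
]

ordered = sort_by_field(sample, -2)
ordered_ids = [line.split("\t")[0] for line in ordered]
```

Because the key is computed once per line rather than once per comparison, this tends to be faster than a cmp function on large files.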


In reply to your final query, you can create a datetime object from a time_t-like "seconds since the epoch" value using datetime.fromtimestamp, e.g.

from datetime import datetime
from time import strftime

def format_seconds_since_epoch(t):
    return strftime("%d/%m/%y %H:%M:%S",datetime.fromtimestamp(t).timetuple())

print format_seconds_since_epoch(1293840160)
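Note that datetime.fromtimestamp converts to local time, so the result depends on the time zone of the machine. The sample output in the question corresponds to UTC, so if that is what you want, a variant built on time.gmtime (the function mentioned in the question) might look like the sketch below. The name format_seconds_since_epoch_utc is made up for this example.

```python
from time import gmtime, strftime

def format_seconds_since_epoch_utc(t):
    # gmtime() interprets t as seconds since the epoch in UTC,
    # so the result does not depend on the local time zone.
    return strftime("%d/%m/%y %H:%M:%S", gmtime(t))

formatted = format_seconds_since_epoch_utc(1293840160)
# formatted is "01/01/11 00:02:40"
```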

So, putting that together with a slightly modified version of pynator's answer, your script might look like:

#!/usr/bin/env python

from datetime import datetime
from time import strftime
import os

def format_seconds_since_epoch(t):
    return strftime("%d/%m/%y %H:%M:%S",datetime.fromtimestamp(t).timetuple())

fin = open('readFile.txt', 'r')
lines = fin.readlines()
fin.close()
del fin

split_lines = [ line.split("\t") for line in lines ]

split_lines.sort( lambda a, b: cmp(int(a[-2]),int(b[-2])) )

fout = open('outfile.txt', 'w')
for split_line in split_lines:
    for i in (-2,-1):
        split_line[i] = format_seconds_since_epoch(int(split_line[i]))
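    # The file is opened in text mode, so write "\n" and let Python
    # translate it; os.linesep would produce "\r\r\n" on Windows.
    fout.write("\t".join(split_line) + "\n")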

fout.close()
del fout

Note that using file as a variable name is a bad idea, since it shadows the built-in file type, so I changed the names to fin and fout. (Even though you delete the variables with del afterwards, I think it's still good style to avoid the name file.)

In reply to your further question about the special "3600" value, your solution is fine. Personally, I would probably keep the format_seconds_since_epoch function as it is, so that it doesn't have a surprising special case and is more generally useful. You could create an additional wrapper function with the special case, or just change the split_line[i] = format_seconds_since_epoch(int(split_line[i])) line to:

entry = int(split_line[i])
if entry == 3600:
    split_line[i] = "aborted"
else:
    split_line[i] = format_seconds_since_epoch(entry)

However, I don't think there's much difference between the two.
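For completeness, the wrapper-function option could be sketched roughly as follows. The names format_or_aborted and ABORTED_MARKER are hypothetical, chosen only for this example; the 3600 sentinel comes from your update.

```python
from datetime import datetime
from time import strftime

# Sentinel value that marks an unfinished entry (taken from the question's update).
ABORTED_MARKER = 3600

def format_seconds_since_epoch(t):
    # General-purpose formatter, unchanged: local time, dd/mm/yy HH:MM:SS.
    return strftime("%d/%m/%y %H:%M:%S", datetime.fromtimestamp(t).timetuple())

def format_or_aborted(t):
    # Hypothetical wrapper: the special case lives here, not in the formatter.
    if t == ABORTED_MARKER:
        return "aborted"
    return format_seconds_since_epoch(t)
```

In the main loop you would then call format_or_aborted(int(split_line[i])) in place of format_seconds_since_epoch(int(split_line[i])).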

Incidentally, if this is more than a one-off task, I would see if you can use a later version of Python in the 2 series than 2.3, which is rather old now - they have lots of nice features that help one to write cleaner scripts.
