开发者

How to quickly sort dict() with a huge number of keys?

TLE always happen in SBANK SPOJ 开发者_如何学编程using python. In order to solve it, i have to sort dict(), though dict() has huge number of KEYS(maximum--100000). Use sorted() function in my code takes no effect. Is there any fast solution? Thanks for your help.

My code below:

for j in range(n): # n is the number of keys
        account = sys.stdin.readline().rstrip()
        dic.setdefault(account, 0)
        dic[account] += 1
sorted(dic) # **this sort take a lot of time**

EDIT1:According to Justin Peel's tips, i update my code below, but return still TLE. How can i do?

import sys
import psyco # import psyco module to speed up
psyco.full()
nCase = int(sys.stdin.readline().split()[0])
for i in range(nCase):
    n = int(sys.stdin.readline().split()[0])
    dic = dict()
    lst = list()
    for j in range(n):
        account = sys.stdin.readline().rstrip()
        dic.setdefault(account, 0)
        dic[account] += 1
    sys.stdin.readline()
    lst = dic.keys() # store keys in list
    lst.sort()
    for account in lst:
        sys.stdout.write('%s %s\n' % (account, dic[account]))


dicts are not sorted, which is how they are able to provide O(1) insert and get access. (Internally, they are implemented as hash tables, I believe, though I'm not sure this is required by the Python spec).

If you want to iterate the keys of a dict in sorted order, you can use:

for key in sorted(the_dict.iterkeys()):
    value = the_dict[key]
    # do something

But, as you note, sorting 100,000 elements may take some time.

As an alternative, you can write (or find on the internet) sorted dict implementations that keep an ordered list of keys along with the dictionary, and support fast lookup by key, and iteration in order without having to sort all at once. Of course, to support sorted order, the keys will need to be sorted at insertion time, so inserts will not be O(1).

Edit: Per dsolimano's comment, if you are using Python 2.7 or Python 3.x, there is a built-in OrderedDict class that orders iteration in insertion order. This keeps insertion fast, but may not do what you need (depending on the order of items you want).


I was able to solve this problem. Here are some tips:

  1. Use Python 2.5. It is much faster than Python 3.2 which is the other option available on SPOJ with Python. Only one person has been able to get a fast enough solution using Python 3.2
  2. Just use a basic dict for counting. You can get by with the defaultdict from the collections module too, but the basic dict was faster for me.
  3. Sort only the keys of the dict, not the key-item pairs. Forming the key-item pairs takes far too long. Also, use keys = mydict.keys(); keys.sort() because it is the fastest way to go about it.
  4. Use psyco (pretty much always with SPOJ problems in Python)
  5. Learn the fastest ways to do input and output in Python. Hint: it isn't iterating over every line of input for example.
  6. Try submitting after you've added each part (getting input, counting, doing output) to see where you are with time. This is a very valuable thing to do on SPOJ. The SPOJ computer running your code is quite likely much slower than your current computer and it can be hard to determine based on your own computer's running time of the code if it will be fast enough for SPOJ.


Since Python 3.1 is available, collections.Counter is good for that purpose:

collections.Counter(map(str.rstrip, sys.stdin)).most_common()
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜