Python - Human sort of numbers with alpha numeric, but in pyQt and a __lt__ operator [duplicate]
I have data rows and wish to have them presented as follows:
1
1a
1a2
2
3
9
9.9
10
10a
11
100
100ab
ab
aB
AB
Since I am using PyQt and the code lives inside a TreeWidgetItem, the method I am trying to fix is:
def __lt__(self, otherItem):
    column = self.treeWidget().sortColumn()
    #return self.text(column).toLower() < otherItem.text(column).toLower()
    orig = str(self.text(column).toLower()).rjust(20, "0")
    other = str(otherItem.text(column).toLower()).rjust(20, "0")
    return orig < other
This may help you. Edit the regexp to match the digit patterns you're interested in. Mine treats any digit field containing a . as a float, and uses swapcase() to invert the case so that 'A' sorts after 'a'.
Updated: Refined:
import re

def _human_key(key):
    # Split into alternating text and number fields (numbers at odd indices).
    parts = re.split(r'(\d*\.\d+|\d+)', key)
    return tuple((e.swapcase() if i % 2 == 0 else float(e))
                 for i, e in enumerate(parts))

nums = ['9', 'aB', '1a2', '11', 'ab', '10', '2', '100ab', 'AB', '10a',
        '1', '1a', '100', '9.9', '3']
nums.sort(key=_human_key)
print('\n'.join(nums))
Output:
1
1a
1a2
2
3
9
9.9
10
10a
11
100
100ab
ab
aB
AB
Update (response to comment): If you have a class Foo and want to implement __lt__ using the _human_key sorting scheme, just return the result of _human_key(k1) < _human_key(k2):
class Foo(object):
    def __init__(self, key):
        self.key = key

    def __lt__(self, obj):
        return _human_key(self.key) < _human_key(obj.key)

>>> Foo('ab') < Foo('AB')
True
>>> Foo('AB') < Foo('AB')
False
So for your case, you'd do something like this:
def __lt__(self, other):
    column = self.treeWidget().sortColumn()
    k1 = self.text(column)
    k2 = other.text(column)
    return _human_key(k1) < _human_key(k2)
The other comparison operators (__eq__, __gt__, etc.) would be implemented in the same way.
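As a sketch of filling in the remaining operators without writing each one by hand, `functools.total_ordering` from the standard library can derive them once `__eq__` and `__lt__` exist. The `Foo` class mirrors the one above and `_human_key` is repeated so the snippet runs on its own; the `__hash__` method is an addition for illustration, not something the original class had.

```python
import re
from functools import total_ordering

def _human_key(key):
    parts = re.split(r'(\d*\.\d+|\d+)', key)
    return tuple(e.swapcase() if i % 2 == 0 else float(e)
                 for i, e in enumerate(parts))

@total_ordering
class Foo(object):
    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return _human_key(self.key) == _human_key(other.key)

    def __lt__(self, other):
        return _human_key(self.key) < _human_key(other.key)

    # Defining __eq__ disables the default hash on Python 3, so restore
    # one that is consistent with equality.
    def __hash__(self):
        return hash(_human_key(self.key))

print(Foo('10a') > Foo('9.9'))   # True
print(Foo('ab') <= Foo('AB'))    # True
```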
Using samplebias's swapcase idea and Ned Batchelder's human-sort code, you might do it this way:
import re

def human_keys(astr):
    '''
    alist.sort(key=human_keys) sorts in human order
    '''
    keys = []
    for elt in re.split(r'(\d+)', astr):
        elt = elt.swapcase()
        try:
            elt = int(elt)
        except ValueError:
            pass
        keys.append(elt)
    return keys
x='''
1
1a
1a2
2
3
9
9.9
9.10
9a2
10
10a
11
100
100ab
ab
aB
AB
'''.split()
print(x)
assert x == sorted(x,key=human_keys)
You could apply human_keys in __lt__ like this:
def __lt__(self, otherItem):
    column = self.treeWidget().sortColumn()
    # Pass the raw text: zero-padding or lowercasing it first would defeat
    # the numeric and case handling in human_keys.
    orig = str(self.text(column))
    other = str(otherItem.text(column))
    return human_keys(orig) < human_keys(other)
I don't understand your sort algorithm, so I can't tell you how to implement it. But there is a general technique, which is to use the key parameter of Python's built-in sorted function (or of the list.sort method). In other words, you want to come up with some transformation of your data which Python would sort in the correct order, then write that transformation as a Python function foo and call sorted(data, key=foo).
Example: if you had a list of strings of the form "<integer>-<integer>", say ["1-1","1-2","3-1"], and you wanted to sort by the second number and then the first, notice that Python would sort the data correctly if it were in the form [(1,1), (2,1), (1,3)], i.e. a list of reversed tuples. So you would write a function
def key(s):
    l, r = s.split("-")
    return int(r), int(l)
and then sort the list in place with data.sort(key=key), where data is your list, or get a sorted copy with sorted(data, key=key).
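Put together, the example might look like this under Python 3. The names `second_then_first` and `data` are chosen here purely for illustration.

```python
def second_then_first(s):
    # "3-1" -> (1, 3): compare on the second number, then on the first.
    left, right = s.split("-")
    return int(right), int(left)

data = ["1-1", "1-2", "3-1"]
print(sorted(data, key=second_then_first))   # ['1-1', '3-1', '1-2']
```

Python calls the key function once per element and then compares the resulting tuples, so the original strings are returned in the new order.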
Here's a function that, given a string with a mixture of alphabetical and numeric parts, returns a tuple that will sort in a "natural" way.
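def naturalkey(key, convert=int):
    if not key:
        return ()
    in_num = key[0].isdigit()
    # Start with an empty text field when the key opens with a digit, so text
    # and numbers always sit at the same tuple positions (Python 3 cannot
    # compare str with int).
    keys = [""] if in_num else []
    start = 0
    extra = ""
    for i, char in enumerate(key):
        if in_num:
            try:
                # Extend the current number by one character.
                last_num = convert(key[start:i + 1])
            except ValueError:
                # char ended the number: store it and go back to text mode.
                in_num = False
                if i > 0 and key[i - 1] == ".":
                    extra = "."  # a trailing "." was absorbed by convert
                keys.append(last_num)
                start = i
        if not in_num:  # this is NOT equivalent to `else`!
            if char.isdigit():
                keys.append(extra + key[start:i])
                in_num = True
                start = i
                extra = ""
                last_num = convert(char)
    keys.append(last_num if in_num else (extra + key[start:]))
    return tuple(keys)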
The basic approach it uses is, when it sees a digit, it gathers additional characters and keeps trying to convert the result to a number until it can't anymore (i.e. it gets an exception). By default it tries to convert runs of characters to an integer, but you can pass in convert=float to have it accept decimal points. (It won't accept scientific notation, unfortunately, since to get something like '1e3' it would first try to parse '1e', which is invalid. This, along with the + or - sign, could be special-cased, but it doesn't look like that is necessary for your use case.)
The function returns a tuple containing strings and numbers in the order they were found in the string, with the numbers parsed to the specified numeric type. For example:
>>> naturalkey("foobar2000.exe")
('foobar', 2000, '.exe')
This tuple can be used as a key for sorting a list of strings:
my_list.sort(key=lambda i: naturalkey(i, float))
Or you can use it to implement a comparison function:
def __lt__(self, other):
    return naturalkey(self.value, float) < naturalkey(other.value, float)
It would be better (faster) to generate the natural key in the object's __init__() method, store it in the instance, and write your comparison function(s) to use the stored value instead. If the value from which the key is derived is mutable, you could write a property that updates the key when the underlying value is updated.
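A minimal sketch of that caching idea follows. The `Item` class and its `_key` attribute are hypothetical, and `natural_key` is a short regex-based stand-in for the longer function above, used only so the snippet runs on its own.

```python
import re

def natural_key(text):
    # Stand-in key function: text fields at even positions,
    # numbers (as floats) at odd positions.
    parts = re.split(r'(\d*\.\d+|\d+)', text)
    return tuple(float(p) if i % 2 else p for i, p in enumerate(parts))

class Item(object):
    def __init__(self, value):
        self.value = value          # goes through the property setter

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        self._key = natural_key(new_value)   # recomputed only on change

    def __lt__(self, other):
        return self._key < other._key        # no parsing during comparison

items = [Item("file10"), Item("file9.5"), Item("file2")]
print([item.value for item in sorted(items)])
# ['file2', 'file9.5', 'file10']
```

A sort performs many comparisons per element, so parsing each string once when it is set rather than on every `__lt__` call is where the saving comes from.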