开发者

Grouping a series in Python

Title edit: capitalization fixed and 'for python' added.

Is there a better or more standard way to do what I'm describing? I want input like this:

[1, 1, 1, 0, 2, 2, 0, 2, 2, 0, 0, 3, 3, 0, 1, 1, 1, 1, 1, 2, 2, 2]

to be transformed to this:

[0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 2, 0]

or, even better, something like this (describing similar output differently, but now not limited to integers):

labels: [1, 2, 3, 1, 2]

positions(where 1 identified the first occupiable position, as per my matplotlib plot): [2, 7, 12.5, 17, 21]

The input data is categorical data that classified a plot - in the picture below, grouped plots share a categorical feature which I'd like to label only once for the group. I'll be using开发者_开发问答 2 axes for two different variables, but I think that's besides the point for now.

Note: This image does not reflect either set of sample data - it's just to get across the idea of grouping together categories. Group a should be labeled at x=5, since there's a blank space between the first two and second to vertical data groups, and 0 is the line on the right side.

Grouping a series in Python

Here's what I've got:

data = [1, 1, 1, 2, 2, 2, 2, 2, 3, 4, 3, 2, 2, 1, 1, 1, 1]
last = None
runs = []
labels = []
run = 1
for x in data:
    if x in (last, 0):
        run += 1
    else:
        runs.append(run)
        run = 1
        labels.append(x)
    last = x
runs.append(run)
runs.pop(0)
labels.append(x)
tick_positions = [0]
last_run = 1
for run in runs:
    tick_positions.append(run/2.0+last_run/2.0+tick_positions[-1])
    last_run = run
tick_positions.pop(0)
print tick_positions


To get the labels you can use itertools groupby:

>>> import itertools
>>> numbers = [1, 1, 1, 0, 2, 2, 0, 2, 2, 0, 0, 3, 3, 0, 1, 1, 1, 1, 1, 2, 2, 2]
>>> list(k for k, g in itertools.groupby(numbers))
[1, 0, 2, 0, 2, 0, 3, 0, 1, 2]

And to remove the zeros you can use a comprehension:

>>> list(k for k, g in itertools.groupby(x for x in numbers if x != 0))
[1, 2, 3, 1, 2]

If you want to get the positions too, then you'll have to iterate through the list yourself as you are already doing. groupby doesn't keep track of that for you.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜