Grouping a series in Python
Title edit: capitalization fixed and 'for python' added.
Is there a better or more standard way to do what I'm describing? I want input like this:
[1, 1, 1, 0, 2, 2, 0, 2, 2, 0, 0, 3, 3, 0, 1, 1, 1, 1, 1, 2, 2, 2]
to be transformed to this:
[0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0, 2, 0]
or, even better, something like this (describing similar output differently, but now not limited to integers):
labels: [1, 2, 3, 1, 2]
positions (where 1 identifies the first occupiable position, as per my matplotlib plot): [2, 7, 12.5, 17, 21]
The input data is categorical data that classifies a plot. In the picture below, grouped plots share a categorical feature which I'd like to label only once for the group. I'll be using 2 axes for two different variables, but I think that's beside the point for now.
Note: This image does not reflect either set of sample data; it's just to get across the idea of grouping together categories. Group a should be labeled at x=5, since there's a blank space between the first two and second two vertical data groups, and 0 is the line on the right side.

Here's what I've got:
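data = [1, 1, 1, 2, 2, 2, 2, 2, 3, 4, 3, 2, 2, 1, 1, 1, 1]
last = None
runs = []
labels = []
run = 1
for x in data:
    if x in (last, 0):
        # Same label as the current group (or a 0 gap): extend the run.
        run += 1
    else:
        # A new label starts: store the finished run length and reset.
        runs.append(run)
        run = 1
        labels.append(x)
        last = x
runs.append(run)
# Drop the placeholder run recorded before the first label was seen.
runs.pop(0)
tick_positions = [0]
last_run = 1
for run in runs:
    # Each tick sits half the previous run plus half this run further along.
    tick_positions.append(run/2.0+last_run/2.0+tick_positions[-1])
    last_run = run
tick_positions.pop(0)
print(tick_positions)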
To get the labels you can use itertools groupby:
>>> import itertools
>>> numbers = [1, 1, 1, 0, 2, 2, 0, 2, 2, 0, 0, 3, 3, 0, 1, 1, 1, 1, 1, 2, 2, 2]
>>> list(k for k, g in itertools.groupby(numbers))
[1, 0, 2, 0, 2, 0, 3, 0, 1, 2]
And to remove the zeros you can filter them out with a generator expression before grouping:
>>> list(k for k, g in itertools.groupby(x for x in numbers if x != 0))
[1, 2, 3, 1, 2]
If you want to get the positions too, then you'll have to iterate through the list yourself as you are already doing. groupby doesn't keep track of that for you.
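One possible way to get the positions as well is to pair each value with its index before grouping, so that each group carries its own slots. The following is only a sketch under a few assumptions: the helper name group_ticks and its gap parameter are made up for illustration, slots are numbered from 1 as in the question, and a group's position is taken to be the midpoint of its first and last non-zero slot.

```python
from itertools import groupby


def group_ticks(data, gap=0):
    """Return (labels, positions) for runs of equal values, skipping gap entries.

    Hypothetical helper, not part of itertools.
    """
    # Pair each value with its 1-based slot and drop the gap markers first, so
    # runs of the same label separated only by gaps merge into one group.
    indexed = [(pos, x) for pos, x in enumerate(data, start=1) if x != gap]
    labels, positions = [], []
    for label, group in groupby(indexed, key=lambda pair: pair[1]):
        slots = [pos for pos, _ in group]
        labels.append(label)
        # Assumed definition of a tick position: midpoint of the first and
        # last occupied slot of the group.
        positions.append((slots[0] + slots[-1]) / 2.0)
    return labels, positions


numbers = [1, 1, 1, 0, 2, 2, 0, 2, 2, 0, 0, 3, 3, 0, 1, 1, 1, 1, 1, 2, 2, 2]
labels, positions = group_ticks(numbers)
print(labels)     # [1, 2, 3, 1, 2]
print(positions)  # [2.0, 7.0, 12.5, 17.0, 21.0]
```

For the sample input this reproduces the labels and positions asked for in the question. If the positions should instead span trailing zeros, or be counted from 0, the midpoint line and the start argument of enumerate are the places to adjust.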