Can I count on order being preserved in a Python tuple?

2023-01-28 21:15 Asked by:

I've got a list of datetimes from which I want to construct time segments. In other words, turn [t0, t1, ... tn] into [(t0,t1),(t1,t2),...,(tn-1, tn)]. I've done it this way:

# start by sorting list of datetimes
mdtimes.sort()
# construct tuples which represent possible start and end dates

# left edges
dtg0 = [x for x in mdtimes]
dtg0.pop()

# right edges
dtg1 = [x for x in mdtimes]
dtg1.reverse()
dtg1.pop()
dtg1.sort()

dtsegs = zip(dtg0,dtg1)

Questions...

1. Can I count on tn-1 < tn for any (tn-1,tn) after I've created them this way? (Is ordering preserved?)
2. Is it good practice to copy the original mdtimes list with list comprehensions? If not, how should it be done?
3. The purpose for constructing these tuples is to iterate over them and segment a data set with tn-1 and tn. Is this a reasonable approach? i.e.
```
datasegment = [x for x in bigdata if ( (x['datetime'] > tleft) and (x['datetime'] < tright))] 
```

Thanks

1. A tuple keeps its values in the order you inserted them; nothing gets re-sorted, if that is what you're asking. zip likewise retains the order of the values you pass in.
2. It's an acceptable method, but I have two alternative suggestions: use the copy module, or use dtg1 = mdtimes[:].
3. Sounds reasonable.
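
To make the two copying suggestions concrete, here is a minimal sketch. The integer values are stand-ins I made up for the datetimes in the question; both forms produce a shallow copy, so changing the copy leaves the original list alone.

```python
import copy

mdtimes = [3, 1, 2]  # hypothetical stand-in values; the question uses datetimes

by_slice = mdtimes[:]            # copy via slicing
by_module = copy.copy(mdtimes)   # shallow copy via the copy module

# Mutate the copies only
by_slice.sort()
by_module.reverse()

print(mdtimes)    # the original list is unchanged
print(by_slice)
print(by_module)
```

Note that these are shallow copies: the new lists are independent, but the elements themselves are shared. For immutable elements such as datetimes that makes no practical difference.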

Both list and tuple are ordered.

dtg0, dtg1 = itertools.tee(mdtimes)
next(dtg0)
dtsegs = zip(dtg0, dtg1)

You can achieve the same with zip, by pairing the list with itself shifted by one:

>>> l = ["t0", "t1", "t2", "t3", "t4", "t5", "t6"]
>>> list(zip(l, l[1:]))
[('t0', 't1'), ('t1', 't2'), ('t2', 't3'), ('t3', 't4'), ('t4', 't5'), ('t5', 't6')]

Instead of dtg0 = [x for x in mdtimes], dtg0 = mdtimes[:] would do, since you are just copying one list into another. Note: starting with Python 3.3, you can also write newlist = oldlist.copy().

As for order, zip's order is well defined, and both lists and tuples are ordered collections, so you should have no problem here.
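
A quick way to convince yourself is to build the segments from a sorted list and check every pair. This is only a sketch with made-up dates, and it assumes Python 3, where zip returns an iterator. One caveat: if the list contains duplicate datetimes, a pair can have left == right, so strictly tn-1 < tn only holds when all values are distinct.

```python
from datetime import datetime

# Hypothetical sample data, deliberately out of order
mdtimes = [datetime(2023, 1, 3), datetime(2023, 1, 1), datetime(2023, 1, 2)]
mdtimes.sort()

# Pair each element with its successor; zip stops at the shorter input
dtsegs = list(zip(mdtimes, mdtimes[1:]))

# Every tuple keeps (earlier, later) because the input was sorted
all_ordered = all(left <= right for left, right in dtsegs)
print(all_ordered)
```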

Turning (x1, x2, x3, ...) into [(x1, x2), (x2, x3), ...] is called a pairwise combination, and it's so common a pattern that the itertools documentation provides a recipe:

from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one item
    return zip(a, b)  # use itertools.izip on Python 2

for ta, tb in pairwise(mdtimes):
    ...
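
On Python 3.10 and later, itertools ships pairwise directly, so the recipe is only needed as a fallback. A small sketch that works either way (the sample strings are placeholders, not data from the question):

```python
try:
    from itertools import pairwise  # available from Python 3.10
except ImportError:
    from itertools import tee

    def pairwise(iterable):
        """Fallback for older Pythons: s -> (s0,s1), (s1,s2), ..."""
        a, b = tee(iterable)
        next(b, None)  # shift the second iterator forward by one
        return zip(a, b)

segments = list(pairwise(["t0", "t1", "t2", "t3"]))
print(segments)

# An empty or single-item input simply yields no pairs
print(list(pairwise(["t0"])))
```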

This is an answer to the question "Is this a reasonable approach?" (which appears to have been ignored by all).

Summary: You may want/need to lift your gaze from making a pairwise thingy out of mdtimes to the encompassing problem (segmenting bigdata).

Detail:

The desired use of the result is expressed as:

datasegment = [x for x in bigdata if ( (x['datetime'] > tleft) and (x['datetime'] < tright))]

which is better expressed as:

datasegment = [x for x in bigdata if tleft < x['datetime'] < tright]

Note that as that stands, it will not include any cases where the timestamp is exactly equal to one of the boundary points, so let's change it to:

datasegment = [x for x in bigdata if tleft <= x['datetime'] < tright]

But that's going to appear in a loop:

for tleft, tright in dtsegs:
    datasegment = [x for x in bigdata if tleft <= x['datetime'] < tright]
    do_something_with(datasegment)

Whoops! That's going to take time proportional to len(bigdata) * len(dtsegs) ... what are likely values of len(bigdata) and len(dtsegs)?

If bigdata is sorted, what you want to do can be done in time proportional to N, where N = len(bigdata). If bigdata is not already sorted, it can be sorted in time proportional to N * log(N).
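
As one possible illustration (not necessarily what I would end up recommending for your data), here is a sketch that cuts an already-sorted bigdata at each boundary using bisect. The function name segment_sorted and the integer timestamps are made up for the example. It is not a pure single pass: it costs one pass to extract the keys plus a binary search per boundary, roughly N + M * log(N) for M boundaries, which is still far better than N * M.

```python
from bisect import bisect_left

def segment_sorted(records, boundaries, key):
    """Split records (already sorted by key) into half-open segments
    [b0, b1), [b1, b2), ... given sorted boundaries."""
    keys = [key(r) for r in records]
    # For each boundary, the index of the first record with key >= boundary
    cuts = [bisect_left(keys, b) for b in boundaries]
    # Consecutive cut points delimit one segment each
    return [records[lo:hi] for lo, hi in zip(cuts, cuts[1:])]

# Hypothetical data: integers stand in for datetimes
bigdata = [{'datetime': t} for t in [0, 1, 2, 3, 5, 8, 9]]
mdtimes = [1, 3, 8]

segments = segment_sorted(bigdata, mdtimes, key=lambda x: x['datetime'])
for seg in segments:
    print([x['datetime'] for x in seg])
```

Records before the first boundary or at/after the last one fall outside every segment, matching the tleft <= x['datetime'] < tright filter above.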

You might like to ask another question ...

It's also worth pointing out that any items in bigdata that have a timestamp < min(mdtimes) or >= max(mdtimes) will not be included in any data segment ... is this intentional?

I'm no expert, but aren't you quadrupling your memory requirements by copying the list and then making a new list of pairs taken from two lists? Why not just do the following:

dtsegs = [(mdtimes[i], mdtimes[i+1]) for i in range(len(mdtimes)-1)]

Dunno how "Pythonic" that is, though.

EDIT: Actually, looking at what you need to do with this list of tuples, you could just do this [i] and [i+1] stuff directly at that level and not even create this new structure at all. I don't know how many dates you're dealing with, though - if it's some small number I suppose it doesn't really matter.

For what it's worth, a couple of the other answerers here seem to be misunderstanding your question, though I can't comment on their posts since I don't have enough reputation yet :) Ignacio Vazquez-Abrams's solution seems the best to me, though his "next(dtg0)" should probably be "next(dtg1)" (?), since advancing dtg0 would make zip yield (t1, t0), (t2, t1), ... with each pair reversed.
