Splitting list into multiple lists to gain a speedup?

2023-03-27 02:16 问答作者：

Let's say my list is about 1,000,000 entries long. To acess an item, the time would be O(500,000), which seems very long to me.

What happens when I split the list up into multiple lists ? Let's look at an example:

Splitting the list into 10 parts, I'd have a list as follows:

splitted_list = [
    [list with 100,000 entries],
    [list with 100,000 entries],
    [list with 100,000 entries],
    [list with 100,000 entries],
    [list with 100,000 entries],
    [list with 100,000 entries],
    [list with 100,000 entries],
    [list with 100,000 entries],
    [list with 100,000 entries],
    [list with 100,000 entries]
]

The time to acess an item would be O(5) + O(50,000) = O(50,005) and gives a speedup about 1000% !

When spl开发者_如何学JAVAitting the original list about it's root, which is 1000 in this case, this would give us a list containg 1000 lists having another 1000 entries.

splitted_list = [
    [list with 1000 entries],
    [list with 1000 entries],
    [list with 1000 entries],
    [list with 1000 entries],
    ...
]

Now take a look at the time to acess an item:

O(500) + O(500) = O(1000)
O(1000) < O(50,005) < O(500,000)

This is the optimal speed up about 1000 times ! Incredible to belive, I think, so my question:

Does this apply in practice, too, or is this just theory ?

Get item by index from list is O(1) whatever the size of the list.

The answer to your question is that you are thinking of linked lists, in which each element holds a pointer to the next. These have O(n) indexing, since the only way to get the nth element is to walk through the list from the beginning.

Your idea is related to various data structures, of which the closest is probably a skip list. This is a data structure based on a linked list but with "highways" of nodes that skip ahead more than one element of the list. The advantage is that you can run down the highway to get to the middle of the list, and then drop down to the "slower lanes" when you need individual-element precision, giving O(log n) indexing efficiency -- the same as a binary tree. Of course, the downside is that it's more complicated (and slower) to perform other linked-list operations such as random inserts.

Python lists, however, are implemented under-the-hood as dynamically growing arrays. These have O(1) indexing, since to get the third element you can simply add three (units) to the memory address of the first element, without having to traverse all elements in between.

You may be interested in the Wikipedia article on data structures.

I am assuming that you are talking about finding an element in the lists.

If you are talking about splitting a sorted list into many sorted lists with pointers to their heads, congratulations, you have almost discovered the B-tree.

If those lists are really arrays (i.e. you have constant-time random access), you can also do a binary search.

继续阅读：list nested-lists performance python

Splitting list into multiple lists to gain a speedup?

Does this apply in practice, too, or is this just theory ?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

Does this apply in practice, too, or is this just theory ?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？