开发者

Trees: Linked Lists vs Arrays (Efficiency)

This is an assignment question that I am having trouble wording an answer to.

"Suppose a tree may have up to k children per node. Let v be the average number of children per node. For what value(s) of v is it more efficient (in terms of space used) to store the child nodes in a linked list versus storage in an array? Why?"

I believe I can answer the "why?" more or less in plain English -- it will be more efficient to use the linked list because rather than having a bunch of empty nodes (ie empty indexes in the array if your average is lower than the max) taking up memory you 开发者_运维问答only alloc space for a node in a linked list when you're actually filling in a value.

So if you've got an average of 6 children when your maximum is 200, the array will be creating space for all 200 children of each node when the tree is created, but the linked list will only alloc space for the nodes as needed. So, with the linked list, space used will be approximately(?) the average; with the array, spaced used will be the max.

...I don't see when it would ever be more efficient to use the array. Is this a trick question? Do I have to take into account the fact that the array needs to have a limit on total number of nodes when it's created?


For many commonly used languages, the array will require allocating storage k memory addresses (of the data). A singly-linked list will require 2 addresses per node (data & next). A doubly-linked list would require 3 addresses per node.

Let n be the actual number of children of a particular node A:

  • The array uses k memory addresses
  • The singly-linked list uses 2n addresses
  • The doubly-linked list uses 3n addresses

The value k allows you to determine if 2n or 3n addresses will average to a gain or loss compared to simply storing the addresses in an array.


...I don't see when it would ever be more efficient to use the array. Is this a trick question?

It’s not a trick question. Think of the memory overhead that a linked list has. How is a linked list implemented (vs. an array)?

Also (though this is beyond the scope of the question!), space consumption isn’t the only deciding factor in practice. Caching plays an important role in modern CPUs and storing the individual child nodes in an array instead of a linked list can improve the cache locality (and consequently the tree’s performance) drastically.


Arrays must pre-allocate space but we can use them to access very fast any entry.
Lists allocate memory whenever they create a new node, and that isn't ideal because
memory allocation costs on cpu time.

Tip: you can allocate the whole array at once, if you want to, but usually we allocate,
lets say, 4 entries and we resize it by doubling the size when we need more space.


I can imagine it could be a very good idea and very efficient in many scenarios to use a LinkedList if the data item one uses has intrinsic logic for a previous and next item anyway, for example a TimeTableItem or anything that is somehow time-related. These should implement an interface, so the LinkedList implementation can leverage that and doesn't have to wrap the items into its own node objects. Inserting and removing would be much more efficient here than using a List implementation which internally juggles arrays around.


You're assuming the array can't be dynamically re-allocated. If it can, the array wins, because it need be no bigger than k items (plus a constant overhead), and because it doesn't need per-item storage for pointers.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜