开发者

Why don't F# lists have a tail pointer

Or phrased another way, what kind of benefits do you get from having a basic, singly linked list with only a head pointer? The benefits of a tail pointer that I can see are:

  • O(1) list concatenation
  • O(1) Appending stuff to the right side of the list

Both of which are rather convenient things to have, as opposed to O(n) list concatenation (where n is the length of the left-side list?).开发者_JAVA技巧 What advantages does dropping the tail pointer have?


F#, like many other functional[-ish] languages, has a cons-list (the terminology originally comes from LISP, but the concept is the same). In F# the :: operator (or List.Cons) is used for cons'ing: note the signature is ‘a –> ‘a list –> ‘a list (see Mastering F# Lists).

Do not confuse a cons-list with an opaque Linked List implementation which contains a discrete first[/last] node - every cell in a cons-list is the start of a [different] list! That is, a "list" is simply the chain of cells that starts at a given cons-cell.

This offers some advantages when used in a functional-like manner: one is that all the "tail" cells are shared and since each cons-cell is immutable (the "data" might be mutable, but that's a different issue) there is no way to make a change to a "tail" cell and flub up all the other lists which contain said cell.

Because of this property, [new] lists can be efficiently built - that is, they do not require a copy - simply by cons'ing to the front. In addition, it is also very efficient to deconstruct a list to head :: tail - once again, no copy - which is often very useful in recursive functions.

This immutable property generally does not exist in a [standard mutable] Double Linked List implementation in that appending would add side-effects: the internal 'last' node (the type is now opaque) and one of the "tail" cells are changed. (There are immutable sequence types that allow an "effectively constant time" append/update such as immutable.Vector in Scala -- however, these are heavy-weight objects compared to a cons-list that is nothing more than a series of cells cons'ed together.)

As mentioned, there are also disadvantages a cons-list is not appropriate for all tasks - in particular, creating a new list except by cons'ing to the head is an O(n) operation, fsvo n, and for better (or worse) the list is immutable.

I would recommend creating your own version of concat to see how this operation is really done. (The article Why I love F#: Lists - The Basics covers this.)

Happy coding.


Also see related post: Why can you only prepend to lists in functional languages?


F# lists are immutable, there's no such thing as "append/concat", rather there's just creating new lists (that may reuse some suffixes of old lists). Immutability has many advantages, outside the scope of this question. (All pure languages, and most functional languages have this data structure, it is not an F#-ism.)

See also

http://diditwith.net/2008/03/03/WhyILoveFListsTheBasics.aspx

which has good picture diagrams to explain things (e.g. why consing on the front is cheaper than at the end for an immutable list).


In addition to what the others said: if you need efficient, but yet immutable data structures (which should be an idiomatic F# way), you have to consider reading Chris Okasaki, Purely Functional Data Structures. There is also a thesis available (on which the book is based).


In addition to what has been already said, the Introducing Functional Programming section on MSDN has an article about Working with Functional Lists that explains how lists work and also implements them in C#, so it may be a good way to understand how they work (and why adding reference to the last element would not allow efficient implementation of append).

If you need to append things to the end of the list, as well as to the front, then you need a different data structure. For example, Norman Ramsey posted source code for DList which has these properties here (The implementation is not idiomatic F#, but it should be easy to fix).


If you find you want a list with better performance for append operations, have a look at the QueueList in the F# PowerPack and the JoinList in the FSharpx extension libraries.

QueueList encapsulates two lists. When you prepend using the cons, it prepends an element to the first list, just as a cons-list. However, if you want to append a single element, it can be pushed to the top of the second list. When the first list runs out of elements, List.rev is run on the second list, and the two are swapped putting your list back in order and freeing the second list to append new elements.

JoinList uses a discriminated union to more efficiently append whole lists and is a bit more involved.

Both are obviously less performant for standard cons-list operations but offer better performance for other scenarios.

You can read more about these structures in the article Refactoring Pattern Matching.


As others have pointed out, an F# list could be represented by a data structure:

List<T> { T Value; List<T> Tail; }

From here, the convention is that a list goes from the List you have a reference to until Tail is null. Based on that definition, the benefits/features/limitations in the other answers come naturally.

However, your original question seems to be why the list is not defined more like:

List<T> { Node<T> Head; Node<T> Tail; }
Node<T> { T Value; Node<T> Next; }

Such a structure would allow both appending and prepending to the list without any visible effects to the a reference to the original list, since it still only sees a "window" of the now expanded list. Although this would (sometimes) allow O(1) concatenation, there are several issues such a feature would face:

  • The concatenation only works once. This can lead to unexpected performance behavior where one concatenation is O(1), but the next is O(n). Say for example:

     listA = makeList1 ()
     listB = makeList2 ()
     listC = makeList3 ()
     listD = listA + listB //modified Node at tail of A for O(1)
     listE = listA + listC //must now make copy of A to concat with C
    

    You could argue that the time savings for the cases where possible are worth it, but the surprise of not knowing when it will be O(1) and when O(n) are strong arguments against the feature.

  • All lists now take up twice as much space, even if you never plan to concatenate them.
  • You now have a separate List and Node type. In the current implementation, I believe F# only uses a single type like the beginning of my answer. There may be a way to do what you are suggesting with only one type, but it is not obvious to me.
  • The concatenation requires mutating the original "tail" node instance. While this shouldn't affect programs, it is a point of mutation, which most functional languages tend to avoid.


Or phrased another way, what kind of benefits do you get from having a basic, singly linked list with only a head pointer? The benefits of a tail pointer that I can see are:

  • O(1) list concatenation
  • O(1) Appending stuff to the right side of the list

Both of which are rather convenient things to have, as opposed to O(n) list concatenation (where n is the length of the left-side list?).

If by "tail pointer" you mean a pointer from every list to the last element in the list, that alone cannot be used to provide either of the benefits you cite. Although you could then get at the last element in the list quickly, you cannot do anything with it because it is immutable.

You could write a mutable doubly-linked list as you say but the mutability would make programs using it significantly harder to reason about because every function you call with one might change it.

As Brian said, there are purely functional catenable lists. However, they are many times slower at common operations than the simple singly-linked list that F# uses.

What advantages does dropping the tail pointer have?

30% less space usage and better performance on virtually all list operations.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜