开发者

What is the benefit of purely functional data structure?

There are large number of texts on data structures, and libraries of data structures code开发者_JS百科. I understand that purely functional data structure is easier to reason about. However I have trouble to understand the real world advantage of using purely functional data structure in pragmatic code (using functional programming language or not) over the imperative counterpart. Can somebody provide some real world cases where purely functional data structure has advantage and why?

Examples along the line like I use data_structure_name in programming_language to do application because it can do certain_thing.

Thanks.

PS: What I mean by purely functional data structure is not the same as persistent data structure. Persistent data structure is a data structure that doesn't change?? On other hand purely functional data structure is a data structure that operates purely.


Purely functional (aka persistent or immutable) data structures give you several advantages:

  • you never have to lock them, which extremely improves concurrency.
  • they can share structure, which reduces memory usage. For example, consider list [1, 2, 3, 4] in Haskell and some imperative language like Java. To produce new list in Haskell you only have to create new cons (pair of value and reference-to-next-element) and connect it to the previous list. In Java you have to create completely new list not to damage the previous one.
  • you can make persistent data structures lazy.
  • also, if you use functional style, you can avoid thinking of time and sequence of operations, and so, make your programs more declarative.
  • fact, that the data structure is immutable, allows you to make some more assumptions and so expand capabilities of language. For example, Clojure uses the fact of immutability to correctly provide implementations of hashCode() method on each object, so any object may be used as a key in a map.
  • with immutable data and functional style you can also freely use memoization.

There's much more advantages, in general, it is another way of modeling the real world. This and some other chapters from SICP will give you more accurate view of programming with immutable structures, its advantages and disadvantages.


In addition to shared memory safety most purely function data structures also give you persistence, and practically for free. For example, let's say I have a set in OCaml, and I want to add some new values to it I can do this:

module CharSet = Set.Make(Char)
let a = List.fold_right CharSet.add ['a';'b';'c';'d'] CharSet.empty in
let b = List.fold_right CharSet.add ['e';'f';'g';'h'] a in
...

a remains unmodified after adding the new characters (it only contains a-d), while b contains a-h, and they share some of the same memory (with a set it's kind of tricky to tell how much memory is shared since it's an AVL tree and the shape of the tree changes). I can continue doing this, keeping track of all the changes I've made to the tree allowing me to go back to a previous state.

Here's a great diagram from the Wikipedia article on Purely Functional that shows the results of insert the character 'e' into the binary tree xs:

What is the benefit of purely functional data structure?


Erlang programs use purely functional data structures almost exclusively, and they reap substantial benefits by scaling almost seamlessly to multiple cores. Because shared data (mainly binaries and bit strings) is never modified, there is never a need to lock such data.


Purely functional data structures have the following advantages:

  • Persistence: old versions can be reused safe in the knowledge that they cannot have been changed.

  • Sharing: many versions of a data structure can be kept simultaneously with only modest memory requirements.

  • Thread safety: any mutation is hidden inside the lazy thunks (if any) and, therefore, handled by the language implementation.

  • Simplicity: not having to keep track of state changes makes purely functional data structures simpler to use, particularly in the context of concurrency.

  • Incrementality: purely functional data structures are composed of many tiny parts, making them ideal for incremental garbage collection leading to lower latencies.

Note that I have not listed parallelism as an advantage of purely functional data structures because I do not believe this to be the case. Efficient multicore parallelism requires predictable locality in order to leverage caches and avoid getting bottlenecked on shared access to main memory and purely functional data structures have, at best, unknown characteristics in this regard. Consequently, many programs that use purely functional data structures do not scale well when parallelized on a multicore because they spend all of their time in cache misses, contending for shared memory pathways.

What I mean by purely functional data structure is not the same as persistent data structure.

There is some confusion here. In the context of purely functional data structures, persistence is a term used to refer to the ability to refer back to previous versions of a data structure safe in the knowledge that they are still valid. This is a natural result of being purely functional and, therefore, persistence is an inherent characteristic of all purely functional data structures.


Take this little snippet of F#:

let numbers = [1; 2; 3; 4; 5]

You can say with 100% certainty that that is a immutable list of integers 1 through 5. You can pass around a reference to that list and never have to worry that the list may have been modified. That is enough reason for me to use it.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜