Exact real arithmetic and lazy list performance in C++/Haskell
I recently came across the subject of exact real arithmetic after reading this paper and this paper.
I have found a number of papers that discuss realizations of exact arithmetic using signed digit streams. The use of infinite streams for arbitrary precision leads to nice practical implementations in functional languages, like Haskell, using lazy lists. However, the papers that discuss such implementations in functional languages seem to come to the conclusion that performance is very poor.
Now, I appreciate that exact, non-hardware implementations will generally have relatively poor perf开发者_Go百科ormance compared to the standard floating-point representation, but I am interested in providing a more efficient implementation in an imperative language (specifically, C++) and a collection of operations/functions (arithmetic operations, trigonometric functions, exp, log, etc.).
My question(s): is there something inherently slow about a signed digit/lazy stream representation that causes the bad performance, or is it Haskell? What is it that makes it slow? Would it be possible to implement a signed digit stream representation using lazy streams in C++ that achieves (significantly) better performance than its Haskell counterpart, or is this an exercise in futility? Perhaps reconstructing as iterations?
I know there exists two C++ libraries, RealLib and iRRAM, that achieve efficient real number computation. However, these seem to use interval arithmetic, representing real numbers as shrinking nested intervals, which doesn't seem as 'pure' an approach as infinite streams (please correct me if you disagree!). But perhaps these are the only approaches that achieve good efficiency?
Any input is appreciated!
is there something inherently slow about a signed digit/lazy stream representation that causes the bad performance, or is it Haskell? What is it that makes it slow? Would it be possible to implement a signed digit stream representation using lazy streams in C++ that achieves (significantly) better performance than its Haskell counterpart, or is this an exercise in futility? Perhaps reconstructing as iterations?
Lazy streams are represented most efficiently as data with delayed function components. That's the representation the rather efficient Haskell implementation in GHC uses, and one you'll need to use in C++ anyway. There's no special "fast" version of laziness you could write in C++, that hasn't already been tried in 20 years of Haskell compiler research, and more going back to Algol.
For more details on the research on how to most efficiently implement lazy data structures, see Good introductory text about GHC implementation?
Now, given the lack of information about the benchmark, there's a couple of possible answers:
- the code was poorly written (and a better implementation might not be slow)
- representing large digits as tagged lazy bits is space inefficient, leading to all sorts of hardware bottlenecks
- lazy stream representation of numbers admit only certain, linear algorithms. A denser representation may admit algorithms with better complexity, or absolute performance.
My guess is the latter two points. The C++ version of laziness will only be hard work to get to the same point GHC already is at, so why not play with the code from the article, and see if you can make it faster.
I fear the "numbers are a lazy stream of digits" approach is doomed to be less efficient than a more direct approach, even if the digits are in a large base (says 2^64 or more):
lazy evaluation means that at the end, what you are thinking of as number are in fact the DAG representing the computation leading to it. Asking for one more digit may re-trigger computation in every node of this DAG. At the end of the day, you spend far more time in house keeping than in computation.
you probably can't use the more sophisticated algorithms for basic computation. For instance, an FFT approach to multiplication is obviously out of question.
that doesn't mean the algorithms will be simple. Think about how to handle a current result of 99999 in an addition. You need to be prepared to ripple a carry for all those 9. Now try to think about how to do a multiplication. A language with lazy expression may help expressing it, but then you will hit get harder by the first problem; I would be happy to learn that compilers are able to optimize for it, but I fear they aren't.
精彩评论