How to do list concatenation "the correct" way (using tail-recursion)
I'm working on the following Erlang exercise:
Write a function that, given a list of lists, will concatenate them. Example:
concatenate([[1,2,3], [], [4,five]]) ⇒ [1,2,3,4,five].
And I came up with this:
concatenate([]) ->
[];
concatenate([[H]|Tail]) ->
[H|concatenate(Tail)];
concatenate([[]|Tail]) ->
concatenate(Tail);
concatenate([[H|T]|Tail]) ->
[H|concatenate([T|Tail])].
Which works, but I noticed that I'm doing this [T|Tail]
thing.
First question
Is this still considered direc开发者_如何学Got recursion?
After that, I got rid of that [T|Tail]
and used an accumulator instead (like below).
Second question
Is now the second code considered tail recursion?
The book hints to the use of a helper function (which the second code is doing), but it seems quite verbose. Is it because I'm missing something?
concatenate([]) ->
[];
concatenate([[H]|Tail]) ->
[H|concatenate(Tail)];
concatenate([[]|Tail]) ->
concatenate(Tail);
concatenate([[H|T]|Tail]) ->
[H|concatenate(T,Tail)].
concatenate([],Tail) ->
concatenate(Tail);
concatenate([H],Tail) ->
[H|concatenate(Tail)];
concatenate([H|T],Tail) ->
[H|concatenate(T,Tail)].
As @Yasir explained neither are tail-recursive, but I wouldn't worry that much about it (see below). Using a helper function can improve the code by eliminating the partial rebuilding of the input list. Your code is a little verbose and can be simplified by removing some unnecessary clauses in conc/1
and always calling conc/2
instead:
conc([H|T]) -> conc(H, T);
conc([]) -> [].
conc([H|T], Rest) -> [H|conc(T, Rest)];
conc([], Rest) -> conc(rest).
Splitting the tail-recursive version with an accumulator would then become:
conc(List) -> conc(List, []).
conc([H|T], Acc) -> conc(H, T, Acc);
conc([], Acc) -> lists:reverse(Acc).
conc([H|T], Rest, Acc) -> conc(T, Rest, [H|Acc]);
conc([], Rest, Acc) -> conc(Rest, Acc).
Now, the difference in speed is much less these days than it used to be, see Myth: Tail-recursive functions are MUCH faster than recursive functions, so it is best to use the style which seems clearer. I personally prefer not to use an accumulator unless I have to.
Both [H|concatenate([T|Tail])]
and [H|concatenate(T,Tail)]
are not tail-recursive calls, because either calls are a part of another expression, and thus control would be returned to the expression, which includes a call to your concatenate/1,2
.
The proper tail-recursive conc
could look something like:
-module(concat).
-export([concatenate/1]).
concatenate(L) ->
conc(L, []).
conc([], Acc) ->
lists:reverse(Acc);
conc([[H|T] | L1], Acc)->
conc([T|L1], [H|Acc]);
conc([[] | L1], Acc) ->
conc(L1, Acc).
Here, in conc/2
, the call to itself is the last operation in the function body, and the function will never return.
EDIT: If we forget about optimizations of non-tail-recursive calls, @Robert mentioned, for the moment, we could end up with memory overflow due to return adresses of calling functions passed onto the stack (heap?). This could happen if you called your non-tail-recursive function passing it a list with a considerable length on a system with insufficient memory size to hold such amount of return addresses.
I've also implemented list concatenation in Continuation-passing style:
-module(concat).
-export([concat_cps/2]).
concat_cps([], F) ->
F([]);
concat_cps([H|T], F) ->
concat_cps_1(H, T, F).
concat_cps_1([H|T], Rest, F) ->
concat_cps_1(T, Rest, fun(X) -> F([H|X]) end);
concat_cps_1([], Rest, F) ->
concat_cps(Rest, F).
So, if someone's had enough of just recursion, (s)he could boost the flow control with Closures and Continuations as shown above.
Test:
1> concat:concat_cps([[1,2,3], [], [4,5]], fun(X) -> X end).
[1,2,3,4,5]
The second argument to concat_cps/2
is the continuation, which, when concat_cps/2
is done, accepts the result of the latter. The control to the concat_cps/2
is then never returned.
In the example above we just used the identity morphism, though we could pass in any other valid continuation, which accepts a flat list, e. g.:
2> concat:concat_cps([[1,2,3], [], [4,5]], fun(X) -> length(X) end).
5
The obvious solution is to use the standard library if the functionality is there, which in this case it is as lists:append/1. The specifics of the implementation might change over the years, but today's version is very simple (and not tail-recursive):
%% append(L) appends the list of lists L
-spec append([[T]]) -> [T].
append([E]) -> E;
append([H|T]) -> H ++ append(T);
append([]) -> [].
精彩评论