开发者

Recursively sort non-contiguous list to list of contiguous lists

I've been trying to learn a bit of functional programming (with Haskell & Erlang) lately and I'm always amazed at the succinct solutions people can come up with when they can think recursively and know the tools.

I want a function to convert a list of sorted, unique, non-contiguous integers into a list of contiguous lists, i.e:

[1,2,3,6,7,8,10,11]

to:

开发者_如何学Go
[[1,2,3], [6,7,8], [10,11]

This was the best I could come up with in Haskell (two functions)::

make_ranges :: [[Int]] -> [Int] -> [[Int]]
make_ranges ranges [] = ranges
make_ranges [] (x:xs)
    | null xs = [[x]]
    | otherwise = make_ranges [[x]] xs
make_ranges ranges (x:xs)
    | (last (last ranges)) + 1 == x =
        make_ranges ((init ranges) ++ [(last ranges ++ [x])]) xs
    | otherwise = make_ranges (ranges ++ [[x]]) xs

rangify :: [Int] -> [[Int]]
rangify lst = make_ranges [] lst    

It might be a bit subjective but I'd be interested to see a better, more elegant, solution to this in either Erlang or Haskell (other functional languages too but I might not understand it.) Otherwise, points for just fixing my crappy beginner's Haskell style!


Most straightforward way in my mind is a foldr:

ranges = foldr step []
    where step x [] = [[x]]
          step x acc@((y:ys):zs) | y == x + 1 = (x:y:ys):zs
                                 | otherwise  = [x]:acc

Or, more concisely:

ranges = foldr step []
    where step x ((y:ys):zs) | y == x + 1 = (x:y:ys):zs
          step x acc = [x]:acc

But wait, there's more!

abstractRanges f = foldr step []
    where step x ((y:ys):zs) | f x y = (x:y:ys):zs
          step x acc = [x]:acc

ranges = abstractRanges (\x y -> y == x + 1)
powerRanges = abstractRanges (\x y -> y == x*x) -- mighty morphin

By turning the guard function into a parameter, you can group more interesting things than just +1 sequences.

*Main> powerRanges [1,1,1,2,4,16,3,9,81,5,25]
[[1,1,1],[2,4,16],[3,9,81],[5,25]]

The utility of this particular function is questionable...but fun!


I can't believe I got the shortest solution. I know this is no code golf, but I think it is still quite readable:

import GHC.Exts
range xs = map (map fst) $ groupWith snd $ zipWith (\a b -> (a, a-b)) xs [0..]

or pointfree

range = map (map snd) . groupWith fst . zipWith (\a b -> (b-a, b)) [0..]

BTW, groupWith snd can be replaced with groupBy (\a b -> snd a == snd b) if you prefer Data.List over GHC.Exts

[Edit]

BTW: Is there a nicer way to get rid of the lambda (\a b -> (b-a, b)) than (curry $ (,) <$> ((-) <$> snd <*> fst) <*> snd) ?

[Edit 2]

Yeah, I forgot (,) is a functor. So here is the obfuscated version:

range = map (map fst) . groupWith snd . (flip $ zipWith $ curry $ fmap <$> (-).fst <*> id) [0..]

Suggestions are welcome...


import Data.List (groupBy)

ranges xs = (map.map) snd 
          . groupBy (const fst) 
          . zip (True : zipWith ((==) . succ) xs (tail xs))
          $ xs

As to how to come up with such a thing: I started with the zipWith f xs (tail xs), which is a common idiom when you want to do something on consecutive elements of a list. Likewise is zipping up a list with information about the list, and then acting (groupBy) upon it. The rest is plumbing.

Then, of course, you can feed it through @pl and get:

import Data.List (groupBy)
import Control.Monad (ap)
import Control.Monad.Instances()

ranges = (((map.map) snd) 
           . groupBy (const fst)) 
         .) =<< zip 
         . (True:) 
         . ((zipWith ((==) . succ)) `ap` tail)

, which, by my authoritative definition, is evil due to Mondad ((->) a). Twice, even. The data flow is meandering too much to lay it out in any sensible way. zipaptail is an Aztec god, and Aztec gods aren't to be messed with.


Another version in Erlang:

part(List) -> part(List,[]).
part([H1,H2|T],Acc) when H1 =:= H2 - 1 -> 
    part([H2|T],[H1|Acc]);
part([H1|T],Acc) ->
    [lists:reverse([H1|Acc]) | part(T,[])];
part([],Acc) -> Acc.


k z = map (fst <$>) . groupBy (const snd) . 
        zip z . (False:) . (zipWith ((==) . succ) <*> tail) $ z


Try reusing standard functions.

import Data.List (groupBy)

rangeify :: (Num a) => [a] -> [[a]]
rangeify l = map (map fst) $ groupBy (const snd) $ zip l contigPoints
  where contigPoints = False : zipWith (==) (map (+1) l) (drop 1 l)

Or, following (mixed) advice to use unfoldr, stop abusing groupBy, and be happy using partial functions when it doesn't matter:

import Control.Arrow ((***))
import Data.List (unfoldr)

spanContig :: (Num a) => [a] -> [[a]]
spanContig l =
    map fst *** map fst $ span (\(a, b) -> a == b + 1) $ zip l (head l - 1 : l)

rangeify :: (Num a) => [a] -> [[a]]
rangeify = unfoldr $ \l -> if null l then Nothing else Just $ spanContig l


Erlang using foldr:

ranges(List) ->
    lists:foldr(fun (X, [[Y | Ys], Acc]) when Y == X + 1 ->
                        [[X, Y | Ys], Acc];
                    (X, Acc) ->
                        [[X] | Acc]
                end, [], List).


This is my v0.1 and I can probably make it better:

makeCont :: [Int] -> [[Int]]
makeCont [] = []
makeCont [a] = [[a]]
makeCont (a:b:xs) = if b - a == 1
                        then (a : head next) : tail next
                        else [a] : next
                    where
                        next :: [[Int]]
                        next = makeCont (b:xs)

And I will try and make it better. Edits coming I think.


As a comparison, here's an implementation in Erlang:

partition(L)  -> [lists:reverse(T) || T <- lists:reverse(partition(L, {[], []}))].

partition([E|L], {R, [EL|_] = T}) when E == EL + 1 -> partition(L, {R, [E|T]});
partition([E|L], {R, []})                          -> partition(L, {R, [E]});
partition([E|L], {R, T})                           -> partition(L, {[T|R], [E]});
partition([],    {R, []})                          -> R;
partition([],    {R, T})                           -> [T|R].


The standard paramorphism recursion scheme isn't in Haskell's Data.List module, though I think it should be. Here's a solution using a paramorphism, because you are building a list-of-lists from a list, the cons-ing is a little tricksy:

contig :: (Eq a, Num a) => [a] -> [[a]]
contig = para phi [] where
  phi x ((y:_),(a:acc)) | x + 1 == y = (x:a):acc
  phi x (_,       acc)               = [x]:acc

Paramorphism is general recursion or a fold with lookahead:

para :: (a -> ([a], b) -> b) -> b -> [a] -> b
para phi b []     = b
para phi b (x:xs) = phi x (xs, para phi b xs)


It can be pretty clear and simple in the Erlang:

partition([]) -> [];
partition([A|T]) -> partition(T, [A]).

partition([A|T], [B|_]=R) when A =:= B+1 -> partition(T, [A|R]);
partition(L, P) -> [lists:reverse(P)|partition(L)].

Edit: Just for curiosity I have compared mine and Lukas's version and mine seems about 10% faster either in native either in bytecode version on testing set what I generated by lists:usort([random:uniform(1000000)||_<-lists:seq(1,1000000)]) on R14B01 64b version at mine notebook. (Testing set is 669462 long and has been partitioned to 232451 sublists.)

Edit2: Another test data lists:usort([random:uniform(1000000)||_<-lists:seq(1,10000000)]), length 999963 and 38 partitions makes bigger diference in native code. Mine version finish in less than half of time. Bytecode version is only about 20% faster.

Edit3: Some microoptimizations which provides additional performance but leads to more ugly and less maintainable code:

part4([]) -> [];
part4([A|T]) -> part4(T, A, []).

part4([A|T], B, R) when A =:= B+1 -> part4(T, A, [B|R]);
part4([A|T], B, []) -> [[B]|part4(T, A, [])];
part4([A|T], B, R) -> [lists:reverse(R, [B])|part4(T, A, [])];
part4([], B, R) -> [lists:reverse(R,[B])].


Here's an attempt from a haskell noob

ranges ls = let (a, r) = foldl (\(r, a@(h:t)) e -> if h + 1 == e then (r, e:a) else (a:r, [e])) ([], [head ls]) (tail ls)
            in           reverse . map reverse $ r : a
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜