开发者

Split seq in F#

I should split seq<a> into seq<seq<a>> by an attribute of the elements. If this attribute equals by a given value it must be 'splitted' at that point. How can I do that in FSharp?

It should be nice to pass a 'function开发者_如何学编程' to it that returns a bool if must be splitted at that item or no.

Sample: Input sequence: seq: {1,2,3,4,1,5,6,7,1,9} It should be splitted at every items when it equals 1, so the result should be:

seq
{
seq{1,2,3,4}
seq{1,5,6,7}
seq{1,9}
}


All you're really doing is grouping--creating a new group each time a value is encountered.

let splitBy f input =
  let i = ref 0
  input 
  |> Seq.map  (fun x -> 
    if f x then incr i
    !i, x)
  |> Seq.groupBy fst
  |> Seq.map (fun (_, b) -> Seq.map snd b)

Example

let items = seq [1;2;3;4;1;5;6;7;1;9]
items |> splitBy ((=) 1)

Again, shorter, with Stephen's nice improvements:

let splitBy f input =
  let i = ref 0
  input
  |> Seq.groupBy (fun x ->
    if f x then incr i
    !i)
  |> Seq.map snd


Unfortunately, writing functions that work with sequences (the seq<'T> type) is a bit difficult. They do not nicely work with functional concepts like pattern matching on lists. Instead, you have to use the GetEnumerator method and the resulting IEnumerator<'T> type. This often makes the code quite imperative. In this case, I'd write the following:

let splitUsing special (input:seq<_>) = seq { 
  use en = input.GetEnumerator()
  let finished = ref false
  let start = ref true
  let rec taking () = seq {
    if not (en.MoveNext()) then finished := true
    elif en.Current = special then start := true
    else 
      yield en.Current
      yield! taking() }

  yield taking()
  while not (!finished) do
    yield Seq.concat [ Seq.singleton special; taking()] }

I wouldn't recommend using the functional style (e.g. using Seq.skip and Seq.head), because this is quite inefficient - it creates a chain of sequences that take value from other sequence and just return it (so there is usually O(N^2) complexity).

Alternatively, you could write this using a computation builder for working with IEnumerator<'T>, but that's not standard. You can find it here, if you want to play with it.


The following is an impure implementation but yields immutable sequences lazily:

let unflatten f s = seq {
    let buffer = ResizeArray()

    let flush() = seq { 
        if buffer.Count > 0 then 
            yield Seq.readonly (buffer.ToArray())
            buffer.Clear() }

    for item in s do
        if f item then yield! flush()
        buffer.Add(item)

    yield! flush() }

f is the function used to test whether an element should be a split point:

[1;2;3;4;1;5;6;7;1;9] |> unflatten (fun item -> item = 1)


Probably no the most efficient solution, but this works:

let takeAndSkipWhile f s = Seq.takeWhile f s, Seq.skipWhile f s

let takeAndSkipUntil f = takeAndSkipWhile (f >> not)

let rec splitOn f s =
    if Seq.isEmpty s then
        Seq.empty
    else
        let pre, post =
            if f (Seq.head s) then
                takeAndSkipUntil f (Seq.skip 1 s)
                |> fun (a, b) ->
                    Seq.append [Seq.head s] a, b
            else
                takeAndSkipUntil f s
        if Seq.isEmpty pre then
            Seq.singleton post
        else
            Seq.append [pre] (splitOn f post)

splitOn ((=) 1) [1;2;3;4;1;5;6;7;1;9] // int list is compatible with seq<int>

The type of splitOn is ('a -> bool) -> seq<'a> -> seq>. I haven't tested it on many inputs, but it seems to work.


In case you are looking for something which actually works like split as an string split (i.e the item is not included on which the predicate returns true) the below is what I came up with.. tried to be as functional as possible :)

let fromEnum (input : 'a IEnumerator) = 
    seq {
        while input.MoveNext() do
            yield input.Current
    }

let getMore (input : 'a IEnumerator) = 
    if input.MoveNext() = false then None
    else Some ((input |> fromEnum) |> Seq.append [input.Current])

let splitBy (f : 'a -> bool) (input : 'a seq)  = 
    use s = input.GetEnumerator()
    let rec loop (acc : 'a seq seq) = 
        match s |> getMore with 
        | None -> acc
        | Some x ->[x |> Seq.takeWhile (f >> not) |> Seq.toList |> List.toSeq]
                   |> Seq.append acc
                   |> loop
    loop Seq.empty |> Seq.filter (Seq.isEmpty >> not)

seq [1;2;3;4;1;5;6;7;1;9;5;5;1]
|> splitBy ( (=) 1) |> printfn "%A"
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜