How do I intersect two lists in OCaml?
When I have two lists in OCaml, for example
e1 = [3; 4; 5; 6; 7]
and
e2 = [1; 3; 5; 7; 9]
Is there an efficient way to get the intersection of those two lists? I.e.:
[3; 5; 7]
Because I don't like scanning every element in list e2 for every element in list e1, thus cr开发者_如何学编程eating a big Oh of order n^2.
As Franck and Rémi said, converting your lists to sets (from stdlib module Set) costs n log(n), and then Sets provides a linear implementation of intersection. Franck also mentioned the equivalent alternative to sort the lists, and then traverse them in a synchronized way. These are roughly the same (and by the way, in both cases you need to be able to provide a total ordering on the elements in your lists).
If intersections are an important part of your algorithm and you want them to be faster in the case of two sets of elements that are only slightly different, you need to switch to a mergeable structure such as Patricia trees. See files pt*
in http://www.lri.fr/~filliatr/ftp/ocaml/ds/ .
If you need intersection to be fast in all cases, you have the possibility to use hash-consed Patricia trees. Hash-consing helps to recognize structurally identical sub-trees, and help to build efficient caches for previous operations by making comparison cheap.
Patricia trees can not use an arbitrary type as key (typically they are presented with ints as keys). But you can sometimes work around this limitation by numbering at creation each value you intend to use as a key.
My OCaml isn't the best, but I hacked this function together which will intersect sorted lists:
let rec intersect l1 l2 =
match l1 with [] -> []
| h1::t1 -> (
match l2 with [] -> []
| h2::t2 when h1 < h2 -> intersect t1 l2
| h2::t2 when h1 > h2 -> intersect l1 t2
| h2::t2 -> (
match intersect t1 t2 with [] -> [h1]
| h3::t3 as l when h3 = h1 -> l
| h3::t3 as l -> h1::l
)
);;
which should run in O(n+m) time. Basically it checks the first element of each list. If they are equal, it stores the result of the recursive call to their tails, and then checks to see if the head of the stored result is equal to the heads of the lists. If it isn't, it inserts it, otherwise it's a duplicate and it ignores it.
If they aren't equal, it just advances whichever one is smaller.
I don't know OCaml (syntax-wise), but generally you can do this in two ways:
If your language has support for a Set-datastructure, then convert both lists into Sets and use the set-intersection operation.
More generally: Sort both lists, then scan the sorted lists, which makes finding the duplicates much more efficient. You take n log(n) for sorting and can find the duplicates in linear time then.
As @Frank suggested, you can use sets to solve this problem, although it is not the best answer ever, but here is a short code listing demonstrating how this could be achieved in OCaml :
module Int_set = Set.Make (struct
type t = int
let compare = compare
end);;
(* iters through a list to construct a set*)
let set_of_list = List.fold_left (fun acc x -> Int_set.add x acc) Int_set.empty;;
let e1 = [3; 4; 5; 6; 7];;
let e2 = [1; 3; 5; 7; 9];;
let s1 = set_of_list e1;;
let s2 = set_of_list e2;;
(*result*)
let s3 = Int_set.inter s1 s2;;
(*testing output*)
Int_set.iter (fun elt -> print_int elt;print_string "\n") s3;;
The output is :
3
5
7
- : unit = ()
If your lists only contains integers of a limited size, there is also a solution in O(n):
1.) Create an array of booleans of the size of you largest integer value plus 1 in your original lists (e.g. in your example '9+1'); set all fields to false;
let m = Array.create 10 false
-> [|false; false; false; false; false; false; false; false; false; false|]
2.) Iterate over the first list: For every element you encounter, set the boolean with the respective offset to 'true'; in your example this would yield
List.iter (fun x -> m.(x) <- true) e1
-> [|false; false; false; true; true; true; true; true; false; false|]
3.) Filter over the second list, keeping only those elements of which the corresponding field in the array is true
List.filter (fun x -> m.(x) = true) e2
-> [3; 5; 7]
I do not think my solution is O(n), but it is very short, and might be interesting for people who are not limited by complexity constraints (As this answer is one of the first search results for "Ocaml intersect list")
This function returns true when the intersection is not empty. In other words, it checks if two lists share elements, if yes, true, else false.
let intersect l1 l2 =
List.fold_left (fun acc x -> if (List.exists (fun y -> y = x) l1) then true else acc) false l2;;
Very similarly, this function returns the actual intersection.
let intersect l1 l2 =
List.fold_left (fun acc x -> if (List.exists (fun y -> y = x) l1) then x::acc else acc) [] l2;;
Feel free to correct me if my soluton is not correct. It should be polymorph, as x and y can be any type that can be compared.
As a previous poster said, this is one of the first search results for "intersect lists in OCaml," so I'm posting a simple answer here that doesn't address concerns about complexity.
# let e1 = [3; 4; 5; 6; 7];;
# let e2 = [1; 3; 5; 7; 9];;
# List.filter (fun x -> List.mem x e1) e2;;
- : int list = [3; 5; 7]
#
精彩评论