开发者

Scala: build a Map from a List of tuples, but fail if there are contradictory entries

I think this might be a common operation. So maybe it's inside th开发者_如何学Pythone API but I can't find it. Also I'm interested in an efficient functional/simple solution if not.

Given a sequence of tuples ("a" -> 1, "b" ->2, "c" -> 3) I want to turn it into a map. That's easy using TraversableOnce.toMap. But I want to fail this construction if the resulting map "would contain a contradiction", i.e. different values assigned to the same key. Like in the sequence ("a" -> 1, "a" -> 2). But duplicates shall be allowed.

Currently I have this (very imperative) code:

def buildMap[A,B](in: TraversableOnce[(A,B)]): Option[Map[A,B]] = {
  val map = new HashMap[A,B]
  val it = in.toIterator
  var fail = false
  while(it.hasNext){
    val next = it.next()
    val old = map.put(next._1, next._2)
    fail = old.isDefined && old.get != next._2
  }

  if(fail) None else Some(map.toMap)
}

Side Question

Is the final toMap really necessary? I get a type error when omitting it, but I think it should work. The implementation of toMap constructs a new map which I want to avoid.


As always when working with Seq[A] the optimal solution performance-wise depends on the concrete collection type. A general but not very efficient solution would be to fold over an Option[Map[A,B]]:

def optMap[A,B](in: Iterable[(A,B)]): Option[Map[A,B]] = 
  in.iterator.foldLeft(Option(Map[A,B]())) {
    case (Some(m),e @ (k,v)) if m.getOrElse(k, v) == v => Some(m + e)
    case _ => None
  }

If you restrict yourself to using List[A,B]s an optimized version would be:

@tailrec
def rmap[A,B](in: List[(A,B)], out: Map[A,B] = Map[A,B]()): Option[Map[A,B]] = in match {
  case (e @ (k,v)) :: tail if out.getOrElse(k,v) == v =>
    rmap(tail, out + e)
  case Nil =>
    Some(out)
  case _ => None
}

Additionally a less idiomatic version using mutable maps could be implemented like this:

def mmap[A,B](in: Iterable[(A,B)]): Option[Map[A,B]] = {
  val dest = collection.mutable.Map[A,B]()

  for (e @ (k,v) <- in) {
    if (dest.getOrElse(k, v) != v) return None
    dest += e
  }

  Some(dest.toMap)
}


Here is a fail-slowly solution (if creating the entire map and then discarding it is okay):

def uniqueMap[A,B](s: Seq[(A,B)]) = {
  val m = s.toMap
  if (m.size == s.length) Some(s) else None
}

Here is a mutable fail-fast solution (bail out as soon as the error is detected):

def uniqueMap[A,B](s: Seq[(A,B)]) = {
  val h = new collection.mutable.HashMap[A,B]
  val i = s.iterator.takeWhile(x => !(h contains x._1)).foreach(h += _)
  if (h.size == s.length) Some(h) else None
}

And here's an immutable fail-fast solution:

def uniqueMap[A,B](s: Seq[(A,B)]) = {
  def mapUniquely(i: Iterator[(A,B)], m: Map[A,B]): Option[Map[A,B]] = {
    if (i.hasNext) {
      val j = i.next
      if (m contains j._1) None
      else mapUniquely(i, m + j)
    }
    else Some(m)
  }
  mapUniquely(s.iterator, Map[A,B]())
}

Edit: and here's a solution using put for speed (hopefully):

def uniqueMap[A,B](s: Seq[(A,B)]) = {
  val h = new collection.mutable.HashMap[A,B]
  val okay = s.iterator.forall(x => {
    val y = (h put (x._1,x._2))
    y.isEmpty || y.get == x._2 
  })
  if (okay) Some(h) else None
}

Edit: now tested, and it's ~2x as fast on input that works (returns true) than Moritz' or my straightforward solution.


Scala 2.9 is near, so why not to take advantage of the combinations method (inspired by Moritz's answer):

def optMap[A,B](in: List[(A,B)]) = {
  if (in.combinations(2).exists { 
    case List((a,b),(c,d)) => a == c && b != d
    case _ => false 
  }) None else Some(in.toMap)
}

scala> val in = List(1->1,2->3,3->4,4->5,2->3)
in: List[(Int, Int)] = List((1,1), (2,3), (3,4), (4,5), (2,3))

scala> optMap(in)
res29: Option[scala.collection.immutable.Map[Int,Int]] = Some(Map(1 -> 1, 2 -> 3, 3 -> 4, 4 -> 5))

scala> val in = List(1->1,2->3,3->4,4->5,2->3,1->2)
in: List[(Int, Int)] = List((1,1), (2,3), (3,4), (4,5), (2,3), (1,2))

scala> optMap(in)
res30: Option[scala.collection.immutable.Map[Int,Int]] = None


You can also use gourpBy as follows:

  val pList = List(1 -> "a", 1 -> "b", 2 -> "c", 3 -> "d")

  def optMap[A,B](in: Iterable[(A,B)]): Option[Map[A,B]] = {
    Option(in.groupBy(_._1).map{case(_, list) => if(list.size > 1) return None else list.head})
  }

  println(optMap(pList))

It's efficiency is competitive to the above solutions. In fact if you examine the gourpBy implementation you will see that it is very similar to some of the solutions suggested.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜