Filter List in Functional Programming Style
We have a list of Strings with B开发者_JAVA百科EGIN and END marks as parts of this list. Can we filter out elements in-between of BEGIN-END in functional programming style? I came out only with this regular (flag) approach in scala.
val list1 =
"""992
1010
1005
1112
BEGIN
1086
1244
1107
1121
END
1223
1312
1319
1306
1469""".lines.toList
var flag = false
val filteredList = list1.filter{
def f(x: String): Boolean = {
if (x.contains("BEGIN")) {
flag = true;
return false
} else if (x.contains("END")) {
flag = false
}
flag
}
f
}
Is that possible to avoid defining the flag variable? How do they solve this in pure functional languages?
You can use drop
/tail
, dropWhile
, takeWhile
functions:
val filteredList = list1.map(_.trim).dropWhile("BEGIN" !=).tail.takeWhile("END" !=)
EDIT
As mentioned in comments tail
will throw exception if list is empty, so if you prefer to stay on the safe side use drop(1)
instead of tail
:
val filteredList = list1.map(_.trim).dropWhile("BEGIN" !=).drop(1).takeWhile("END" !=)
And here is my version of algorithm that handles several BEGIN and END sections (some crazy stuff from me - a little state machine :)
var filteredList1 = list1.map(_.trim).foldLeft(List(None): List[Option[List[String]]]) {
case (None :: rest, "BEGIN") => Some(Nil) :: rest
case (Some(list) :: rest, "END") => None :: Some(list) :: rest
case (Some(current) :: rest, num) => Some(num :: current) :: rest
case (result, _) => result
}.flatten.reverse map (_.reverse)
it returns List[List[String]]
To start with, each string in your list contains the spaces from the start of the line.
This is the biggest problem in your code, and there's two ways to fix it.
Either trim the lines...
val list1 =
"""992
1010
...
1306
1469""".lines.map(_.trim).toList
... or you can precede each line with a |
and use stripMargin
.
Then it's just a small matter of applying takeWhile
/dropWhile
list1.takeWhile("BEGIN" !=) ++ list1.dropWhile("END"!=).tail
or more efficiently:
val (begin,middle) = list1.span("BEGIN" !=)
val end = middle.dropWhile("END" !=).tail
begin ++ end
EDIT
I had the solution back to front, that would drop (filter out) values between BEGIN
and END
. To retain them:
list1.dropWhile("BEGIN" !=).tail.takeWhile("END"!=)
EDIT 2
Rising to the challenge here... I'll allow for multiple BEGIN/END blocks, but also consider that the input might be badly malformed. What if there was a BEGIN without a corresponding END? Perhaps there are two BEGINs in a row, or the list runs out before there's an END.
Defining some rules:
- An END without a corresponding BEGIN is ignored
- BEGIN/END blocks don't nest
- A BEGIN encountered while already in a block starts a new block
- If the list runs out while in a block, then an implicit END is assumed
Without further ado, first create an iterator that identifies every "BEGIN"
in the input:
val blocksStarts =
Iterator.iterate(list1)(_.dropWhile("BEGIN" !=).drop(1)).drop(1).takeWhile(Nil !=)
//This iterator tries to continue forever,
//returning Nils once the sequences are exhausted
//For this reason, we must use drop(1) instead of tail
Giving an iterator of lists, each starting at a "BEGIN"
To then take elements from each of these lists until the corresponding "END"
is reached, or another "BEGIN"
, or the list is exhausted:
val blocks = blockStarts map {
_.takeWhile(x => x != "BEGIN" && x != "END")
} toList
The final toList
is because it's still an Iterator
at that point. You now have a List of Lists, each corresponding to a batch of elements in a "Block", as defined by the previous rules.
I'm extending others' answers a little to present a case where there are two BEGIN
...END
blocks in the list.
val list1 =
"""992
1010
1005
1112
BEGIN
1086
1244
1107
1121
END
1223
1312
BEGIN
773
990
224
END
1319
1306
1469""".lines.map(_.trim).toList
We're going to use foldRight
to pass a status accumulator between iterations.
Note that we're using foldRight
to make constructing the result list efficient, so we're going to encounter END
before we encounter BEGIN
.
case class StripStatus(list:List[String], retaincurrent:Boolean)
list1.foldRight(StripStatus(Nil,false)){ (curElem:String, curStatus:StripStatus) =>
if (curElem == "END")
StripStatus(curStatus.list,true)
else if (curElem == "BEGIN")
StripStatus(curStatus.list,false)
else if (curStatus.retaincurrent)
StripStatus(curElem::curStatus.list, true)
else
curStatus
}.list
We could just as easily use foldLeft
and reverse
the result list at the end:
list1.foldLeft(StripStatus(Nil,false)){ (curStatus:StripStatus, curElem:String) =>
if (curElem == "BEGIN")
StripStatus(curStatus.list,true)
else if (curElem == "END")
StripStatus(curStatus.list,false)
else if (curStatus.retaincurrent)
StripStatus(curElem::curStatus.list, true)
else
curStatus
}.list.reverse
MMmm. Here's my take:
def getInside(l: List[String]) = {
def concat(in: List[String], out: List[String]): List[String] = in ::: off(out)
def off(l: List[String]): List[String] =
if (l.isEmpty) Nil
else on(l dropWhile ("BEGIN" !=) drop 1)
def on(l: List[String]): List[String] =
if (l.isEmpty) Nil
else (concat _).tupled(l span ("END" !=))
off(l)
}
I don't know Scala, but you can define a function that returns the index in the list of the next element that matches a substring, and returns the index where the substring was found and also the list of elements encountered until that substring was matched. A pseudocode header: findSubstr(list, startIndex)
. Then build the expression (more pseudocode):
beginIndex, preBeginElems = findSubstr(list, 0)
endIndex, inBetweenElems = findSubstr(list, beginIndex)
restElems = list[endIndex until the end]
If helpful, I could write this in Haskell... :)
EDIT: There are probably other ways to do it, too
Again, with the same goal of dealing with multiple BEGIN
...END
spans in the list.
def getBetweenBeginEnd(l:List[String]) = {
def internal(l:List[String],accum:List[String]):List[String]={
val (keep, keepChecking) = l.dropWhile("BEGIN" !=).drop(1).span("END" !=)
if (keepChecking == Nil)
accum:::keep
else
internal(keepChecking.tail,accum:::keep)
}
internal(l,Nil)
}
精彩评论