Regex with lookahead
I can't seem to make this regex work.
The input is as follows. Its really on one row but I have inserted line brea开发者_如何学JAVAks after each \r\n so that it's easier to see, so no check for space characters are needed.
01-03\r\n
01-04\r\n
TEXTONE\r\n
STOCKHOLM\r\n
350,00\r\n ---- 350,00 should be the last value in the first match
12-29\r\n
01-03\r\n
TEXTTWO\r\n
COPENHAGEN\r\n
10,80\r\n
This could go on with another 01-31 and 02-01, marking another new match (these are dates).
I would like to have a total of 2 matches for this input. My problem is that I cant figure out how to look ahead and match the starting of a new match (two following dates) but not to include those dates within the first match. They should belong to the second match.
It's hard to explain, but I hope someone will get me. This is what I got so far but its not even close:
(.*?)((?<=\\d{2}-\\d{2}))
The matches I want are:
1: 01-03\r\n01-04\r\nTEXTONE\r\nSTOCKHOLM\r\n350,00\r\n
2: 12-29\r\n01-03\r\nTEXTTWO\r\nCOPENHAGEN\r\n10,80\r\n
After that I can easily separate the columns with \r\n.
Can this more explicit pattern work to you?
(\d{2}-\d{2})\r\n(\d{2}-\d{2})\r\n(.*)\r\n(.*)\r\n(\d+(?:,?\d+))
Here's another option for you to try:
(.+?)(?=\d{2}-\d{2}\\r\\n\d{2}-\d{2}|$)
Rubular
/
\G
(
(?:
[0-9]{2}-[0-9]{2}\r\n
){2}
(?:
(?! [0-9]{2}-[0-9]{2}\r\n ) [^\n]*\n
)*
)
/xg
Why do so much work?
$string = q(01-03\r\n01-04\r\nTEXTONE\r\nSTOCKHOLM\r\n350,00\r\n12-29\r\n01-03\r\nTEXTTWO\r\nCOPENHAGEN\r\n10,80\r\n);
for (split /(?=(?:\d{2}-\d{2}\\r\\n){2})/, $string) {
print join( "\t", split /\\r\\n/), "\n"
}
Output:
01-03 01-04 TEXTONE STOCKHOLM 350,00
12-29 01-03 TEXTTWO COPENHAGEN 10,80`
精彩评论