Match regex from right to left?

2023-01-03 01:15 Q&A author:

Is there any way of matching a regex from right to left? What I'm looking for is a regex that gets

MODULE WAS INSERTED              EVENT
LOST SIGNAL ON E1/T1 LINK        OFF
CRC ERROR                        EVENT
CLK IS DIFF FROM MASTER CLK SRC  OFF

from this input

CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40
CLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58
CLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59
CLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32

If I could match from right to left, I could write something like: starting at (EVENT|OFF) and moving left, take everything up to the second run of more than one space [ ]+

The best I managed today is to get everything from (022) to EVENT with the regex

CLI MUX trap received: \([0-9]+\)[ ]+(.*[  ]+(EVENT|OFF))

But that is not really what I wanted :)

Edit: What language is it for? It's actually a config string for a filter we have, but my guess is that it uses the standard GNU C regex library.

Edit 2: I like the answers about cutting by length, but Amarghosh's answer was probably closer to what I was looking for. I don't really know why I didn't think of just cutting on length, like:

^.{56}(.{39}).*$

Super thanks for the quick answers...

In .NET you could use the RightToLeft option:

Regex RE = new Regex(Pattern, RegexOptions.RightToLeft);
Match theMatch = RE.Match(Source);

With regex, you could simply replace this:

^.{56}|.{19}$

with the empty string.
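As a rough JavaScript sketch of that replacement: the sample line is rebuilt with `padEnd` so the column widths are explicit. The widths 56, 33, 8 and 19 are read off the sample input and are assumptions, and `stripEdges` is a made-up helper name. Note that JavaScript needs the `g` flag here, otherwise only the first match (the leading 56 characters) is removed.

```javascript
// Rebuild one sample line with explicit column widths (assumed from the sample input).
const line =
  "CLI MUX trap received: (022) CL-B  MCL-2ETH".padEnd(56) + // everything before the message
  "MODULE WAS INSERTED".padEnd(33) +                          // message column
  "EVENT".padEnd(8) +                                         // status column
  "07-05-2010 12:08:40";                                      // 19-character timestamp

// Hypothetical helper: drop the first 56 and the last 19 characters.
// The g flag is required so that both alternatives are replaced.
function stripEdges(text) {
  return text.replace(/^.{56}|.{19}$/g, "");
}

const middle = stripEdges(line);

// The status column is padded, so trim the trailing spaces before printing.
console.log(JSON.stringify(middle.trimEnd()));
```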

But really, you only need to cut out the string from "position 56" to "string-length - 19" with a substring function. That's easier and much faster than regex.

Here's an example in JavaScript; other languages work more or less the same:

var lines = [
  'CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40',
  'CLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58',
  'CLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59',
  'CLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32'
];
for (var i=0; i<lines.length; i++) {
  alert( lines[i].substring(56, lines[i].length-19) );
}

This only works if columns are guaranteed to be separated by more than one space and the words of the message before EVENT|OFF are guaranteed to be separated by exactly one space. In that case you can look for single-space-separated words, followed by spaces, followed by EVENT or OFF:

var s = "CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40"
        + "\nCLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58"
        + "\nCLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59"
        + "\nCLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32"

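// Capture group 1: the message words plus the trailing EVENT/OFF status.
var r = /\([0-9]+\).+?((?:[^ ]+ )* +(?:EVENT|OFF))/g;
var m;
// With the g flag, each exec() call resumes after the previous match.
while ((m = r.exec(s)) !== null)
  console.log(m[1]);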

Output:

MODULE WAS INSERTED              EVENT
LOST SIGNAL ON E1/T1 LINK        OFF
CRC ERROR                        EVENT
CLK IS DIFF FROM MASTER CLK SRC  OFF

Regex: /\([0-9]+\).+?((?:[^ ]+ )* +(?:EVENT|OFF))/g

\([0-9]+\)       #digits in parentheses followed by  
.+?              #some characters - minimum required (non-greedy)  
(                #start capturing 
(?:[^ ]+ )*      #non-space characters separated by a space  
` +`             #more spaces (separating string and event/off - 
                 #backticks added for emphasis), followed by
(?:EVENT|OFF)    #EVENT or OFF
)                #stop capturing

Does the input file fit nicely into fixed width tabular text like this? Because if it does, then the simplest solution is to just take the right substring of each line, from column 56 to column 94.

In Unix, you can use the cut command:

cut -c56-94 yourfile

A regex solution

This is most likely not necessary, but something like this works (as seen on ideone.com):

line.replaceAll(".*  \\b(.+  .+)   \\S+ \\S+", "$1")

As you can see, it's not very readable, and you have to know your regex to really understand what's going on.

Essentially you match this to each line:

.*  \b(.+  .+)   \S+ \S+

And you replace it with whatever group 1 matched. This relies on the usage of two consecutive spaces exclusively for separating the columns in this table.
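The same substitution can be tried out in JavaScript. This is only a sketch: the two sample lines are rebuilt with `padEnd` using column widths (56, 33, 8) that are assumed from the sample input, and the `trimEnd()` call is an addition, since group 1 can pick up padding after a short status such as OFF.

```javascript
// Two sample lines rebuilt with explicit column widths (assumed from the sample input).
const lines = [
  "CLI MUX trap received: (022) CL-B  MCL-2ETH".padEnd(56) +
    "MODULE WAS INSERTED".padEnd(33) + "EVENT".padEnd(8) + "07-05-2010 12:08:40",
  "CLI MUX trap received: (090) IO-2  ML-1E1        EX1".padEnd(56) +
    "LOST SIGNAL ON E1/T1 LINK".padEnd(33) + "OFF".padEnd(8) + "04-06-2010 09:58:58",
];

// Same pattern as the Java call; "$1" keeps only what group 1 captured.
const pattern = /.*  \b(.+  .+)   \S+ \S+/;

// The pattern matches the whole line, so the replacement leaves just group 1.
// trimEnd() drops the padding that group 1 can pick up after a short status.
const extracted = lines.map((line) => line.replace(pattern, "$1").trimEnd());

extracted.forEach((text) => console.log(text));
```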

How about

.{56}(.*(EVENT|OFF))
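A quick way to check this pattern is to run it in JavaScript. The sketch below assumes the message always starts after exactly 56 characters, as in the sample input. The sample line is rebuilt with `padEnd` to make that explicit, and the variable names are illustrative only.

```javascript
// One sample line rebuilt with explicit column widths (assumed from the sample input).
const line =
  "CLI MUX trap received: (022) CL-B  MCL-2ETH".padEnd(56) +
  "MODULE WAS INSERTED".padEnd(33) + "EVENT".padEnd(8) + "07-05-2010 12:08:40";

// Skip 56 characters, then capture greedily up to the last EVENT or OFF on the line.
const match = line.match(/.{56}(.*(EVENT|OFF))/);

// Group 1 is the message plus the status; group 2 is the status alone.
const captured = match ? match[1] : null;
const status = match ? match[2] : null;

console.log(captured);
console.log(status);
```

Because `.*` is greedy, the capture runs to the last EVENT or OFF on the line, which is the status column as long as nothing after it contains those words.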

Can you do field-oriented processing, rather than a regex? In awk/sh, this would look like:

< $datafile awk '{ print $(NF-3), $(NF-2) }' | column

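which seems rather cleaner than specifying a regex. Note, though, that with whitespace-separated fields this prints only the last word of the message plus the status (for example INSERTED EVENT), because the number of words in the message varies.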
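A field-oriented variant can also be sketched in JavaScript. This is a different technique from awk's default whitespace splitting: it splits on runs of two or more spaces, so it assumes columns are always separated by at least two spaces and words inside a column by exactly one. `messageAndStatus` is a made-up helper name, and the sample lines are rebuilt with `padEnd` using widths assumed from the sample input.

```javascript
// Hypothetical helper: split a line into columns on runs of two or more spaces.
function messageAndStatus(line) {
  const fields = line.trim().split(/ {2,}/);
  // Counting from the right: timestamp, status, message.
  return {
    message: fields[fields.length - 3],
    status: fields[fields.length - 2],
  };
}

// Sample lines rebuilt with explicit column widths (assumed from the sample input).
const samples = [
  "CLI MUX trap received: (022) CL-B  MCL-2ETH".padEnd(56) +
    "MODULE WAS INSERTED".padEnd(33) + "EVENT".padEnd(8) + "07-05-2010 12:08:40",
  "CLI MUX trap received: (009)".padEnd(56) +
    "CLK IS DIFF FROM MASTER CLK SRC".padEnd(33) + "OFF".padEnd(8) + "07-05-2010 12:07:32",
];

const parsed = samples.map(messageAndStatus);
parsed.forEach((p) => console.log(p.message + " | " + p.status));
```

Counting fields from the right keeps this working for lines where the leading columns are empty, as in the (009) line.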
