Regex get content between two pipes AND return a space where two pipes are next to each other with no spaces
How can I get all content between pipes and return a space where it comes across two pipes next to each other?
An example string and desired output is:
|test1| test2|test3 || test 4 |
Result1: "test1"
Result2: "test2"
Result3: "test3"
Result4: " "
Result5: "test4"
The closest I've got so far is:
/([^\|]+)/
which gets all data between pipes but does not detect ||, and:
/\|([^\|]*)/
which gets all data between pipes and detects ||, but has an extra whitespace result at the end.
This is not possible with a regular expression alone - regexes can only return text they have matched, not create new text.
So you'll have to detect programmatically whether there was an empty match and change the result to a single space. What language are you using?
As an example, in C# you could do this:
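// Match the text between two pipes without consuming the pipes themselves.
Regex regexObj = new Regex(@"(?<=\|\s*).*?(?=\s*\|)", RegexOptions.Multiline);
Match matchResults = regexObj.Match(subjectString);
while (matchResults.Success) {
    // Trim() removes any whitespace the pattern leaves at the start of a field.
    string text = matchResults.Value.Trim();
    if (text == "") {
        // An empty match means two adjacent pipes: substitute a single space.
        text = " ";
    }
    // now do whatever you want with it
    matchResults = matchResults.NextMatch();
}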
In Ruby, you don't have lookbehind until version 1.9, so you need a different approach. First remove leading and trailing delimiters:
temp = subject.gsub(/^\s*\|\s*|\s*\|\s*$/, '')
Then split along the remaining delimiters:
result = temp.split(/\s*\|\s*/)
and then iterate over the array you get, replacing empty strings with spaces.
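A minimal sketch of those steps put together, assuming Ruby. The method name split_fields is made up for illustration; the two patterns are the ones from the steps just described.

```ruby
# Hypothetical helper, not part of any library.
def split_fields(subject)
  # Step 1: strip the leading and trailing delimiters (with adjacent whitespace).
  temp = subject.gsub(/^\s*\|\s*|\s*\|\s*$/, '')
  # Step 2: split on the remaining delimiters.
  # Step 3: replace each empty piece with a single space.
  temp.split(/\s*\|\s*/).map { |field| field.empty? ? ' ' : field }
end

p split_fields('|test1| test2|test3 || test 4 |')
```

The substitution of " " happens in the map call, not in the regex, which only locates the delimiters.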
In Ruby I'd not bother with a regex:
str = '|test1| test2|test3 || test 4 |'
str.split('|')[1 .. -1].map{ |s| (s.strip.empty?) ? ' ': s.strip } #=> ["test1", "test2", "test3", " ", "test 4"]
You can split the string with \s*\|\s* and get an array with each of the pieces. Without knowing what language you are using, I can't say what the specific API would be for doing a regular expression split on a string.
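As one illustration, in Ruby (chosen only as an example, since the asker's language is not stated) the call would be String#split with a regex argument. This is a rough sketch rather than a complete solution: the leading pipe produces an empty first element that has to be dropped before the empty pieces are turned into spaces.

```ruby
str = '|test1| test2|test3 || test 4 |'

# Split on a pipe together with any whitespace around it.
pieces = str.split(/\s*\|\s*/)

# pieces[0] is "" because the string starts with a delimiter;
# split drops trailing empty strings by default, so the final pipe adds nothing.
fields = pieces.drop(1).map { |piece| piece.empty? ? ' ' : piece }

p fields
```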
As already mentioned by Tim it is not possible using just a regex.
One way to do it is:
- Remove the leading and trailing pipe.
- Split the string on space(s) followed by pipe followed by space(s).
- If you find any piece to be empty, make it " ".
In Perl:
$str = '|test1| test2|test3 || test 4 |';
$str =~ s/^\||\|$//g;  # /g so that both the leading and the trailing pipe are removed
@pieces = split/\s*\|\s*/,$str;
foreach(@pieces) {
$_ = ' ' if($_ eq '');
print $_,"\n";
}
(?<=\|)([^\|]*)(?=\|)
should do what you want. It uses a positive lookbehind and a positive lookahead, so the pipes are not consumed and remain available to delimit neighboring matches.
This will give you the results: "test1", " test2", "test3 ", "", and " test 4 ".
If you want to trim your results using regex, use (?<=\|)\s*([^\|]*?)\s*(?=\|) and read capture group 1 (the group has to be lazy, otherwise it swallows the trailing whitespace), giving you "test1", "test2", "test3", "", and "test 4".
Test 4 is tougher, because you cannot remove the internal space. And, as mentioned, regular expressions cannot create text, so it is impossible to return " " between tests 3 and 4. Of course, you can test for empty strings and replace them later, using whatever other language you are using.
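Here is a sketch of the lookaround approach plus the post-processing step, assuming Ruby 1.9 or later (earlier versions lack lookbehind). The variable names are illustrative, and the pipe inside each lookahead is escaped so that it matches a literal pipe.

```ruby
str = '|test1| test2|test3 || test 4 |'

# Untrimmed fields: the lookarounds leave the pipes unconsumed,
# so each pipe can close one field and open the next.
raw = str.scan(/(?<=\|)[^|]*(?=\|)/)

# Trimmed fields: the lazy group excludes the surrounding whitespace.
# With a capture group, scan returns one array per match, hence flatten.
trimmed = str.scan(/(?<=\|)\s*([^|]*?)\s*(?=\|)/).flatten

# The regex cannot produce " " on its own, so empty fields are replaced afterwards.
results = trimmed.map { |field| field.empty? ? ' ' : field }

p raw
p results
```

The internal space in "test 4" is kept, since only leading and trailing whitespace falls outside the capture group.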