Regular expressions - check if a match is not contained in a textarea
OK, so I opened up this question yesterday and got an answer fairly quickly. It worked, or so I thought, so I marked it as the correct answer.
However, I don't think I explained the situation very well. Basically I am getting the HTML right before it is rendered, parsing it and searching for strings matching the pattern [tag|text x], where x is a number and the two words are case-insensitive.
However, as stated in the previous question, I would like to NOT replace these tags if they're inside a textarea. This means that if they're between </textarea> and <textarea...> then I would still like to replace them, but if they're between <textarea...> and </textarea> then I would NOT like to replace them.
So far I have
@"(?<!\<textarea class='tag'\>)\[(tag|text) ([0-9]+)\]"
I have tried
@"(?<!\<textarea.[^>]*\>)\[(tag|text) ([0-9]+)\]"
but that doesn't appear to work either.
For example I would like to replace any tags outside of the textareas in the following:
[tag 1]
<textarea>[tag 2]</textarea>[tag 3]
<textarea class="bob">Walter [tag 4]</textarea>[tag 5]
<textarea attr-1="fred">Jim [tag 6] Mary</textarea>[tag 7]
[tag 8]
In this example only tags 1, 3, 5, 7 and 8 should be replaced; 2, 4 and 6 should not.
Does anyone have any idea what I should change it to in order to achieve this? I am not asking for anyone to just do all the work for me and give me the answer - I am in this to learn. I have struggled with this for a few hours now so any assistance with this would be great!
This kind of thing is usually easier to do with lookaheads than lookbehinds. This works as you requested:
@"\[(tag|text)\s+(\d+)\](?![^<]*(?:<(?!/?textarea\b)[^<]*)*</textarea>)"
The idea here is to look ahead for a </textarea> tag, but only if you don't encounter a <textarea...> tag first. That's this part:
[^<]*(?:<(?!/?textarea\b)[^<]*)*</textarea>
Assuming the HTML is well formed, that sub-pattern can only match when the starting position is inside a textarea element. Putting it in a negative lookahead, which is evaluated after the [tag] has been matched, causes matches inside textareas to be rejected.
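To check the behaviour against the sample from the question, here is a minimal sketch in Python rather than C#. It assumes Python's `re` module treats this pattern the same way .NET does, which should hold since it only uses character classes and lookaheads. The names `TAG_OUTSIDE_TEXTAREA`, `replaced_numbers`, `replace_tags` and the `<REPLACED>` placeholder are made up for illustration, and the case-insensitive flag is added because the question asks for it.

```python
import re

# Pattern from the answer, unchanged. IGNORECASE is an addition here so that
# "Tag" or "TEXT" are also matched, as the question requires.
TAG_OUTSIDE_TEXTAREA = re.compile(
    r"\[(tag|text)\s+(\d+)\]"
    r"(?![^<]*(?:<(?!/?textarea\b)[^<]*)*</textarea>)",
    re.IGNORECASE,
)

# Sample input copied from the question.
SAMPLE = """[tag 1]
<textarea>[tag 2]</textarea>[tag 3]
<textarea class="bob">Walter [tag 4]</textarea>[tag 5]
<textarea attr-1="fred">Jim [tag 6] Mary</textarea>[tag 7]
[tag 8]"""


def replaced_numbers(html):
    """Return the numbers of the tags the pattern would replace."""
    return [int(m.group(2)) for m in TAG_OUTSIDE_TEXTAREA.finditer(html)]


def replace_tags(html, replacement="<REPLACED>"):
    """Substitute every tag that is not inside a textarea.

    The replacement text is a hypothetical placeholder.
    """
    return TAG_OUTSIDE_TEXTAREA.sub(replacement, html)


print(replaced_numbers(SAMPLE))  # [1, 3, 5, 7, 8]
print(replace_tags(SAMPLE))
```

Tags 2, 4 and 6 are skipped because the lookahead reaches a </textarea> without first crossing a <textarea...>. As noted above, this relies on the HTML being well formed; an unclosed or nested textarea would give misleading results.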