Regular expression to find strings between two tokens, while EXCLUDING the tokens, when the start token is the same as the end token [closed]

2023-03-31 03:26 Q&A author:

This question is unlikely to help any future visitors; it is only relevant to a small geographic area, a specific moment in time, or an extraordinarily narrow situation that is not generally applicable to the worldwide audience of the internet. For help making this question more broadly applicable, visit the help center. Closed 11 years ago.

An extension of Regular Expression to find a string included between two characters, while EXCLUDING the delimiters

The solution to that question modified a tiny bit:

(?<=\#)(.*?)(?=\#)

Given a string "The #iPhone 4# is made by #apple#." that solution returns:

["iPhone 4", " is made by ", "apple"]

Now I'm not sure if this is possible using only a regex, but in this case " is made by " is not supposed to be returned. It simply happens to be squashed between the other two ## wrapped strings, and so is wrapped itself.
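To make the behaviour concrete, here is a minimal sketch in Python (the language is an assumption based on the list-style output above; `sample` and `matches` are just illustrative names). Lookarounds assert the `#` without consuming it, so the `#` that closes one match is still available to open the next one.

```python
import re

sample = "The #iPhone 4# is made by #apple#."

# Lookbehind/lookahead check for '#' but consume nothing, so every gap
# between two consecutive '#' characters becomes a match.
matches = re.findall(r"(?<=#)(.*?)(?=#)", sample)
print(matches)  # ['iPhone 4', ' is made by ', 'apple']
```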

Clarification: The regex needs to support a variable number of #foo# strings in the parent string. There will not always be only 2.

Update

Due to the varied responses, and the realization that this problem is more simply solved without regex, I'm voting to close the question. Answer: do this without regex, in the language of your choice.
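One possible non-regex version, sketched in Python (the function name `between_hashes` is made up for illustration, and it assumes the delimiters are balanced and never escaped): split on the delimiter and keep every second piece.

```python
def between_hashes(text, delimiter="#"):
    """Return the pieces that sit inside delimiter pairs.

    Assumes delimiters are balanced and never nested or escaped.
    """
    pieces = text.split(delimiter)
    # pieces[0] is outside a pair, pieces[1] inside, pieces[2] outside, ...
    return pieces[1::2]


print(between_hashes("The #iPhone 4# is made by #apple#."))
# ['iPhone 4', 'apple']
```

This handles any number of #foo# strings, but an unmatched trailing # would make the last piece look like a token, so validate the input first if that can happen.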

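Very close to @Gerben's answer, but this version works for me. The idea: there must be an odd number of '#' characters before the token, counting the '#' that starts the token. Note that it relies on variable-length lookbehind, which not every regex engine supports.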

(?<=^[^#]*#([^#]*#[^#]*#)*)([^#]*)(?=#)
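Python's built-in `re` module rejects variable-width lookbehind, so the pattern above cannot be pasted in as-is there. As a rough sketch of the same parity idea (the helper name `tokens_by_parity` is hypothetical), you can match every gap between two `#` characters and keep only those preceded by an odd number of `#`:

```python
import re


def tokens_by_parity(text):
    """Keep only the matches preceded by an odd number of '#' characters."""
    results = []
    for m in re.finditer(r"(?<=#)[^#]*(?=#)", text):
        # Count the '#' characters before the match, including the opener.
        if text[:m.start()].count("#") % 2 == 1:
            results.append(m.group())
    return results


print(tokens_by_parity("The #iPhone 4# is made by #apple#."))
# ['iPhone 4', 'apple']
```

This does the counting in code rather than in the regex, so it is a swap-in for the lookbehind, not the same expression.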

Can't you just take (?<=\#)(.*?)(?=\#) and ignore every other match in the match list before processing further?
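In Python that could look roughly like this (a sketch only; it assumes the delimiters are balanced, so the even-indexed matches are the real tokens and the odd-indexed ones are the text squeezed in between):

```python
import re

sample = "The #iPhone 4# is made by #apple#."

all_matches = re.findall(r"(?<=#)(.*?)(?=#)", sample)
# Matches alternate: token, in-between text, token, ...
wanted = all_matches[::2]
print(wanted)  # ['iPhone 4', 'apple']
```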

The zero-width assertions cause the match to include the text between all delimiters, instead of continuing after each "consumed" delimiter. You have to change the code that does the matching so that it matches the delimiters themselves and extracts, for instance, the first capture group rather than the whole matched expression. It would help if you posted the code you are using now so we could tell you how to modify it, but your example is formatted in a Pythonesque way, so something like this:

import re
stringlist = re.findall(r"#([^#]*)#", string)

Sorry, not at my computer, and my Python is not very good, so I'll probably have to get back to you with corrections.

Update: fixed and substantially simplified the code
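A slightly fuller sketch of the same idea, to show that it copes with a variable number of tokens (the sample string is invented for illustration). Each match consumes both of its `#` delimiters, so the next search starts after the closing `#` and the text in between is never picked up.

```python
import re

sample = "#red# and #green# or #blue#"

# Print where each match sits and what its capture group holds.
for m in re.finditer(r"#([^#]*)#", sample):
    print(m.span(), repr(m.group(1)))

# With a single capture group, findall returns just the group contents.
tokens = re.findall(r"#([^#]*)#", sample)
print(tokens)  # ['red', 'green', 'blue']
```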

The solution doesn't return what you say it does (it's working on square brackets rather than hash marks), but it's a question of what you put into parentheses; the parentheses are what direct the capturing.

#([^#]*)#[^#]*#([^#]*)#
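For what it's worth, here is how that pattern behaves in Python (a sketch; `pair` and `first_two` are illustrative names). The two capture groups pick out the two tokens, but the pattern is hard-wired to two, so it does not scale to a variable number of #foo# strings.

```python
import re

pattern = r"#([^#]*)#[^#]*#([^#]*)#"

m = re.search(pattern, "The #iPhone 4# is made by #apple#.")
pair = m.groups() if m else ()
print(pair)  # ('iPhone 4', 'apple')

# With three tokens only the first two are captured.
m3 = re.search(pattern, "#a# #b# #c#")
first_two = m3.groups() if m3 else ()
print(first_two)  # ('a', 'b')
```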

Instead of .* use [^\]]* (in the case where ] is the delimiter).
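A quick Python comparison, assuming the intended class is `[^\]]*`, i.e. "any run of characters that are not `]`" (the sample string is made up):

```python
import re

sample = "a [x] b [y]"

# Greedy .* runs to the last ']' and swallows everything in between.
greedy = re.findall(r"\[(.*)\]", sample)
print(greedy)  # ['x] b [y']

# A negated class cannot cross a ']', so each pair is matched separately.
negated = re.findall(r"\[([^\]]*)\]", sample)
print(negated)  # ['x', 'y']
```

The same trick with # as the delimiter is just [^#]*.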

EDITED

So you have a list of the form #text#,#text#,... and want to extract the items of that list:

(\#[^\#]*\#(?:,|$))+

Not sure if this works, but the idea is that it only matches the first # if there is an even number of # characters before it.

(?<=(?:^[^#]*#[^#]*#)*#)([^#]*)(?=#)

But what language are you using? It would be a lot easier to do this without relying on regex alone.

I am not familiar enough with regular expressions to give you a regular expression answer. But it seems that every second item of your list is to be discarded. Why not iterate the list and do that?

This is how I would do it:

import re

text = "The #iPhone 4# is made by #apple#"
# Match each #...# pair (the delimiters are consumed), then strip the '#' characters.
cleanlist = [match.strip('#') for match in re.findall(r'#.*?#', text)]
print(cleanlist)
# prints ['iPhone 4', 'apple']
