How to force Python to ignore re.DOTALL in re.findall() statement?

2022-12-15 16:22

I have been banging my head against the keyboard in search of enlightenment through Google and all Python docs I could get my hands on, but could not find an answer to an issue I'm encountering.

I have the following regex that I run against a website, but Python insists on setting re.DOTALL on it, even though my code does not tell it to:

\d+. +(?P<season>\d+) *\- *(?P<episode>\d+).*?(?P<day>\d+)(?:\/|\s)+(?P<month>[A-Za-z]+)(?:\/|\s)+(?P<year>\d+) +(?:<a .+><img .+></a>)? ?<a .*?>(?P<name>.*?)</a>

This creates an array of seasons/episodes for TV show listings, and it works fine except on epguides.com/BurnNotice (when using the TVRage listings), due to some spacing before newlines (I guess).

Using http://re-try.appspot.com to test, I've narrowed down the issue to the use of re.DOTALL. If I enable it on re-try, it replicates the results I get when I run it standalone on my script. If I untick DOTALL, then it gives me the results I expect.

How can I force Python NOT to use re.DOTALL?

The script runs both on Ubuntu and OS X.

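.+> should be changed to [^>]+> and

.*?> to [^>]*>, so that neither can run past the end of the tag it is meant to match.

You can also try replacing the other dots with [^\r\n], which never matches a line break whatever flags are in effect, but the two changes above should be enough.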
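As a rough illustration of the difference, here is a minimal sketch. The two-line SAMPLE listing is invented for the example and is not the real epguides.com markup, and the names ORIGINAL, PORTABLE and SAMPLE are only placeholders. It runs the pattern from the question with and without re.DOTALL, then a variant in which every dot has been replaced with a character class as suggested above.

```python
import re

# Pattern from the question, unchanged.
ORIGINAL = (
    r"\d+. +(?P<season>\d+) *\- *(?P<episode>\d+).*?"
    r"(?P<day>\d+)(?:\/|\s)+(?P<month>[A-Za-z]+)(?:\/|\s)+(?P<year>\d+) +"
    r"(?:<a .+><img .+></a>)? ?<a .*?>(?P<name>.*?)</a>"
)

# Same pattern with every dot replaced: [^>] inside tags, [^\r\n] elsewhere.
# Neither class can match a line break, so re.DOTALL has nothing to change.
PORTABLE = (
    r"\d+[^\r\n] +(?P<season>\d+) *\- *(?P<episode>\d+)[^\r\n]*?"
    r"(?P<day>\d+)(?:\/|\s)+(?P<month>[A-Za-z]+)(?:\/|\s)+(?P<year>\d+) +"
    r"(?:<a [^>]+><img [^>]+></a>)? ?<a [^>]*>(?P<name>[^\r\n]*?)</a>"
)

# Hypothetical listing: the first episode has no link, the second one does.
SAMPLE = (
    "1.   1- 1        28 Jun 07   Pilot\n"
    "2.   1- 2        05 Jul 07   <a href='/ep/2'>Identity</a>\n"
)

# re.findall() applies no flags unless they are passed in (or written
# inline as (?s) in the pattern), so "." stops at each newline here.
default_result = re.findall(ORIGINAL, SAMPLE)

# With re.DOTALL the lazy ".*?" may cross the newline, so the first
# episode number gets paired with the second line's date and title.
dotall_result = re.findall(ORIGINAL, SAMPLE, re.DOTALL)

# The character-class version gives the same answer with the flag set.
portable_result = re.findall(PORTABLE, SAMPLE, re.DOTALL)

print(default_result)
print(dotall_result)
print(portable_result)
```

If results on a real page look like the second list, it is worth checking for a flags argument or an inline (?s) somewhere in the script, and for \s in the pattern, since \s matches newlines regardless of re.DOTALL.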
