Regex to match lines in-between two specific lines, in Python
I am trying to use a regex to parse out some lines from text read in from a file. I know this could be done by reading the file line by line, but I like the elegance of capturing all the relevant bits of info in a single regex match.
The example file contents:
---
title: a title
layout: page
---
here's some text
================
this will be blog post content.
I am trying to produce a regex match that will return 2 groups: the data in-between the "---" lines, and all of the data after the 2nd "---" line. Here is the regex string I have come up with, and I am having an issue with it:
re.match('---\n(.*?)\n---\n(.*)', content, re.S)
This seems to work well, except when dealing with Unix vs. Windows line endings. Is there a way to allow this regex to match a \r if it's present, too? It works with Unix line endings, which are just \n, I believe.
Also, if you think this regex could be improved, I'm open to suggestions.
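End-of-line markers count as whitespace, so you can use the construct \s+ to match the line ending (and any other whitespace) in a platform-independent way. Be aware that \s+ is greedy about whitespace in general: it will also consume blank lines and leading indentation that follow the line ending.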
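As a rough sketch of this suggestion applied to the regex from the question, each literal \n is swapped for \s+. The names `FRONT_MATTER_WS` and `split_front_matter` are made up for illustration and are not from the original post.

```python
import re

# Hypothetical pattern: the question's regex with each "\n" replaced by "\s+".
# re.S lets "." match line-ending characters inside the groups.
FRONT_MATTER_WS = re.compile(r"---\s+(.*?)\s+---\s+(.*)", re.S)


def split_front_matter(content):
    """Return (header, body), or None if content does not start with a '---' block."""
    m = FRONT_MATTER_WS.match(content)
    if m is None:
        return None
    return m.group(1), m.group(2)


unix_text = (
    "---\ntitle: a title\nlayout: page\n---\n"
    "here's some text\n================\nthis will be blog post content.\n"
)
windows_text = unix_text.replace("\n", "\r\n")

for sample in (unix_text, windows_text):
    header, body = split_front_matter(sample)
    print(repr(header))
```

The line endings inside each captured group are left as they were in the input, so a Windows file still yields \r\n between the header lines. If the post body may start with meaningful blank lines or indentation, this variant would strip them, which may or may not be what you want.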
The sequence (\r\n|\r|\n) will match all 'normal' line endings (Windows, old Mac, and *nix, respectively).
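Here is a minimal sketch of how that alternation could be dropped into the question's regex. One assumption on my part: the group is written as non-capturing, (?:...), so that the match still returns exactly the two groups the question asks for. `EOL`, `FRONT_MATTER` and `parse_post` are illustrative names only.

```python
import re

# Non-capturing version of the alternation, so group numbering is unchanged.
# "\r\n" is listed first so a Windows ending is consumed as one unit.
EOL = r"(?:\r\n|\r|\n)"

# The question's regex with each "\n" replaced by the EOL alternation.
FRONT_MATTER = re.compile("---" + EOL + "(.*?)" + EOL + "---" + EOL + "(.*)", re.S)


def parse_post(content):
    """Return (header, body), or None if the '---' delimited block is missing."""
    m = FRONT_MATTER.match(content)
    if m is None:
        return None
    header, body = m.groups()
    return header, body


sample = "---\r\ntitle: a title\r\nlayout: page\r\n---\r\nthis will be blog post content.\r\n"
header, body = parse_post(sample)
print(repr(header))
print(repr(body))
```

Unlike \s+, this matches exactly one line ending at each position, so blank lines and indentation at the start of the post body are preserved.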