Using a regular expression to parse an m3u file
I am looking to parse an m3u file using a regular expression. The m3u looks like:
#EXTM3U
#EXTINF:36,Artist - Title
C:\Users\Public\Music\Sample Music\file1.mp3
#EXTINF:19,Artist - Title
C:\Users\Public\Music\Sample Music\file2.mp3
#EXTINF:19,Artist - Title (Additional Title)
C:\Users\Public\Music\Sample Music\file3.mp3
#EXTINF:57,Artist - Title - Additional Title
C:\Users\Public\Music\Sample Music\file4.mp3
When I open the file in a text editor, the m3u appears all on one line with no line breaks. I am looking to create two regular expressions. The first one would parse the artist and title information. The regex output should be:
Artist - Title
Artist - Title
Artist - Title (Additional Title)
Artist - Title - Additional Title
The second regex should parse the same information but capture the artist and title in separate groups. The regex output should be:
Group 1
Artist
Artist
Artist
Artist
Group 2
Title
Title
Title (Additional Title)
Title - Additional Title
Any help is appreciated.
Here's a quick thought for the first one:
#EXTINF:[0-9]+,([a-zA-Z0-9 ]+ - [a-zA-Z0-9 ]*[a-zA-Z0-9](?: (?:- [a-zA-Z0-9 ]+|\([a-zA-Z0-9 ]+\)))?)
This assumes that both artist names and song titles consist only of letters, digits and spaces (i.e. [a-zA-Z0-9 ]), so adjust the character class to reflect the songs you actually have and any other characters you can think of.
Furthermore, I've used the Python notation for non-capturing groups, (?:...); you may need to replace it depending on whatever you'll be using this in.
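As a rough illustration, here is a minimal Python sketch of how a pattern of this shape could be applied to the sample playlist. The names PLAYLIST, WORDS, TITLE, ENTRY and artist_title_lines are made up for this example, and it assumes the file really does contain line breaks (for instance LF-only endings that some editors do not display). The title part is written so that it ends on a non-space character, which gives the optional suffix a chance to match.

```python
import re

# Sample playlist from the question; assumes one entry per line ("\n").
PLAYLIST = "\n".join([
    "#EXTM3U",
    "#EXTINF:36,Artist - Title",
    r"C:\Users\Public\Music\Sample Music\file1.mp3",
    "#EXTINF:19,Artist - Title",
    r"C:\Users\Public\Music\Sample Music\file2.mp3",
    "#EXTINF:19,Artist - Title (Additional Title)",
    r"C:\Users\Public\Music\Sample Music\file3.mp3",
    "#EXTINF:57,Artist - Title - Additional Title",
    r"C:\Users\Public\Music\Sample Music\file4.mp3",
])

WORDS = r"[a-zA-Z0-9 ]+"              # letters, digits and spaces
TITLE = r"[a-zA-Z0-9 ]*[a-zA-Z0-9]"   # same, but must end on a non-space

# One capturing group holding "Artist - Title" plus an optional suffix,
# either " - Something" or " (Something)".
ENTRY = re.compile(
    r"#EXTINF:[0-9]+,"
    r"(" + WORDS + r" - " + TITLE +
    r"(?: (?:- " + WORDS + r"|\(" + WORDS + r"\)))?)"
)


def artist_title_lines(text):
    """Return the 'Artist - Title' part of every #EXTINF entry."""
    # With a single capturing group, findall returns that group's text.
    return ENTRY.findall(text)


for line in artist_title_lines(PLAYLIST):
    print(line)
```

If your artist names or titles contain other characters (apostrophes, ampersands, accented letters), WORDS and TITLE are the two places to widen.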
From there, you can easily split the above to have two capturing groups:
#EXTINF:[0-9]+,([a-zA-Z0-9 ]+) - ([a-zA-Z0-9 ]*[a-zA-Z0-9](?: (?:- [a-zA-Z0-9 ]+|\([a-zA-Z0-9 ]+\)))?)
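To see the two groups in action, something like the following Python sketch should do. SPLIT_ENTRY, split_artist_title and sample are hypothetical names chosen for the example, and the input is again assumed to have one entry per line. Because the artist part cannot contain a hyphen, the split happens at the first " - ", so anything after it (including a second " - ") ends up in the title group.

```python
import re

SPLIT_ENTRY = re.compile(
    r"#EXTINF:[0-9]+,"
    r"([a-zA-Z0-9 ]+) - "                            # group 1: artist
    r"([a-zA-Z0-9 ]*[a-zA-Z0-9]"                     # group 2: title ...
    r"(?: (?:- [a-zA-Z0-9 ]+|\([a-zA-Z0-9 ]+\)))?)"  # ... plus optional suffix
)


def split_artist_title(text):
    """Return a list of (artist, title) tuples, one per #EXTINF entry."""
    return [m.groups() for m in SPLIT_ENTRY.finditer(text)]


# Two of the entries from the question, one per line (an assumption).
sample = (
    "#EXTM3U\n"
    "#EXTINF:19,Artist - Title (Additional Title)\n"
    "C:\\Users\\Public\\Music\\Sample Music\\file3.mp3\n"
    "#EXTINF:57,Artist - Title - Additional Title\n"
    "C:\\Users\\Public\\Music\\Sample Music\\file4.mp3\n"
)

for artist, title in split_artist_title(sample):
    print(f"{artist} | {title}")
# Artist | Title (Additional Title)
# Artist | Title - Additional Title
```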