regex help - php [duplicate]
Possible Duplicate:
How to parse HTML with PHP?
I would be most grateful if a regex master among you would be kind enough to help me.
I'd like to make a PHP function that converts HTML tags/elements, as per the following:
I want to convert
<span class="heading1">Any generic text, or other html elements such as <p> tags</p> in here</span>
To
<h1 class="heading1">Any text, or other html elements such as <p> tags</p> in here</h1>
So basically I want to convert the span headings to proper h1 tags (for better SEO), but there could be other, ordinary span tags that I want to preserve.
Any ideas? Thanks in advance.
Well, as the commenters above pointed out, it's probably not a good idea. However, since this case is extremely simple, the regex would be pretty easy if you want to live on the edge:
preg_replace('/<(\/*)span/', '<${1}h1', $htmlFile);
This will replace all span tags with h1 tags. Note that if the markup deviates from this format at all, it will break, hence the warnings against this method. I would only recommend it if you are working with a small number of relatively small HTML files, so you can check them for errors.
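To make the caveat concrete, here is a small sketch run on made-up sample markup (the class name "note" is only an illustration). Because the pattern looks at the tag name alone, ordinary spans get converted along with the heading ones:

```php
<?php
// Sample markup (made up for illustration): one heading span and one ordinary span.
$htmlFile = '<span class="heading1">Title</span> <span class="note">keep me</span>';

// The pattern from above: capture an optional slash, then swap the tag name.
$converted = preg_replace('/<(\/*)span/', '<${1}h1', $htmlFile);

echo $converted, PHP_EOL;
// <h1 class="heading1">Title</h1> <h1 class="note">keep me</h1>
```

So this is only usable when every span in the file is meant to become a heading.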
EDIT: Yeah, if you only want to replace the ones with class="heading1", I'm not touching it. That would require more mucking about with the regex than it would probably take to just fix all the files manually.
EDIT 2: Okay, I'm a little bored and curious, so I'm going to see if I can come up with a regex that replaces all class="heading1" spans and their corresponding closing tags with h1 tags:
preg_replace('/<span class="heading1">(.*(.*<span.*>.*<\/span>.*)*.*)<\/span>/', '<h1 class="heading1">${1}</h1>', $htmlFile);
If my calculations are correct, this should ignore any matching sets of span tags inside the heading1 span tags.
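One thing worth checking before relying on that pattern: the .* parts are greedy, so when another span follows the heading on the same line, the match seems to run through to the last </span> instead of the one that closes the heading. The sketch below shows this on made-up input, next to a possible alternative that uses PCRE's recursive subpattern syntax, (?1), to balance nested spans. The recursive pattern is my own assumption about how one might do it, not something tested against real pages, and it still expects the opening tag to be written exactly as <span class="heading1">.

```php
<?php
// Sample input (made up): a heading span with a nested span, then an ordinary span.
$htmlFile = '<span class="heading1">A <span>b</span> c</span> <span>d</span>';

// The pattern from above. The greedy .* extends to the last </span> on the line.
$greedy = preg_replace(
    '/<span class="heading1">(.*(.*<span.*>.*<\/span>.*)*.*)<\/span>/',
    '<h1 class="heading1">${1}</h1>',
    $htmlFile
);
echo $greedy, PHP_EOL;
// <h1 class="heading1">A <span>b</span> c</span> <span>d</h1>

// Recursive sketch. Group 1 is the heading content, built from:
//   [^<]++                      text with no "<" in it
//   <(?!/?span\b)               a "<" that does not start a span open/close tag
//   <span\b[^>]*>(?1)</span>    a nested span whose content is matched by recursing into group 1
$recursive = preg_replace(
    '~<span class="heading1">((?:[^<]++|<(?!/?span\b)|<span\b[^>]*>(?1)</span>)*+)</span>~',
    '<h1 class="heading1">${1}</h1>',
    $htmlFile
);
echo $recursive, PHP_EOL;
// <h1 class="heading1">A <span>b</span> c</h1> <span>d</span>
```

Even the recursive version will trip over unbalanced tags, different attribute order, or single-quoted attributes, which is the usual argument against doing this with regex at all.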
You're still probably better off using a DOM parser though.
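For comparison, here is a rough sketch of the DOM route using PHP's bundled DOMDocument and DOMXPath classes. The function name convertHeadingSpans, its parameters, and the wrapper id "fragment-root" are all made up for this example. It assumes the input is an HTML fragment rather than a full document, and that the class name contains no quote characters. DOM nodes cannot be renamed in place, so each matching span is replaced by a new element that takes over its attributes and children.

```php
<?php
// Hypothetical helper: the name and parameters are illustrative, not from the question.
function convertHeadingSpans(string $html, string $class = 'heading1', string $newTag = 'h1'): string
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // The XML declaration is a commonly used hint so libxml reads the input as UTF-8.
    // The wrapper div gives the fragment a single root that can be read back later.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="fragment-root">' . $html . '</div>');

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match spans whose class attribute contains $class as a whole word.
    $query = sprintf('//span[contains(concat(" ", normalize-space(@class), " "), " %s ")]', $class);

    foreach (iterator_to_array($xpath->query($query)) as $span) {
        $heading = $doc->createElement($newTag);
        // Copy every attribute (class, id, ...) to the new element.
        foreach ($span->attributes as $attr) {
            $heading->setAttribute($attr->nodeName, $attr->nodeValue);
        }
        // appendChild moves the node, so firstChild advances until the span is empty.
        while ($span->firstChild !== null) {
            $heading->appendChild($span->firstChild);
        }
        $span->parentNode->replaceChild($heading, $span);
    }

    // Serialize only the wrapper's children, dropping the wrapper itself.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo convertHeadingSpans('<span class="heading1">Title <span>inner</span> text</span> <span class="note">keep</span>'), PHP_EOL;
```

Unlike the regex versions, this does not care about attribute order, extra classes, or line breaks inside the span, and nested spans are left alone because only elements with the matching class are replaced. The parser may still normalize the markup a little when it writes it back out, so it is worth diffing the output against the input on the first run.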