开发者

MySQL: find all records with empty anchor tags in a field (regex) and remove them

I have a table which has a column containing html code. I need to find out all records that have empty anchor tags in them, ignoring child tags. Afterwards I want to remove all those empty tags.

To clarify things - these anchors should be found, for example:

<a href="http://www.example.com" target="_blank"></a>
<a href="http://www.example.com"> </a>
<a href="http://www.example.com"><span><strong></strong></span></a>
<a href="http://www.example.com"><span></span></a>

Whereas these should not be found:

<a href="http://www.example.com">something&开发者_C百科lt;/a>
<a href="http://www.example.com"><span>some text</span></a>

I managed to get all empty anchors (like in the first example) with this query:

select * from table_a where html regexp '<a.*href=".*".*></a>'

But this does not find the 2nd and 3rd example. Can someone help?

UPDATE: I want to remove those empty tags from the database. I fiddled around with MySQLs UpdateXML but have not found a solution yet.


I think the best way is to use xml functions in order to extract needed text nodes from xml document.

update table_a 
set html = UpdateXML(html, '//a', '')
where ExtractValue(html, '//a//text()') regexp '^[[:space:]]*$'

Note: if the table has a field with several a tags where some of them are empty and others are non-empty then this field will not be updated.


an extra .* at the end:

<a.*href=".*".*>.*</a>
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜