Extract all images from HTML whose width or height is greater than a specified value - Regex

I'm trying to make a small link share function with Classic ASP, like LinkedIn or Facebook.

What I need to do is get the HTML of a remote URL and extract all the images whose width is greater than 50px, for example.

I can fetch the HTML, and I can find the images with this regex:

<img([^<>+]*)>

It matches: <img src="/images/icon.jpg" width="60" height="90" style="display:none"/>

Then I'm able to extract the path, but sometimes it also matches <img src="/track.php" style="display:none" width="1" height="1"/>, which is not a real image.

Anyway, I feel like you are gonna be mad because of classic ASP but my company ....

I know there are lots of topics about this issue and they mostly recommend not using regex, but I couldn't find a way to do this with Classic ASP. Is there a component or something that can do this?

Regards


This will get you close:

<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>

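It accepts image tags with a width of 50 or greater: [5-9]\d covers 50 to 99, and 0?[1-9]\d{2,} covers 100 and up (with an optional leading zero). The width has to be written as a double-quoted attribute with no unit, e.g. width="60".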
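
To see which tags the pattern above keeps, here is a minimal sketch in Python. Python is used only to exercise the pattern, not because it is the asker's platform; the pattern would still have to be ported to whatever regex engine is available in Classic ASP. The two sample tags from the question are reused, and the `/images/banner.jpg` tag is a made-up example.

```python
import re

# Pattern from the answer: width="50".."99", or width="100" and above.
WIDTH_50_PLUS = re.compile(r'<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>')

# Two tags from the question plus one hypothetical banner tag.
html = (
    '<img src="/images/icon.jpg" width="60" height="90" style="display:none"/>'
    '<img src="/track.php" style="display:none" width="1" height="1"/>'
    '<img src="/images/banner.jpg" width="468" height="60"/>'
)

# Group 1 holds the width value of each accepted tag.
widths = [m.group(1) for m in WIDTH_50_PLUS.finditer(html)]
tags = [m.group(0) for m in WIDTH_50_PLUS.finditer(html)]

print(widths)
```

The 1x1 tracking image is skipped because its width does not fit either branch of the alternation.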

Edit: to also accept tags that have no width attribute at all:

<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>|<img ((?!width=)[^>])*>
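
A small Python sketch of how the combined pattern behaves, again only as an illustration and not as Classic ASP code. The helper name `candidate_tags` and the `/images/logo.png` tag are assumptions made for the example; the other two tags come from the question.

```python
import re

# Combined pattern from the answer: either a width of 50 or more,
# or an <img> tag that contains no width= attribute at all.
COMBINED = re.compile(
    r'<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>'
    r'|<img ((?!width=)[^>])*>'
)

def candidate_tags(html):
    """Return the full text of every <img> tag the combined pattern accepts."""
    return [m.group(0) for m in COMBINED.finditer(html)]

html = (
    '<img src="/images/logo.png">'  # hypothetical tag with no width attribute
    '<img src="/track.php" style="display:none" width="1" height="1"/>'
    '<img src="/images/icon.jpg" width="60" height="90" style="display:none"/>'
)

for tag in candidate_tags(html):
    print(tag)
```

The second branch uses a negative lookahead at every character, so a tag that contains `width=` anywhere can only be accepted through the first branch. That is why the tracking pixel with width="1" is still rejected.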