Extract all images from HTML whose width or height is higher than a specified value - Regex
I'm trying to make a small link share function with Classic ASP, like LinkedIn or Facebook.
What I need to do is get the HTML of a remote URL and extract all the images whose width is greater than 50px, for example.
I can crawl and take the HTML and also I can find the images with this regex:
<img([^<>+]*)>
It matches: <img src="/images/icon.jpg" width="60" height="90" style="display:none"/>
Then I'm able to extract the path but sometimes it matches <img src="/track.php" style="display:none" width="1" height="1"/>
which is not a real image.
Anyway, I feel like you are gonna be mad because of classic ASP but my company ....
I know there are lots of topics about this issue, and mostly they recommend not using regex, but I couldn't find a way to do this with Classic ASP. Is there a component or something that does this?
Regards
This will get you close:
<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>
It accepts image tags with a double-quoted width attribute of 50 or greater. It does not look at the height.
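To see what that pattern does with the two tags from the question, here is a minimal sketch. It is in Python rather than Classic ASP, purely for illustration; the pattern itself is unchanged, and the variable names (WIDE_IMG, html, wide_tags) are my own.

```python
import re

# Pattern from the answer: width="NN" where NN is 50-99,
# or any value with three or more digits.
WIDE_IMG = re.compile(r'<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>')

# The two sample tags quoted in the question.
html = (
    '<img src="/images/icon.jpg" width="60" height="90" style="display:none"/>'
    '<img src="/track.php" style="display:none" width="1" height="1"/>'
)

# finditer yields one match per accepted tag; group(0) is the whole tag.
wide_tags = [m.group(0) for m in WIDE_IMG.finditer(html)]
print(wide_tags)
```

Only the icon.jpg tag should come back; the 1x1 tracking pixel is skipped because its width has a single digit.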
Edit: to also accept tags with unspecified widths:
<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>|<img ((?!width=)[^>])*>
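A rough sketch of how the combined pattern could be used to pull out the image paths, again in Python only for illustration. The helper candidate_sources, the SRC pattern (which assumes a double-quoted src attribute), and the banner.png tag are hypothetical and not from the question.

```python
import re

# Combined pattern from the edit: width of 50 or more, or no width attribute.
WIDE_OR_UNSIZED = re.compile(
    r'<img [^>]*width="(0?[1-9]\d{2,}|[5-9]\d)"[^>]*>'
    r'|<img ((?!width=)[^>])*>'
)
# Assumption: src is written with double quotes.
SRC = re.compile(r'src="([^"]*)"')

def candidate_sources(html):
    """Return the src of every img tag the combined pattern accepts."""
    sources = []
    for tag in WIDE_OR_UNSIZED.finditer(html):
        src = SRC.search(tag.group(0))
        if src:
            sources.append(src.group(1))
    return sources

sample = (
    '<img src="/images/icon.jpg" width="60" height="90"/>'
    '<img src="/track.php" width="1" height="1"/>'
    '<img src="/photos/banner.png" alt="banner"/>'  # hypothetical unsized image
)
print(candidate_sources(sample))
```

The tracking pixel is still rejected, because it declares a width below 50, while the tag with no width attribute is kept as a candidate whose real size would have to be checked some other way.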