Extract image links from an HTML text string
I want to extract all the image links from the HTML below so I can use the images freely. How can I do this in ASP.NET C#?
<div>
<img src="/upload/Tom_Cruise-242x300.jpg" alt="Tom_Cruise-242x300.jpg" align="left" border="0" height="300" width="242">
sample text sample text sample text sample text
<img src="http://www.sharicons.com/images/rss_icon.jpg" alt="Icon" align="left" border="0" height="100" width="100">
sample text sample text sample text sample text sample text sample text sample text sample text</div>
I got the solution:
string ProcessedText = Regex.Replace(sb.ToString(), "^<img[^>]*>", string.Empty);
You can use the HTML Agility Pack to parse the HTML and query it using XPath syntax (like XmlDocument).
I would use the HTML Agility Pack.
Then you can do something like this:
HtmlNodeCollection allImages = doc.DocumentNode.SelectNodes("//img[@src]");
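To show how that one line might fit into a complete method, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced; the class name ImageLinks and the method name FromHtml are made up for illustration and are not part of the library.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class ImageLinks // hypothetical helper class
{
    // Returns the src value of every <img> element that has a src attribute.
    public static List<string> FromHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();
        HtmlNodeCollection allImages = doc.DocumentNode.SelectNodes("//img[@src]");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (allImages == null)
        {
            return links;
        }

        foreach (HtmlNode img in allImages)
        {
            links.Add(img.GetAttributeValue("src", string.Empty));
        }
        return links;
    }
}
```

Calling ImageLinks.FromHtml on the HTML from the question should give back the two src values in document order. Relative paths such as /upload/... are returned as written, so they would still need to be combined with the site's base URL before use.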
One easy way to do this is to put the string into a string called myString, then run the following code:
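List<string> imagePaths = new List<string>();
// Assumes every tag is written exactly as <img src="..." (src first, double quotes).
while (myString.IndexOf("img src=") >= 0)
{
    // Skip past img src=" (9 characters) to reach the start of the URL.
    myString = myString.Substring(myString.IndexOf("img src=") + 9);
    // The URL runs up to the next double quote.
    imagePaths.Add(myString.Substring(0, myString.IndexOf("\"")));
}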
The List imagePaths will now contain all the image links.
You can use the HTML Agility Pack; your second option is regular expressions :)
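For the regular-expression route, a rough sketch might look like the following. The class name ImageLinkExtractor and the pattern are my own assumptions, not something from the answers above. A regex like this copes with simple markup such as the sample in the question, but it is not a real HTML parser and can be fooled by unquoted attributes, comments, or scripts.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageLinkExtractor // hypothetical helper class
{
    // Captures the quoted value of a src attribute inside an <img ...> tag.
    // \s before src keeps attributes such as data-src from matching.
    private static readonly Regex ImgSrc = new Regex(
        @"<img\b[^>]*?\ssrc\s*=\s*[""']([^""']*)[""']",
        RegexOptions.IgnoreCase);

    public static List<string> Extract(string html)
    {
        var links = new List<string>();
        foreach (Match m in ImgSrc.Matches(html))
        {
            links.Add(m.Groups[1].Value); // group 1 is the URL without quotes
        }
        return links;
    }
}
```

Unlike the IndexOf loop, this does not require src to be the first attribute, and it accepts either single or double quotes around the value.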