Get main article image with PHP
I'd like to get the main image for an article, much like Facebook does when you post a link (but without the step where you choose an image). The data we have to work with is the whole page's HTML as a variable. The page and URL will be different every time this function runs.
Are there any libraries or classes that are particularly good at getting the main body of content, much like Instapaper, that would be of any help?
You can use PHP's DOM classes to parse an HTML page. That makes it easy to grab the first image and the h1 text.
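As a rough sketch of that idea, something like the following could work. The function name `firstImageAndHeading` and the shape of the returned array are made up for illustration; only `DOMDocument` and the libxml error functions come from PHP itself. It assumes the page's HTML is already in a string, and it treats "main image" as simply the first `img` in the document, which will not always be the right one.

```php
<?php
// Hypothetical helper (not part of PHP): returns the src of the first <img>
// and the text of the first <h1>, or null for whichever is missing.
function firstImageAndHeading(string $html): array
{
    $result = ['image' => null, 'title' => null];

    // loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return $result;
    }

    $doc = new DOMDocument();

    // Real-world pages are rarely valid HTML; buffer the parser
    // warnings instead of letting them be printed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // item(0) is the first match in document order, or null if there is none.
    $img = $doc->getElementsByTagName('img')->item(0);
    if ($img !== null && $img->hasAttribute('src')) {
        $result['image'] = $img->getAttribute('src');
    }

    $h1 = $doc->getElementsByTagName('h1')->item(0);
    if ($h1 !== null) {
        $result['title'] = trim($h1->textContent);
    }

    return $result;
}
```

Note that the `src` comes back exactly as written in the page, so a relative path would still need to be resolved against the page URL.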
You could also get more advanced with it, such as cycling through the p tags to find the first one with more than X characters and using that as the main text. Or you could read the meta tags and take the description.
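Here is a minimal sketch of both of those ideas. The function name `mainTextCandidates`, the `$minLength` parameter, and its default of 100 are assumptions chosen for illustration, not anything standard; tune the threshold for the pages you actually see.

```php
<?php
// Hypothetical helper (not part of PHP): returns the meta description and the
// first <p> whose text is longer than $minLength, or null for either if absent.
function mainTextCandidates(string $html, int $minLength = 100): array
{
    $result = ['description' => null, 'paragraph' => null];

    // loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return $result;
    }

    $doc = new DOMDocument();

    // Buffer parser warnings from messy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Look for <meta name="description" content="...">.
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if (strtolower($meta->getAttribute('name')) === 'description') {
            $result['description'] = trim($meta->getAttribute('content'));
            break;
        }
    }

    // Take the first paragraph that is long enough to look like body text.
    // strlen() counts bytes, so multi-byte text reaches the threshold sooner.
    foreach ($doc->getElementsByTagName('p') as $p) {
        $text = trim($p->textContent);
        if (strlen($text) > $minLength) {
            $result['paragraph'] = $text;
            break;
        }
    }

    return $result;
}
```

If the description is present it is usually the cheaper and cleaner choice, with the long-paragraph search as a fallback.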
There are about a million different ways you could go with this, but PHP DOM is probably what you are looking for initially.
http://us.php.net/manual/en/book.dom.php