Extract information from a link, like the Facebook wall

I have been developing an application that publishes content as a kind of feed. I want users to be able to add a link as content (like on the Facebook wall).

I then want some logic that parses the link destination and extracts the right text and image to create a thumbnail preview.

Just like Facebook does it when you are posting a link on your wall.

The extraction/crawling engine seems quite complex, but what would be the best way to approach this?

I have thought about going through the Facebook API, posting the link and then reading the item back from Facebook, and in that way simply using their engine, but I would really like to do this as an internal system.


AFAIK Facebook does this by using meta tags (Open Graph Protocol). You can study more at: https://developers.facebook.com/docs/opengraph/.

Basically, you should define a convention if you want to implement this internally.
Hope this helps.
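
To make the meta tag approach concrete, here is a minimal sketch in Python that reads Open Graph tags from markup you have already downloaded. The class name `OpenGraphParser` and the function `extract_open_graph` are hypothetical names chosen for illustration, and the sketch assumes the page exposes its metadata as `<meta property="og:..." content="...">` tags; pages without them need some other fallback.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <meta ... /> tags.
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property, e.g. "title", "image".
            self.properties.setdefault(prop[3:], content)


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties
```

A feed item could then use the `title`, `description` and `image` keys when they are present, and treat a missing key as a signal to fall back to other heuristics.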


I think what the Facebook infrastructure does is pull the content of the page (with an AJAX call) and then take the first paragraph (if it's a web page; the description if it's a YouTube video, etc.), and it lets the user pick one of the images on the page as a thumbnail. You can just pick the first image in the markup or design your own logic. Basically, I would go about this like designing a temporary caching engine: you get the page markup and images, use them, and then discard them.
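
The first-paragraph and first-image heuristic described above can be sketched as follows. This is only an illustration under stated assumptions: the markup has already been fetched into a string, `PreviewParser` and `build_preview` are hypothetical names, and "first non-empty `<p>`" plus "first `<img>` in the markup" are the simplest possible rules, not what Facebook actually does.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PreviewParser(HTMLParser):
    """Grabs the first non-empty <p> text and all <img> URLs (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.first_paragraph = None
        self.images = []
        self._in_p = False
        self._buffer = []

    def _finish_paragraph(self):
        # Collapse whitespace; ignore paragraphs that contain no text.
        text = " ".join("".join(self._buffer).split())
        if text and self.first_paragraph is None:
            self.first_paragraph = text
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            if self._in_p:
                # Previous <p> was never closed; treat this as its end.
                self._finish_paragraph()
            if self.first_paragraph is None:
                self._in_p = True
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative image paths against the page URL.
                self.images.append(urljoin(self.base_url, src))

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self._finish_paragraph()


def build_preview(html, base_url):
    """Return description, default thumbnail and all thumbnail candidates."""
    parser = PreviewParser(base_url)
    parser.feed(html)
    parser.close()
    return {
        "description": parser.first_paragraph,
        "image": parser.images[0] if parser.images else None,
        "image_candidates": parser.images,
    }
```

Returning every candidate in `image_candidates` leaves room for letting the user choose the thumbnail, as the answer suggests, while `image` gives a default. Nothing is stored beyond the returned dict, which matches the fetch, use and discard idea.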
