Get text from a URL

I want to get the plain text (the text that is eventually shown to the user) from a URL. I know how to fetch the full contents, but what I get includes all the HTML markup, hidden elements, and so on.

I just want the plain text, without layout: not simply the content with all HTML tags stripped, but the page parsed and then reduced to its text. I need it first for comparison with other text and second to display it.

Is there any easy way to do this? (any existing code?)


Use the DOM.

First, load the resource into a WebView. You don't need to put it into a window.

Then, after it finishes loading, ask for the view's mainFrameDocument, then ask the document for its documentElement, then ask that for its textContent.
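The steps above could be sketched roughly as follows. This is only a sketch, assuming the legacy macOS WebView class (now deprecated, so expect deprecation warnings) and an SDK where WebFrameLoadDelegate is a formal protocol; the class name PageTextLoader and the 800x600 frame size are hypothetical choices for illustration.

```objc
#import <Cocoa/Cocoa.h>
#import <WebKit/WebKit.h>

// Hypothetical helper (not from the original answer): loads a URL in an
// offscreen WebView and logs the document's text once loading finishes.
@interface PageTextLoader : NSObject <WebFrameLoadDelegate>
@property (strong) WebView *webView;
- (void)loadURL:(NSURL *)url;
@end

@implementation PageTextLoader

- (void)loadURL:(NSURL *)url {
    // The WebView is never put into a window; it only needs a frame rect.
    self.webView = [[WebView alloc] initWithFrame:NSMakeRect(0, 0, 800, 600)
                                        frameName:nil
                                        groupName:nil];
    [self.webView setFrameLoadDelegate:self];
    [[self.webView mainFrame] loadRequest:[NSURLRequest requestWithURL:url]];
}

- (void)webView:(WebView *)sender didFinishLoadForFrame:(WebFrame *)frame {
    // Subframes trigger this callback too; only act on the main frame.
    if (frame != [sender mainFrame]) {
        return;
    }
    DOMDocument *document = [sender mainFrameDocument];
    DOMElement *root = [document documentElement];
    NSString *text = [root textContent];
    NSLog(@"%@", text);
}

@end
```

The loader has to be kept alive (for example in a strong property of your app delegate) and needs a running run loop, since loading is asynchronous. Note that textContent concatenates every text node under the element, so the text inside script and style elements is included as well.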


You can use Readability to extract the main content. I don't know whether there is an Objective-C version, but you can run the JavaScript one with [yourWebView stringByEvaluatingJavaScriptFromString:@"readability_js_code"].

If you are retrieving the page's HTML without a UIWebView (for example with ASIHTTPRequest or custom code), try parsing it with an XML parser such as NSXMLParser. Keep in mind that NSXMLParser expects well-formed XML, so it will fail on much real-world HTML.
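As a rough illustration of the NSXMLParser route, the sketch below collects the character data the parser reports and ignores the tags. The TextCollector class and the sample markup are hypothetical, and the input is deliberately well-formed; this is not a general HTML-to-text solution.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: appends every run of character data to one string.
@interface TextCollector : NSObject <NSXMLParserDelegate>
@property (strong) NSMutableString *text;
@end

@implementation TextCollector

- (instancetype)init {
    if ((self = [super init])) {
        _text = [NSMutableString string];
    }
    return self;
}

// Called for text between tags; may be called several times per element.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}

@end

int main(void) {
    @autoreleasepool {
        // Sample input, assumed to be well-formed (XHTML-like) markup.
        NSString *markup =
            @"<html><body><h1>Title</h1><p>Hello, <b>world</b>.</p></body></html>";
        NSData *data = [markup dataUsingEncoding:NSUTF8StringEncoding];

        TextCollector *collector = [[TextCollector alloc] init];
        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
        parser.delegate = collector;

        if ([parser parse]) {
            NSLog(@"%@", collector.text);
        } else {
            NSLog(@"Parse failed: %@", parser.parserError);
        }
    }
    return 0;
}
```

Because only character data is kept, no whitespace is inserted between block elements; a fuller version would track element names in parser:didStartElement:namespaceURI:qualifiedName:attributes: to add line breaks and to skip script and style content.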

Hope this helps :)
