Get text from a URL
I want to get the plain text (the text that is eventually shown to the user) from a URL. I know how to fetch all the contents, but what I get includes all the HTML markup, hidden content, and so on.
I just want the plain text, without layout. Not simply the content with all HTML tags stripped, but the parsed page, minus the layout. I need it first for comparison with other text and second for display. Is there an easy way to do this? (Any existing code?)
Use the DOM.
First, load the resource into a WebView. You don't need to put it into a window.
Then, after it finishes loading, ask the view for its mainFrameDocument, then ask the document for its documentElement, then ask that for its textContent.
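The steps above might look roughly like this. It's only a sketch, assuming the legacy WebView class from WebKit on the Mac and a running run loop; the class name TextFetcher and the method loadURL: are made up for illustration.

```objc
#import <Foundation/Foundation.h>
#import <WebKit/WebKit.h>

// Hypothetical helper: loads a URL offscreen and logs the document's text.
@interface TextFetcher : NSObject <WebFrameLoadDelegate>
@property (strong) WebView *webView;
- (void)loadURL:(NSURL *)url;
@end

@implementation TextFetcher

- (void)loadURL:(NSURL *)url {
    // The view is never added to a window; it only needs to exist.
    self.webView = [[WebView alloc] initWithFrame:NSZeroRect];
    self.webView.frameLoadDelegate = self;
    [[self.webView mainFrame] loadRequest:[NSURLRequest requestWithURL:url]];
}

- (void)webView:(WebView *)sender didFinishLoadForFrame:(WebFrame *)frame {
    // Subframes also report completion; only act on the main frame.
    if (frame != [sender mainFrame]) {
        return;
    }
    DOMDocument *document = [sender mainFrameDocument];
    DOMElement *root = [document documentElement];
    NSString *text = [root textContent];
    NSLog(@"%@", text);
}

@end
```

Keep a strong reference to the TextFetcher instance until the load finishes, otherwise the delegate callback never fires. Note that textContent concatenates every text node under the element, so the contents of script and style elements come along too; if that is a problem, the innerText of the body element may be closer to what the user actually sees.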
You can use Readability to extract the content.
I do not know whether there is an Obj-C version, but you can use the JavaScript one with [yourWebView stringByEvaluatingJavaScriptFromString:@"readability_js_code"].
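To show just the mechanism, here is a small sketch that evaluates a one-line JavaScript expression in a UIWebView that has already finished loading. It does not include Readability itself; you would pass its script source in the same way. The function name VisibleTextFromWebView is hypothetical.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper. Assumes `webView` has finished loading its page
// (for example, call this from webViewDidFinishLoad:).
static NSString *VisibleTextFromWebView(UIWebView *webView) {
    // innerText returns the rendered text of the body, without markup.
    // Swap in the Readability script source here to use that instead.
    NSString *script = @"document.body.innerText";
    return [webView stringByEvaluatingJavaScriptFromString:script];
}
```

The call is synchronous and returns the result of the script as a string, so whatever script you inject should end in an expression that evaluates to the text you want.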
If you are retrieving the page's HTML without a UIWebView (via ASIHTTP or custom code), try parsing it with an XML parser (NSXMLParser, for example).
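A rough sketch of the NSXMLParser route, assuming the markup is well-formed XHTML (NSXMLParser stops with an error on typical sloppy HTML). The class name TextCollector and the sample markup are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that gathers character data and ignores
// anything inside <script> or <style>.
@interface TextCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *text;
@property (nonatomic) NSInteger skipDepth;
@end

@implementation TextCollector

- (instancetype)init {
    if ((self = [super init])) {
        _text = [NSMutableString string];
    }
    return self;
}

- (BOOL)isSkippedElement:(NSString *)name {
    NSString *lower = [name lowercaseString];
    return [lower isEqualToString:@"script"] || [lower isEqualToString:@"style"];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    if ([self isSkippedElement:elementName]) {
        self.skipDepth += 1;
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([self isSkippedElement:elementName]) {
        self.skipDepth -= 1;
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // Only keep text that is outside the skipped elements.
    if (self.skipDepth == 0) {
        [self.text appendString:string];
    }
}

@end

int main(void) {
    @autoreleasepool {
        // Sample input; in practice this is the data you downloaded.
        NSString *xhtml = @"<html><head><style>p { color: red; }</style></head>"
                          @"<body><p>Hello <b>world</b></p></body></html>";
        NSData *data = [xhtml dataUsingEncoding:NSUTF8StringEncoding];

        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
        TextCollector *collector = [[TextCollector alloc] init];
        parser.delegate = collector;

        if ([parser parse]) {
            NSLog(@"%@", collector.text);
        } else {
            NSLog(@"Parse failed: %@", parser.parserError);
        }
    }
    return 0;
}
```

This only strips markup; it knows nothing about CSS, so text hidden by styles still ends up in the result.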
Hope this helps :)