Get String Instead of Source - Xcode Cocoa
I have a program that fetches the contents of a web page and displays it in a text box. The problem is that it shows the HTML source. For example, if my HTML code was:
<html>
<body>
<p>Hello</p>
</body>
</html>
instead of just showing "Hello", it shows the code above.
How can I get my Objective-C program to read just the "Hello" and not the HTML source? I assumed the problem was the encoding used when reading the website, but I might be wrong.
I would greatly appreciate it if someone could give me a reasonable answer.
Best Regards,
Kevin
If you want to display a web page, use WebKit. If you want to strip XML tags, use NSXMLParser. Some HTML is valid XML, but not all of it. HTML is just text unless you use something designed to parse it.
As far as I know, there is nothing built into Cocoa to do this. You would have to implement your own HTML parser to read the code and output the text. I would either search for existing implementations online and adapt them for Cocoa, which would give you a lot of experience with the language, or learn some regular expressions by trial and error. This particular library is for Java, but it should be an easy port to Cocoa/C: http://htmlparser.sourceforge.net/
Apparently you can also 'tidy up' the HTML and then use an XML parser: http://tidy.sourceforge.net/ There is, however, an XML parser (HTML is not strictly a subset of XML, but tidied HTML can be parsed as XML), and you could use it to get the information that you want: http://expatobjc.sourceforge.net/
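For what it's worth, here is a minimal sketch of the "write your own" route in plain C, which compiles as-is inside an Objective-C file. It is not a real HTML parser: it simply drops everything between '<' and '>', and it knows nothing about entities like &amp;amp;, comments, or script blocks. The function name strip_tags_naive is made up for this example and is not a Cocoa API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Cocoa): returns a newly allocated copy
 * of `html` with everything between '<' and '>' removed.
 * Returns NULL if allocation fails; the caller must free() the result. */
char *strip_tags_naive(const char *html)
{
    assert(html != NULL);

    /* The output can never be longer than the input. */
    char *out = malloc(strlen(html) + 1);
    if (out == NULL) {
        return NULL;
    }

    size_t n = 0;    /* number of characters written so far */
    int in_tag = 0;  /* 1 while we are between '<' and '>' */

    for (const char *p = html; *p != '\0'; ++p) {
        if (*p == '<') {
            in_tag = 1;          /* a tag starts: stop copying */
        } else if (*p == '>' && in_tag) {
            in_tag = 0;          /* the tag ends: resume copying */
        } else if (!in_tag) {
            out[n++] = *p;       /* ordinary text: keep it */
        }
    }
    out[n] = '\0';
    return out;
}
```

Whitespace and newlines between tags are kept, so you would probably want to trim the result. On the Objective-C side, the input can come from -[NSString UTF8String] and the result can go back through +[NSString stringWithUTF8String:].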
If it were me, I would write a script on a web server in, say, PHP that handles parsing the text out of a web page. PHP has a bunch of built-in functions, like strip_tags(), that handle removing HTML tags from a string.
So all the heavy lifting would be done in the PHP script. Then your iPhone app (assuming it's for iPhone, per your tags) would just POST the URL you want to parse to your PHP script, which returns the text to you.
Just use a regex to strip the tags. Do a Google search and you can find the answer.
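A rough sketch of the regex idea, assuming the POSIX regex.h functions are available (they are on macOS and Linux). The pattern <[^>]*> and the function name strip_tags_regex are my own choices for illustration, not something taken from a library. In Cocoa, NSRegularExpression would probably be the more natural tool, but the idea is the same. Like any regex approach, this only handles simple markup.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: removes every match of the pattern <[^>]*> from
 * `html` and returns the rest as a newly allocated string.
 * Returns NULL on failure; the caller must free() the result. */
char *strip_tags_regex(const char *html)
{
    assert(html != NULL);

    regex_t re;
    if (regcomp(&re, "<[^>]*>", REG_EXTENDED) != 0) {
        return NULL;
    }

    char *out = malloc(strlen(html) + 1);  /* output is never longer than input */
    if (out == NULL) {
        regfree(&re);
        return NULL;
    }

    size_t n = 0;
    const char *p = html;
    regmatch_t m;

    /* Copy the text before each tag, then skip past the tag itself. */
    while (regexec(&re, p, 1, &m, 0) == 0) {
        memcpy(out + n, p, (size_t)m.rm_so);
        n += (size_t)m.rm_so;
        p += m.rm_eo;
    }
    strcpy(out + n, p);  /* whatever follows the last tag */

    regfree(&re);
    return out;
}
```

A '<' that is never closed is left in the output untouched, since the pattern only matches complete <...> pairs.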