Objective-C: Stripping HTML attributes from a string

There are a lot of answers for stripping HTML tags from a string, but I'd like to strip only a specific attribute: style. The HTML that I'm dealing with has some seriously nasty inline styles, and often looks something like this:

<p class="someclass" style="margin-left:2cm;text-indent:-36.0pt">Blah.</p>

In order to adjust the display for my application, I need to strip that style attribute. Is there a fast way to process the document to do this? It needs to work in iOS.

Thanks!


Use an XSLT transformation. See http://developer.apple.com/library/mac/documentation/cocoa/Conceptual/NSXML_Concepts/Articles/WritingXML.html#//apple_ref/doc/uid/TP40001256-112639


Ultimately, I went with a combination of ElementParser and regular expressions (using RegExKitLite), and stripping out the tags I didn't want and replacing them with ones I did, as required. Given that my HTML is coming from a trusted source, this should be fine.

It's far from ideal, but it's working. :-)
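
For the original question (removing only the style attribute), the regular-expression part of this approach might look roughly like the sketch below. It uses Foundation's NSRegularExpression rather than RegExKitLite, and the helper name StripStyleAttributes is hypothetical, not something from this thread. It assumes the attribute value is always quoted and that the HTML comes from a trusted source. A regex is not a real HTML parser, so text that merely looks like a style attribute outside a tag would be removed as well.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the thread): removes every quoted
// style="..." or style='...' attribute from an HTML string.
static NSString *StripStyleAttributes(NSString *html) {
    // Leading whitespace, the word "style", optional whitespace around "=",
    // then a double-quoted or single-quoted value.
    NSString *pattern = @"\\s+style\\s*=\\s*(\"[^\"]*\"|'[^']*')";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; return the input unchanged.
        return html;
    }
    return [regex stringByReplacingMatchesInString:html
                                           options:0
                                             range:NSMakeRange(0, html.length)
                                      withTemplate:@""];
}
```

Passing the sample paragraph from the question through StripStyleAttributes should give <p class="someclass">Blah.</p>, with the class attribute left in place.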


Probably the simplest approach (though also quite CPU-intensive) is to use NSAttributedString+HTML to turn the HTML into an NSAttributedString. You can then get the plain NSString from that.

Something like this:

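  // Convert the HTML string to data, then let NSAttributedString+HTML parse it.
  NSData *htmlData = [htmlString dataUsingEncoding:NSASCIIStringEncoding allowLossyConversion:YES];
  NSAttributedString *attrstring = [NSAttributedString attributedStringWithHTML:htmlData options:nil];

  // Access the string itself like this.
  NSString *plainText = [attrstring string];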

[Warning: although this is the easiest way for you, it might not be the best way. The conversion is quite expensive and will block your UI if it is done on the main thread.]
