Removing multiple tags in SGML

I have an SGML file like this:

<p><p><data>sdlksdskdmskdmsamdakmdksam<p></data>...

My question is how to remove one <p> tag and keep the other one intact. Which regular expression would be suitable?


If your SGML is such that it can be processed as XML, then XProc is a good technology for this kind of thing, with a single step such as:

<p:unwrap match="p[parent::p]"/>

(This assumes you want to unwrap self-nested p elements until no p directly wraps another p.)
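If you don't have an XProc processor handy, the same unwrap can be sketched in Python with the standard-library xml.dom.minidom. This is only a rough equivalent of the step above, not the XProc implementation: the helper name unwrap_nested is made up for illustration, and the input is assumed to be a well-formed XML variant of the question's sample (real SGML with omitted end tags would need normalizing first).

```python
from xml.dom import minidom


def unwrap_nested(doc, tag="p"):
    """Replace every <tag> whose parent is also <tag> with its own children.

    Hypothetical helper, roughly mirroring match="p[parent::p]".
    """
    # Copy the node list up front because the tree is modified in the loop.
    for node in list(doc.getElementsByTagName(tag)):
        parent = node.parentNode
        if parent.nodeType == parent.ELEMENT_NODE and parent.tagName == tag:
            # Move the children up in front of the node, keeping their order,
            # then drop the now-empty wrapper.
            while node.firstChild is not None:
                parent.insertBefore(node.firstChild, node)
            parent.removeChild(node)
    return doc


# Assumed well-formed version of the sample from the question.
doc = minidom.parseString("<p><p><data>text</data></p></p>")
print(unwrap_nested(doc).documentElement.toxml())
# prints: <p><data>text</data></p>
```

Because the nodes are visited in document order, deeper nesting such as <p><p><p>x</p></p></p> also collapses to a single p.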

You definitely do not want to process SGML/XML with regexps unless you are 100% certain you will be dealing with a subset which has a certain well-specified lexical form. Think for example how you'd process stuff with comments using a regexp:

<p><!-- <p> commented out--><foo><p/><p/></foo></p>

!!
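To make the comment problem concrete, here is a small Python sketch comparing a naive regexp with a real parser on that exact string. The pattern is just one plausible attempt at "find the p start tags", picked for illustration; it is not taken from the question.

```python
import re
from xml.dom import minidom

sample = "<p><!-- <p> commented out--><foo><p/><p/></foo></p>"

# Naive pattern: anything that looks like a <p ...> start tag.
regex_hits = re.findall(r"<p\b[^>]*>", sample)

# A parser knows the <p> inside the comment is not an element.
parsed_elements = minidom.parseString(sample).getElementsByTagName("p")

print(len(regex_hits))       # 4, including the <p> inside the comment
print(len(parsed_elements))  # 3, the outer p and the two empty p elements
```

The regexp has no idea it is inside a comment, so any replacement based on it would rewrite commented-out text as well.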
