Removing multiple tags in SGML
I have an SGML file like:
<p><p><data>sdlksdskdmskdmsamdakmdksam<p></data>...
My question is how to remove one <p> tag and keep the other one intact. Which regular expression would be suitable?
If your SGML can be processed as XML, then XProc is a good technology for this kind of thing, with a single step such as:
<p:unwrap match="p[parent::p]"/>
(This assumes you want to unwrap every p element nested directly inside another p, until no p wraps another p.)
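If XProc is not available, the same unwrapping idea can be sketched in Python with the standard-library xml.etree.ElementTree module. This is only an illustration under the same assumption that the input parses as well-formed XML (the sample in the question does not, as written). The helper names unwrap_nested, _splice and _append_text are made up for this example and are not part of any library.

```python
import xml.etree.ElementTree as ET


def _append_text(parent, index, text):
    """Attach text immediately before child position `index` of parent."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def _splice(parent, index, child):
    """Replace child (at position index) with its own text and children."""
    kids = list(child)
    parent.remove(child)  # removes by identity; the child's tail goes with it
    _append_text(parent, index, child.text)
    for offset, kid in enumerate(kids):
        parent.insert(index + offset, kid)
    _append_text(parent, index + len(kids), child.tail)


def unwrap_nested(root, tag="p"):
    """Unwrap every <tag> whose direct parent is also <tag>."""
    changed = True
    while changed:
        changed = False
        # Restart the walk after each change so we never mutate
        # the tree while still iterating over it.
        for parent in root.iter(tag):
            for index, child in enumerate(list(parent)):
                if child.tag == tag:
                    _splice(parent, index, child)
                    changed = True
                    break
            if changed:
                break
    return root


if __name__ == "__main__":
    root = ET.fromstring("<p><p><data>text</data></p></p>")
    print(ET.tostring(unwrap_nested(root), encoding="unicode"))
    # prints: <p><data>text</data></p>
```

The loop restarts after every splice, which is slow on large documents but keeps the sketch simple and safe.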
You definitely do not want to process SGML/XML with regular expressions unless you are 100% certain you will be dealing with a subset that has a well-specified lexical form. Think, for example, about how you would process content containing comments using a regexp:
<p><!-- <p> commented out--><foo><p/><p/></foo></p>
!!
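To make the problem concrete, here is a small Python comparison on that exact string. It is a sketch using only the standard library; the naive pattern `<p>` is just one example of what a quick regexp attempt might look like, not a recommendation.

```python
import re
import xml.etree.ElementTree as ET

doc = '<p><!-- <p> commented out--><foo><p/><p/></foo></p>'

# Naive regexp: matches the literal text "<p>" wherever it appears,
# including inside the comment, and misses the empty-element form "<p/>".
regex_hits = re.findall(r'<p>', doc)

# XML parser: comments are dropped by the default tree builder,
# and "<p/>" is recognised as a p element.
root = ET.fromstring(doc)
parsed_p = list(root.iter('p'))

print(len(regex_hits))  # 2 (one of them is inside the comment)
print(len(parsed_p))    # 3 (the outer p plus the two empty p elements)
```

The regexp both over-matches (the commented-out tag) and under-matches (the two empty elements), which is why a real parser is the safer tool here.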