Regex to remove HTML whitespace and leading whitespace in Notepad++

I have tried to use the following regular expression to remove the whitespace around HTML tags, including leading whitespace:

Find:   \s*([<>])\s*

Replace: $1

But each time I do this, I end up with 186 occurrences of a literal $1 in my document. Any assistance would be greatly appreciated.

Here is an example of what I am talking about

This

<fieldset id="prod_desc">
<p>Original AA </p>
<b>Features:</b> 
<ul>
  <li>2 pole rectangular dome tent with 13.4 sq ft of vestibule storage </li>
  <li>Durable, shockcorded, self-supporting fiberglass frame and ring and pin/pole pocket assembly </li>
  <li>2 side opening door panels are constructed entirely of no see-um mesh to maximize air flow inside </li>
  <li>Poke-out vent in side wall allows the option of additional ventilation when needed </li>
  <li>2 interior storage pockets keep essential items handy Specifications: </li>
  <li>Season: 3 </li>
  <li>Sleeps: 2 </li>
  <li>Doors: 2 </li>
  <li>Windows: 2 </li>
  <li>Weight: 5 lbs 12 oz </li>
  <li>Area: 36.5 Sq. Ft. </li>
  <li>Center Height: 3' 7.5&quot;</li>
</ul>
</fieldset> 

should become:

<fieldset id="prod_desc"><p>Original AA</p><b>Features:</b><ul><li>2 pole rectangular dome tent with 13.4 sq ft of vestibule storage</li><li>Durable, shockcorded, self-supporting fiberglass frame and ring and pin/pole pocket assembly</li><li>2 side opening door panels are constructed entirely of no see-um mesh to maximize air flow inside</li><li>Poke-out vent in side wall allows the option of additional ventilation when needed</li><li>2 interior storage pockets keep essential items handy Specifications:</li><li>Season: 3</li><li>Sleeps: 2</li><li>Doors: 2</li><li>Windows: 2</li><li>Weight: 5 lbs 12 oz</li><li>Area: 36.5 Sq. Ft.</li><li>Center Height: 3' 7.5&quot;</li></ul></fieldset>


Notepad++ does not support $1 for backreferences before version 6.0, which introduced PCRE support for find-and-replace. In older versions, use \1 for backreferences instead.

The pattern to search for is \s*(<[^>]+>)\s*, with $1 as the replacement. As of Notepad++ version 6.0, released in March 2012, this alone should work for you. I tried your original regex and, much to my surprise, it works as well.
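To check either pattern outside the editor, here is a minimal sketch in Python. It assumes that Python's re module treats these simple patterns the same way Notepad++ 6.0+ does, which is an assumption and not something verified here. The function names and the shortened sample are illustrative only.

```python
import re

# Shortened version of the sample from the question (illustrative only).
SAMPLE = (
    '<fieldset id="prod_desc">\n'
    "<p>Original AA </p>\n"
    "<b>Features:</b> \n"
    "<ul>\n"
    "  <li>Season: 3 </li>\n"
    "  <li>Sleeps: 2 </li>\n"
    "</ul>\n"
    "</fieldset> \n"
)


def collapse_around_brackets(html: str) -> str:
    """The asker's pattern: drop whitespace on either side of '<' or '>'."""
    return re.sub(r"\s*([<>])\s*", r"\1", html)


def collapse_around_tags(html: str) -> str:
    """The answer's pattern: drop whitespace on either side of a whole tag."""
    return re.sub(r"\s*(<[^>]+>)\s*", r"\1", html)


if __name__ == "__main__":
    print(collapse_around_brackets(SAMPLE))
    print(collapse_around_tags(SAMPLE))
```

Both functions put the captured group back with \1, which plays the role of $1 in the Notepad++ replace field. Whitespace inside a tag, such as the space before id=, is untouched because it is not adjacent to a bracket.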

Earlier versions cannot do multi-line regex replacements. To strip the newlines there, perform the regex replacement first, then run a second find-and-replace in Extended search mode. For UNIX line endings, find:

\n

For Windows line endings:

\r\n

In either case, replace with nothing.
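The two-step procedure for older versions can be sketched in Python as follows. This is only an approximation: the character class [ \t] stands in for a regex engine that cannot match across line boundaries, and the function names are hypothetical.

```python
import re


def trim_within_lines(text: str) -> str:
    """Step 1: remove spaces and tabs next to '<' or '>' without crossing lines."""
    # [ \t] never matches \r or \n, imitating a single-line regex replace.
    return re.sub(r"[ \t]*([<>])[ \t]*", r"\1", text)


def drop_line_endings(text: str, line_ending: str = "\n") -> str:
    """Step 2: the extended-mode replace, deleting every line ending."""
    return text.replace(line_ending, "")


if __name__ == "__main__":
    windows_text = "<ul>\r\n  <li>Doors: 2 </li>\r\n</ul> \r\n"
    step1 = trim_within_lines(windows_text)
    print(drop_line_endings(step1, "\r\n"))
```

Pass "\r\n" as line_ending for Windows files and leave the default "\n" for UNIX files, mirroring the two search strings above.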


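You could use the expression \s+<([^>]+)>\s+ and replace with <$1> (or <\1> in older versions of Notepad++). The angle brackets sit outside the capture group, so the replacement has to put them back.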

Or you could use this approach:

  • first, match \s+< and replace with <
  • second, match >\s+ and replace with >
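The two-pass approach can be sketched in Python like this. The function name is hypothetical, and the angle brackets are left unescaped on the assumption that a bare < or > is a literal character in the regex engine being used.

```python
import re


def strip_tag_whitespace_two_pass(html: str) -> str:
    """Remove whitespace before '<' and after '>' in two separate passes."""
    # Pass 1: whitespace immediately before an opening bracket.
    html = re.sub(r"\s+<", "<", html)
    # Pass 2: whitespace immediately after a closing bracket.
    html = re.sub(r">\s+", ">", html)
    return html


if __name__ == "__main__":
    print(strip_tag_whitespace_two_pass("<p>Original AA </p>\n<b>Features:</b> \n"))
```

Neither pass needs a backreference, so this variant sidesteps the $1 versus \1 question entirely.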