Finding line beginning using regular expression

2022-12-27 10:12 Q&A author:

Finding Line Beginning using Regular expression in Notepad++

I want to strip all the jQuery "done" attributes from the divs in a 4000-line HTML file.

<DIV class=menu done27="1" done26="0"
done9="1" done8="0" done7="1"
done6="0" done4="20">

should be replaced with:

<DIV class=menu>

In my experiments I got partway there with this regular expression:

[ ^]done[0-9]+="[0-9]+"

Using Notepad++ 5.6.8 Unicode, with a file encoded in ANSI, I'm putting this regex in the "Find what" field. It only replaces the 5 occurrences starting with a space; it misses the 2 occurrences starting at the beginning of a line.

How can I construct a regex to remove all the attributes of an HTML element starting with a keyword?
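
For what it's worth, the pattern misses the line starts because `^` is not an anchor inside a character class: `[ ^]` matches a space or a literal caret. The sketch below reproduces the count on the sample above. Python's `re` module is only a stand-in here (an assumption; it is not the engine Notepad++ uses), and the `\s+` variant works only because Python searches the whole text at once, line breaks included.

```python
import re

# The sample from the question, with plain "\n" line breaks for simplicity.
html = ('<DIV class=menu done27="1" done26="0"\n'
        'done9="1" done8="0" done7="1"\n'
        'done6="0" done4="20">')

# Inside [...], "^" is a literal caret (it only negates when it comes first),
# so "[ ^]" means "a space or a caret", not "a space or start of line".
naive = re.compile(r'[ ^]done[0-9]+="[0-9]+"')
naive_hits = naive.findall(html)
print(len(naive_hits))  # 5

# \s matches spaces and line breaks alike, so all 7 attributes are found.
fixed = re.compile(r'\s+done[0-9]+="[0-9]+"')
cleaned = fixed.sub('', html)
print(cleaned)  # <DIV class=menu>
```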

Extended Replace "\n" with "LINEBREAK "

Thanks a lot to all for these timely replies. Following your advice, here's what I did:

"Notepad++ > View > Show Symbol > Show End Of Line" shows "CR+LF" at each line end.
"Notepad++ > Search > Find", "Search mode" = "Normal": made sure that "Find what" = "LINEBREAK" finds nothing
"Search mode" = "Extended": "Find what" = "\n\r" only finds the double breaks (CR + LF + a blank line); "\n \r" finds nothing; yet "\n" finds exactly the line breaks, and only them.
Saving my "Towncar.htm" test file as "Towncar_02.htm" (also encoded in ANSI)
Under "Extended", replaced all "\n" with "LINEBREAK " (notice the trailing space)
Under "Regular expression", replaced each occurrence of:
```
 done[0-9]*="[0-9]*"
```

(Be careful to check that there is A LEADING SPACE before "done"
and that there is NO TRAILING SPACE! See below.)

with an empty string

Under "Extended", replaced each occurrence of "LINEBREAK" with "\n" (no trailing space this time after "LINEBREAK"!)
Checked that the resulting "Towncar.htm" file (after some cosmetic reformatting) looked OK and pretty, and that after a refresh it still rendered the same as the "Towncar_02.htm" backup.
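
The placeholder trick above can be sketched outside Notepad++ as well. This is a minimal Python illustration of the same three replacements, assuming the sample from the question with CR+LF line ends; Python's `re` stands in for the editor's regex mode, so it shows the logic and not the editor's exact behaviour.

```python
import re

# Sample from the question, with CR+LF line ends as in the original file.
text = ('<DIV class=menu done27="1" done26="0"\r\n'
        'done9="1" done8="0" done7="1"\r\n'
        'done6="0" done4="20">')

# Step 1: turn every LF into a placeholder followed by a space, so that
# attributes at the start of a line now have a leading space too.
step1 = text.replace('\n', 'LINEBREAK ')

# Step 2: remove each attribute together with its leading space.
step2 = re.sub(r' done[0-9]*="[0-9]*"', '', step1)

# Step 3: put the line breaks back (no trailing space after LINEBREAK).
result = step2.replace('LINEBREAK', '\n')
print(repr(result))  # '<DIV class=menu\r\n\r\n>'
```

The line breaks survive, which is why some cosmetic reformatting is still needed afterwards.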

Reminders and Notes:

This forum apparently works well in Chrome 4, but in some browsers (e.g. IE6 and other discontinued ones) it can produce display artifacts under some circumstances, so be careful:
even if the forum doesn't show it in your browser, there is a leading space at the beginning of the regex (the " done..." regular expression above). It ensures that only strings starting with " done" are replaced, so other strings such as "undone" or "methadone" are left alone.
likewise, even if the forum shows one in your browser, there is no trailing space at the end of the regex!
in the regex, [0-9] matches exactly 1 occurrence of any decimal digit (a character in the 0-9 range); in other words, it matches « 0 » or « 1 » or « 9 » etc., but NOT « 01 » or « 835 » or « » (the empty string).
* (asterisk) matches the preceding element 0 or more times (here, [0-9]* matches the empty string or any string made exclusively of digits)
likewise, + (plus sign) matches the preceding element 1 or more times (here, [0-9]+ matches any string, at least 1 character long, made exclusively of digits)
Ref: http://sourceforge.net/apps/mediawiki/notepad-plus/index.php?title=Regular_Expressions#Notepad.2B.2B_regex_syntax
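
A quick way to check these notes is to try the quantifiers on a few strings. The sketch below uses Python's `re` as a stand-in (an assumption, since it is not the Notepad++ engine, although these basic constructs behave the same way); `whole` is a hypothetical helper name.

```python
import re

def whole(pattern, s):
    """Return True if the pattern matches the entire string s."""
    return re.fullmatch(pattern, s) is not None

print(whole(r'[0-9]', '9'))     # True: exactly one digit
print(whole(r'[0-9]', '01'))    # False: two digits
print(whole(r'[0-9]*', ''))     # True: zero digits are allowed
print(whole(r'[0-9]+', ''))     # False: at least one digit is required
print(whole(r'[0-9]+', '835'))  # True

star = r' done[0-9]*="[0-9]*"'
plus = r' done[0-9]+="[0-9]+"'
print(whole(star, ' done=""'))  # True: "*" also accepts missing digits
print(whole(plus, ' done=""'))  # False

# The leading space keeps words that merely end in "done" out of reach.
undone = re.search(plus, 'class=undone7="1"')
print(undone)  # None
```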

I like Notepad++ too but the regexing is really a pain. If you insist on using Notepad++ try this:

First find out which newline characters are being used in your document (View>Show Symbol>Show End Of Line)
Delete those line-breaks by replacing them with a single space (Search and replace. CR is \r LF is \n. Be sure to tick "Extended" search mode)
Regex-replace done[0-9][0-9]*=\"[0-9][0-9]*\" with the empty string (be sure to put a single space before the regex)

Voila! Not very nice n clean but it works ;o)

After that if you want it human-readable again you could use the HTMLTidy functions
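
As a rough illustration of the steps in this answer, here is the same idea in Python (a stand-in for the editor, so an assumption): join the lines with a single space, then remove each attribute together with its leading space. The sample and its CR+LF line ends are taken from the question.

```python
import re

text = ('<DIV class=menu done27="1" done26="0"\r\n'
        'done9="1" done8="0" done7="1"\r\n'
        'done6="0" done4="20">')

# Replace each CR+LF pair with a single space so the tag sits on one line.
joined = text.replace('\r\n', ' ')

# Note the single space in front of "done", as the answer insists.
result = re.sub(r' done[0-9][0-9]*=\"[0-9][0-9]*\"', '', joined)
print(result)  # <DIV class=menu>
```

Unlike the placeholder approach, this one loses the original line breaks, hence the suggestion to tidy the file afterwards.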

You almost had it! Unfortunately, the complete solution in Notepad++ would have to be a 3-step process.

Regex search/replace with the following search: \<done[0-9]+="[0-9]+"[ ]* Of course, leave the replace field empty, so that it will simply delete everything that matches. (In Notepad++'s understanding of regular expressions, \< represents the "beginning of a word".)
Select the portion of text affected by your previous search/replace. You don't want to select the entirety of your document, because we're going to...
Strip newlines. Hit Ctrl-F to bring up the Search/Replace dialog again and this time select "Extended" search mode, instead of "Regular expression". Depending on the format of your document you are going to want to search for either \n or \r\n. The replacement field should, again, be empty. Also, make sure that the "In Selection" checkbox is checked.

Click "Replace All" and you're done!
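
Here is a hedged sketch of this three-step idea in Python. Python's `re` has no `\<`, so `\b` (word boundary) is used as the closest stand-in, which is an assumption; the "selection" is simply the sample tag from the question, with plain "\n" line breaks.

```python
import re

selection = ('<DIV class=menu done27="1" done26="0"\n'
             'done9="1" done8="0" done7="1"\n'
             'done6="0" done4="20">')

# Step 1: delete each attribute plus any spaces that follow it.
# \b plays the role of \< here: "done" must start a word.
step1 = re.sub(r'\bdone[0-9]+="[0-9]+"[ ]*', '', selection)
print(repr(step1))  # '<DIV class=menu \n\n>'

# Steps 2 and 3: within the affected selection only, strip the newlines.
result = step1.replace('\n', '')
print(repr(result))  # '<DIV class=menu >'
```

One space is left before the closing bracket, because the pattern removes trailing spaces and not the leading one.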

A simple way is:

Go to "Search" and "Replace"
Input "\n" in "Find what"
Input "\n" followed by your string in "Replace with"
Select "Extended" in "Search Mode"
Click "Replace All"

It will plug your string at the beginning of each line except the first line.
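
To see what such a replacement does, here is a tiny Python sketch (an illustration only, with a made-up three-line text and the hypothetical prefix "> "). Keeping the "\n" in the replacement is what preserves the line breaks; replacing "\n" with the string alone would join the lines.

```python
text = "first\nsecond\nthird"
prefix = "> "  # hypothetical string to put at each line start

# Replace each line break with "line break + prefix".
prefixed = text.replace("\n", "\n" + prefix)
print(prefixed.split("\n"))  # ['first', '> second', '> third']

# Without the "\n" in the replacement, the lines are merged instead.
merged = text.replace("\n", prefix)
print(merged)  # first> second> third
```

The first line gets no prefix because no line break precedes it.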

I'm afraid Notepad++ regex cannot do that.

Notepad++ uses the Scintilla regex engine, which is line-based, so a multiline search/replace cannot be done.

Note that \r and \n are never matched because in Scintilla, regular expression searches are made line per line (stripped of end-of-line chars).

Quoted from http://www.scintilla.org/SciTERegEx.html
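
The line-per-line behaviour described in that quote can be imitated roughly as follows. This is an emulation written in Python under my own assumptions, not Scintilla's actual code; `linewise_sub` is a hypothetical helper that applies a regex to each line with its end-of-line characters set aside.

```python
import re

def linewise_sub(pattern, repl, text):
    """Apply re.sub to each line separately, never showing it the EOL chars."""
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip('\r\n')   # the part the regex is allowed to see
        eol = line[len(body):]       # the end-of-line chars, kept untouched
        out.append(re.sub(pattern, repl, body) + eol)
    return ''.join(out)

html = ('<DIV class=menu done27="1" done26="0"\n'
        'done9="1" done8="0" done7="1"\n'
        'done6="0" done4="20">')

# A pattern containing \n can never match, since no line body contains one.
unchanged = linewise_sub(r'\n', '', html)
print(unchanged == html)  # True

# A start-of-line alternative still works, because each line is searched alone.
cleaned = linewise_sub(r'(^|[ ])done[0-9]+="[0-9]+"', '', html)
print(repr(cleaned))  # '<DIV class=menu\n\n>'
```

In this model the attributes can be removed line by line, but the line breaks themselves stay out of reach, which matches the quoted limitation.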
