
sed one-liner to delete all digits 0-9 that happen after a period

I have

sed -e '/^ *[0-9]\+ *$/d' <oldtextfile >newtextfile

...which I use on text copied and pasted from PDFs to remove page numbers. However, I also need to remove footnote numbers, so I need to modify the above sed one-liner to delete any digits that come right after a period. Unfortunately, I have very little patience for sed. Can someone help me out?


sed 's/\.[0-9]*/./g'

That probably doesn't do what you want to do, so tell me more precisely what you want to do.
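For what it's worth, here is a minimal sketch of how that substitution might be combined with the page-number rule from the question, assuming footnote markers are digits that immediately follow a period. The function name `strip_numbers` and the sample text are made up for illustration, and `[0-9][0-9]*` is used instead of `\+` so the expression does not rely on GNU extensions.

```shell
# Hypothetical helper: reads text on stdin, writes cleaned text to stdout.
strip_numbers() {
  # 1st expression: delete lines that contain only a number (page numbers).
  # 2nd expression: replace a period followed by one or more digits
  #                 with just the period (footnote markers).
  sed -e '/^ *[0-9][0-9]* *$/d' -e 's/\.[0-9][0-9]*/./g'
}

printf '%s\n' 'First sentence.12 Second sentence.' ' 47 ' 'Pi is 3.14 here.' | strip_numbers
```

With this input the page-number line is dropped and "sentence.12" becomes "sentence.", but "3.14" is also cut down to "3.", so decimal numbers in the text would need separate handling.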


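On Windows anyway, sed needs a backslash to recognize + as a quantifier: \+. (This is how GNU sed's default basic regular expressions behave on any platform; an unescaped + matches a literal plus sign.) I've fought that a bunch of times and only discovered it now from here: http://www.gnu.org/software/sed/manual/sed.html#Regular-Expressions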

So you could then use geofftnz's solution as:

C:\Users\Me>cat test.txt | sed "s/\.[0-9]\+//g"


I'm on Windows, with some version of sed that may not be entirely standard, but this is what I did:

cat test.txt | sed "s/\.[0-9][0-9]*//g"

(My sed didn't recognise a + for regex)

C:\Users\Me>cat test.txt
Hello, this is a file
with some .2346 stuff I want to remove.

.this stuff I dont.

What about some more: .99123how's that?

Normal number: 1234

C:\Users\Me>cat test.txt | sed "s/\.[0-9][0-9]*//g"
Hello, this is a file
with some  stuff I want to remove.

.this stuff I dont.

What about some more: how's that?

Normal number: 1234
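Note that the command above removes the period together with the digits. If the period should stay and decimals such as 3.14 should be left alone, one possible refinement is to strip digits only when the period comes right after a letter. This is just a sketch under that assumption; the function name `strip_footnote_marks` is hypothetical, and it would miss markers that follow a quote or closing parenthesis.

```shell
# Hypothetical helper: strips digits that follow "<letter>." and keeps the period.
strip_footnote_marks() {
  # \([A-Za-z]\) captures the letter before the period so it can be put back
  # with \1; [0-9][0-9]* means "one or more digits" without needing \+.
  sed 's/\([A-Za-z]\)\.[0-9][0-9]*/\1./g'
}

echo 'I want to remove.3 It costs 3.14 dollars.' | strip_footnote_marks
```

Here "remove.3" becomes "remove." while "3.14" is untouched, because the character before its period is a digit rather than a letter.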


Since you didn't give any example input, I'll have to assume the worst-case scenario, where numbers are embedded between letters and you want to keep the letters.

Example: foo123.bar456baz789qux

In that case I think awk would be the better tool

awk -F'.' '{gsub("[[:digit:]]","",$2)}1' OFS='.' oldtextfile > newtextfile

Proof of Concept

$ echo "foo123.bar456baz789qux" | awk -F'.' '{gsub("[[:digit:]]","",$2)}1' OFS='.'
foo123.barbazqux
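The one-liner above only cleans the second field, so on a line with several periods the digits after the later periods are left in place. A possible variation, sketched here with a made-up function name `strip_after_first_period`, loops over every field after the first one:

```shell
# Hypothetical helper: removes digits from everything after the first period
# on each line, leaving the text before the first period untouched.
strip_after_first_period() {
  # Split on ".", strip digits from fields 2..NF, then print the record
  # rejoined with "." (OFS). Lines without a period have NF <= 1 and are
  # printed unchanged.
  awk -F'.' -v OFS='.' '{ for (i = 2; i <= NF; i++) gsub(/[0-9]/, "", $i) } 1'
}

echo "foo123.bar456.baz789" | strip_after_first_period
```

This prints foo123.bar.baz, whereas the single-field version would leave baz789 as it is.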


I know this is a million years old, but a really short answer is

cat yourfile.txt | tr -d '0-9' > newfile.txt

Note that this deletes every digit in the file, not just the ones after a period.