Remove Repeating and Control Characters in sed

2023-04-04 02:41 问答作者：

Let's say I have a word at the beginning of a line, HHEELLLLOO for example. How can I replace repeat characters with single characters. The output should be HELLO.

Also does anyone know how to remove or specify control characters in sed, ^H for 开发者_开发知识库example.

Question 1

Yes, regex can handle that. In sed:

$ echo HHEELLLLOO | sed 's/\(.\)\1/\1/g'
HELLO

This will do it.

Question 2

It may vary depending on your system. Here (BSD) you can type ctrl-v ctrl-h to insert a literal backspace character to be interpreted by sed. Give it a try.

$ cat file
H^HE^HL^HL^HO^H
$ sed 's/^H//g' file > new_file
$ cat new_file
HELLO

See "limiting repetition" from this site: http://www.regular-expressions.info/repeat.html

An actual script, as inspired by chown and that site:

sed 's/\([a-zA-Z]\)\1\+/\1/g'

However, you won't be able to get HELLO, you would only get HELO. A regex is not sophisticated enough to determine that there should be 2 L's. For that, you would need to match the word to a dictionary. Though, you could use the regex for that ... H+E+L+O+ . . .

For the control characters, \0xx will match arbitrary ASCII characters. You'll have to look up what ^H represents.

Try this for removing duplicates: sed 's/\([a-zA-Z]\)\1\+/\1/g' but it will produce 'HELO' not 'HELLO'. See the other Answer for the reasons why this is.

$ echo BookKeeper | perl -pe 's/(.)\1+/$1/gi'
Bokeper

$ perl -le 'print "\cSome \cEvil \cControl \cMess\c?"' | perl -ple 's/\pC//g'
ome vil ontrol ess

Technically, control characters are \p{Cc}.

继续阅读：regex sed text-editor

Remove Repeating and Control Characters in sed

Question 1

Question 2

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

Question 1

Question 2

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？