Sed removing duplicate characters and certain characters in beginning/end of string

I am asking for your help with sed. I need to remove duplicate underscores and underscores from beginning and end of string.

For example:

echo '[Lorem] ~ ipsum *dolor* sit metus !!!' | sed 's/[^ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._()-]/_/g'

Produces: _Lorem____ipsum__dolor__sit_metus____

But I need to further format this string to: Lorem_ipsum_dolor_sit_metus

In other words, remove any underscores from the beginning and end of the string, and reduce multiple consecutive underscores to just one, preferably using another pipe.

Do you have any idea how to do that?

Thank you.


Just append ;s/__*/_/g;s/^_//;s/_$// right after the g flag in your sed command.
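For reference, here is a minimal sketch of the full command with those three expressions appended to the one from the question. The variable names input and result are only for illustration. The same three expressions should also work in a separate sed at the end of another pipe.

```shell
# Sample string from the question (variable names are illustrative)
input='[Lorem] ~ ipsum *dolor* sit metus !!!'

# 1) replace every disallowed character with _
# 2) s/__*/_/g  collapses runs of underscores into one
# 3) s/^_//     drops a leading underscore
# 4) s/_$//     drops a trailing underscore
result=$(echo "$input" | sed 's/[^ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._()-]/_/g;s/__*/_/g;s/^_//;s/_$//')

echo "$result"   # Lorem_ipsum_dolor_sit_metus
```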


All you need to do is add a "+" after your bracket expression to eliminate runs of multiple underscores. Then you can delete the beginning and ending ones. Also, as ladenedge suggested, you can use a character class to shorten your list.

sed 's/[^[:alnum:].()-]\+/_/g;s/^_\(.*\)_$/\1/'
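Two caveats about the one-liner above, as far as I can tell: \+ is a GNU sed extension, and s/^_\(.*\)_$/\1/ only matches when the string has an underscore at both ends, so a string with just a trailing underscore is left unchanged. A possible variant, assuming your sed accepts -E (GNU and BSD sed both do), strips each end separately. The variable names are only for illustration.

```shell
# Assumption: sed supports -E for extended regular expressions.
# Collapse each run of disallowed characters into one _, then trim each end on its own.
clean='s/[^[:alnum:].()-]+/_/g; s/^_//; s/_$//'

first=$(echo '[Lorem] ~ ipsum *dolor* sit metus !!!' | sed -E "$clean")
second=$(echo 'metus !!!' | sed -E "$clean")   # trailing underscore only

echo "$first"    # Lorem_ipsum_dolor_sit_metus
echo "$second"   # metus
```

Note that [:alnum:] depends on the locale, so prefix the command with LC_ALL=C if only ASCII letters and digits should be kept.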