Unix shell bash 'one-liner' to isolate all parentheses containing a URL that includes ".mp3"

2023-01-30 07:22

I'm completely new to this Unix bash stuff — and first question here! Hope you guys can help:)

Problem:

I have a mass of messy web source code (wrapping/unformatted) containing multiple occurrences of:

('http://www.example.com/path/audio.mp3')

Could you please help with a one-liner (sed/awk...) that will isolate these occurrences of parentheses containing a URL that includes ".mp3", strip the leading/trailing "()" and "'" characters, and then print the results as a list (one per line) to an active .txt file?

Note: The one-liner will be used in Automator on Mac as a service/workflow to act on 'selected text.'

Any help would be greatly appreciated as (despite trawling through all the online tuts) I'm completely lost.

Best Regards,

Dave

Using egrep with -o (output only the parts that match) should do the trick. Try something like this:

egrep -o "http://[^'\"]+\.mp3" FILENAME
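To see roughly how this behaves, here is a small sketch that runs the same kind of pattern over a made-up line of source. The example.com URLs and the helper name extract_mp3_urls are assumptions for illustration only; grep -E -o is used as the equivalent of egrep -o, and the dot before mp3 is escaped so it matches a literal ".".

```shell
# Hypothetical sample: two .mp3 URLs and one non-mp3 URL on a single unwrapped line.
sample="play('http://www.example.com/a/one.mp3');load('http://www.example.com/img.png');play('http://www.example.com/b/two.mp3')"

# extract_mp3_urls is an illustrative helper name, not part of any tool.
extract_mp3_urls() {
    # -E: extended regex; -o: print only the matched text, one match per line.
    # [^'\"]+ stops the match at the first quote, so each URL is matched separately.
    grep -E -o "http://[^'\"]+\.mp3"
}

printf '%s\n' "$sample" | extract_mp3_urls
```

Because the match starts at http:// and ends at .mp3, the parentheses and quotes never appear in the output, so no extra cleanup step is needed.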

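Perl, which the Mac should have:

#!/usr/bin/perl
# Print every http://...mp3 URL found on stdin, one per line.
while (<STDIN>)
{
    # Non-greedy match with /g so several URLs on one line are all found.
    while (/(http:\/\/[^'"]+?\.mp3)/g)
    {
        print "$1\n";   # double quotes, so \n is a real newline
    }
}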

Try to refine the following:

perl -ne $'while(/\(\'(http:\/\/[\w.\/]+?\.mp3)\'\)/g) { print "$1\n"; }' < input_file > output_file

It reads stdin (here: input_file) one line at a time, looks for every occurrence of a URL in that line, and prints it to stdout (here: output_file) without the surrounding (' and ').

~~awk '{print $2}' FS="('|')" < filename~~

cat filename | tr ')' '\n' | awk '{print $2}' FS="('|')" > output.txt

Just replace filename with the name of the file containing these lines.

echo "your multiline\
text here" | tr ')' '\n' | awk '{print $2}' FS="('|')"

JUST A TRY:

tr ')' '\n' | awk '{print $2}' FS="('|')"
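For what it's worth, here is a sketch of what that pipeline is doing, with one addition that is my own assumption rather than part of the answer above: a check that the extracted field ends in .mp3, so other quoted strings are skipped. The name mp3_fields and the sample input are invented for the example.

```shell
# mp3_fields is an illustrative name; the .mp3 filter is an added assumption.
mp3_fields() {
    # tr turns every ')' into a newline, so each ('...') group ends its own line.
    # awk then splits each line on the single quote; field 2 is the quoted text.
    tr ')' '\n' | awk -F"'" '$2 ~ /\.mp3$/ { print $2 }'
}

# Made-up input: two .mp3 URLs and one .png URL on a single line.
printf '%s\n' "x('http://www.example.com/a.mp3');y('http://www.example.com/img.png');z('http://www.example.com/b.mp3')" \
    | mp3_fields
```

Without the filter, lines that contain no quoted text come out as blank lines, which is why the extra condition may be worth keeping.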

This will match the URLs that appear within parentheses and single quotes:

grep -Po "(?<=\(')http.*?\.mp3(?='\))"

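The URLs are output, one per line, without the parentheses or single quotes. The -P option for Perl-compatible regular expressions is available in GNU grep; older OS X releases shipped a GNU grep that supports it, but the BSD grep on newer macOS versions does not, so check your grep first.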
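If your grep turns out not to support -P, a rough equivalent can be put together from grep -E and sed, which avoids lookbehind and lookahead entirely. This is only a sketch: the function name strip_wrapped_mp3_urls and the sample input are assumptions, and it treats any ('http...mp3') group as a match.

```shell
# strip_wrapped_mp3_urls is a hypothetical helper name for this sketch.
strip_wrapped_mp3_urls() {
    # Step 1: keep only the ('http...mp3') groups, one per line.
    # Step 2: remove the leading (' and the trailing ') from each line.
    grep -E -o "\('http[^']*\.mp3'\)" | sed -e "s/^('//" -e "s/')\$//"
}

# Made-up input: two single-quoted URLs in parentheses and one double-quoted URL.
input="x('http://www.example.com/path/audio.mp3')y(\"http://www.example.com/no.mp3\")z('http://www.example.com/b.mp3')"
printf '%s\n' "$input" | strip_wrapped_mp3_urls
```

The double-quoted URL is skipped because the pattern requires the (' and ') wrappers, which mirrors what the lookaround version does.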
