Bash - how to get a download URL from the official WordPress plugin page?
I'm trying to get the download link of WordPress plugins via a bash script, directly from the official plugin page.
For instance, the akismet plugin at http://wordpress.org/extend/plugins/akismet/
In the HTML source code we can easily recognize where the link for download is:
<div class="col-3">
<p class="button">
<a href='http://downloads.wordpress.org/plugin/akismet.2.5.3.zip'>
Download Version 2.5.3
</a>
</p>
I noticed that the words "Download Version" only appear once in the entire file, just after the download link that we want to get.
Let's say I do not know the download link; all I know is the plugin page URL. How can I filter the HTML code to extract the download link, so that I can later use it with wget or curl?
Thank you.
nadav@shesek:~$ curl -s https://wordpress.org/extend/plugins/akismet/ | egrep -o "https://downloads.wordpress.org/plugin/[^']+"
https://downloads.wordpress.org/plugin/akismet.2.5.3.zip
nadav@shesek:~$ wget `curl -s https://wordpress.org/extend/plugins/akismet/ | egrep -o "https://downloads.wordpress.org/plugin/[^']+"`
--2011-08-20 16:43:33-- https://downloads.wordpress.org/plugin/akismet.2.5.3.zip
Resolving downloads.wordpress.org... 72.233.56.138, 72.233.56.139
Connecting to downloads.wordpress.org|72.233.56.138|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 27714 (27K) [application/octet-stream]
Saving to: `akismet.2.5.3.zip'
100%[============================================================================================================================================================>] 27,714 39.9K/s in 0.7s
2011-08-20 16:43:35 (39.9 KB/s) - `akismet.2.5.3.zip' saved [27714/27714]
Notice the -o switch for grep, which makes it output only the matched part instead of the entire line.
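The same idea can be wrapped in a small function so the script does not depend on the scheme or the quote style used in the page. This is only a sketch: the function name extract_download_url is made up for illustration, and it assumes the page still contains a link under downloads.wordpress.org/plugin/ as in the HTML shown in the question.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads a plugin page's HTML on stdin and prints
# the first link that points at downloads.wordpress.org/plugin/.
# Assumption: the link is wrapped in single or double quotes.
extract_download_url() {
  # -o prints only the matched part, -E enables extended regex (for "?")
  grep -oE "https?://downloads\.wordpress\.org/plugin/[^'\"]+" | head -n 1
}

# Possible usage (needs network access, so it is left commented out):
#   url=$(curl -s "https://wordpress.org/extend/plugins/akismet/" | extract_download_url)
#   [ -n "$url" ] && wget "$url"
```

Piping through head -n 1 keeps only the first match in case the page ever lists more than one download link; checking that the result is non-empty before calling wget avoids running it with no URL.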
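You can also try the following regex. It relies on a lazy quantifier (.*?) and \s, so it needs a Perl-compatible engine (for example grep -P or perl), and because the link and the "Download Version" text sit on different lines, the HTML has to be matched as a whole rather than line by line: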
href=['"](.*?)['"]>\s*Download Version [0-9.]+