Find and Copy a String within HTML code

I'm trying something new. I would normally do this in C# or VB, but for speed reasons I'd like to do it on my server.

  1. Open the file terms.txt
  2. Take each item, one at a time, from terms.txt and fetch a URL (possibly with curl or something else) of the form http://system.com/set=terms
  3. View the HTML source and extract the picture names (StringB). Look for image=StringB&location
  4. Save StringB to imgname.txt
  5. Close the file and cycle to the next item in terms.txt

I was looking at sed, but I believe awk might be the best way? Building a command like this to run under a shell is all new to me. I'm familiar with using Linux; I just need help with the commands.


Something not entirely unlike this should do ya, depending on the precise format of terms.txt (shell scripts cope best with one entry per line) and whether you actually need to parse the HTML (I'm hoping you don't):

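#!/bin/sh
# Usage: extractimages termfile baseurl
# Prints the image name found on each matching line of every fetched page.

if [ $# -ne 2 ]; then
    echo "usage: $0 termfile baseurl" >&2
    exit 1
fi
termfile="$1"
baseurl="$2"

# IFS= and -r keep leading whitespace and backslashes in each term intact.
while IFS= read -r term; do
    # Fetch the page to stdout and print only the text between "image=" and the next "&".
    wget -q -O- "$baseurl/set=$term" |
      sed -ne 's/^.*image=\([^&]*\)&.*$/\1/p'
done < "$termfile"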

You save this to a file named "extractimages", chmod +x it, and run it like so:

$ ./extractimages terms.txt http://system.com > imgname.txt
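
One thing to watch for: the leading ^.* in that sed expression is greedy, so if a single line of HTML contains several image= parameters, only the last one on that line is printed. If that matters for your pages, something along these lines may help. It is only a sketch: extract_images is a made-up helper name, the sample line is invented for illustration, and it assumes your grep supports -o (GNU and BSD grep do, but it is not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helper: read HTML on stdin and print every value that sits
# between "image=" and the following "&", one per line.
extract_images() {
    # grep -o prints each match on its own line, even when several
    # matches share one input line; sed then trims the delimiters.
    grep -o 'image=[^&]*&' | sed -e 's/^image=//' -e 's/&$//'
}

# Invented sample line with two images on the same line.
sample='<a href="view?image=cat.jpg&location=1">x</a><a href="view?image=dog.png&location=2">y</a>'

printf '%s\n' "$sample" | extract_images
```

Run against the sample line, this prints cat.jpg and dog.png on separate lines, whereas the sed-only version would print just dog.png. In the script above you could replace the sed stage of the pipeline with a call to this function.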


sed 's|^.*$|wget -q -O- http://system.com/set=&|' file | bash | sed -ne 's/^.*image=\([^&]*\)&.*$/\1/p'