C: how to read a webpage

I'm trying to open a connection to a webpage (e.g. www.google.com) via localhost, port 80.

How can I do this programmatically in C? I want to get all the HTTP headers and not just the content ;(

I hope someone can help.

Many thanks in advance.


Here is some example code on how to do this with libcurl:

http://curl.haxx.se/libcurl/c/getinmemory.html

Another example on the same site shows how to get some header data:

http://curl.haxx.se/libcurl/c/getinfo.html

These examples and many others are available as part of the libcurl distribution. They should be more than enough to get you started.


Summarized process:

  • DNS resolution for the hostname (using getaddrinfo())
  • Open a stream socket (TCP) to the resolved IP address and port
  • Send a GET request; each line ends with \r\n and an empty line ends the request (see the protocol at: http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol)

    GET /index.html HTTP/1.1
    Host: www.example.com

  • Read the headers (terminated by \r\n\r\n)

  • Read body
  • Close socket
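
The steps above could be sketched roughly as follows with POSIX sockets. This is only an illustration under some assumptions: plain HTTP on port 80, no redirects, no TLS, and a "Connection: close" header so that the server closes the socket once the response is complete. The function names (build_get_request, connect_to_host, find_body_offset, fetch_page) are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <unistd.h>

/* Format the GET request; returns its length, or -1 if it does not fit. */
int build_get_request(char *buf, size_t size, const char *host, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n", path, host);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Resolve host and connect to the first address that works; returns fd or -1. */
int connect_to_host(const char *host, const char *port)
{
    struct addrinfo hints, *res, *p;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;
    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd == -1)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;                    /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}

/* Offset of the body (just past the first "\r\n\r\n"), or -1 if not found. */
long find_body_offset(const char *resp, size_t len)
{
    size_t i;

    for (i = 0; i + 4 <= len; i++)
        if (memcmp(resp + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    return -1;
}

/* Send the request and copy the raw response (headers and body) to out. */
int fetch_page(const char *host, const char *path, FILE *out)
{
    char req[1024], buf[4096];
    ssize_t n;
    int fd, sent = 0;
    int len = build_get_request(req, sizeof req, host, path);

    assert(out != NULL);
    if (len < 0 || (fd = connect_to_host(host, "80")) == -1)
        return -1;
    while (sent < len) {              /* send() may accept only part of the data */
        n = send(fd, req + sent, (size_t)(len - sent), 0);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        sent += (int)n;
    }
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0)
        fwrite(buf, 1, (size_t)n, out);
    close(fd);
    return n == 0 ? 0 : -1;           /* 0 means the server closed cleanly */
}
```

Calling fetch_page("www.example.com", "/index.html", stdout) from main would print the status line and headers followed by the body. If you collect the response in a buffer instead, find_body_offset tells you where the headers end. Note that an HTTP/1.1 server may send the body with chunked transfer encoding, which this sketch does not decode.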


Minimal runnable POSIX example

I provide a minimal runnable POSIX C example in my answer to "How to make an HTTP get request in C without libcurl?"

It allows you to do:

./wget example.com

to download http://example.com
