How does this code from "Network programming" examples work?
I am reading Beej's "Guide to network programming".
In one of his intro examples he talks about getting the IP address for a hostname (like google.com or yahoo.com for instance). Here is the code.
/*
** showip.c -- show IP addresses for a host given on the command line
*/
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <arpa/inet.h>

int main(int argc, char *argv[])
{
    struct addrinfo hints, *res, *p;
    int status;
    char ipstr[INET6_ADDRSTRLEN];

    if (argc != 2) {
        fprintf(stderr,"usage: showip hostname\n");
        return 1;
    }

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC; // AF_INET or AF_INET6 to force version
    hints.ai_socktype = SOCK_STREAM;

    if ((status = getaddrinfo(argv[1], NULL, &hints, &res)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
        return 2;
    }

    printf("IP addresses for %s:\n\n", argv[1]);

    for(p = res; p != NULL; p = p->ai_next) {
        void *addr;
        char *ipver;

        // get the pointer to the address itself,
        // different fields in IPv4 and IPv6:
        if (p->ai_family == AF_INET) { // IPv4
            struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
            addr = &(ipv4->sin_addr);
            ipver = "IPv4";
        } else { // IPv6
            struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
            addr = &(ipv6->sin6_addr);
            ipver = "IPv6";
        }

        // convert the IP to a string and print it:
        inet_ntop(p->ai_family, addr, ipstr, sizeof ipstr);
        printf(" %s: %s\n", ipver, ipstr);
    }

    freeaddrinfo(res); // free the linked list

    return 0;
}
The part that confuses me is the for loop.
for(p = res; p != NULL; p = p->ai_next) {
    void *addr;
    char *ipver;

    // get the pointer to the address itself,
    // different fields in IPv4 and IPv6:
    if (p->ai_family == AF_INET) { // IPv4
        struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
        addr = &(ipv4->sin_addr);
        ipver = "IPv4";
    } else { // IPv6
        struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
        addr = &(ipv6->sin6_addr);
        ipver = "IPv6";
    }

    // convert the IP to a string and print it:
    inet_ntop(p->ai_family, addr, ipstr, sizeof ipstr);
    printf(" %s: %s\n", ipver, ipstr);
}
Would anyone mind going through, pseudo-step-by-step, what's going on and what these things are? Is it iterating through a linked list? I have a general idea of what the struct addrinfo is, but what the heck are struct addrinfo *res and *p, or void *addr and char *ipver?
First things first: do you know what a linked list is? If you understand that, you'll recognise what that for loop is doing. p is a pointer to a structure that also references (links to) the next structure in the list. So you're looping through a list of those structures, which are addrinfo structs.
Now, the thing you need to know about network packets is that they're made up of layers, each with its own header. The outermost layer is the Ethernet frame. This is the hardware-to-hardware protocol. It lets you get things around on a physical, bounded network but knows nothing about routing across physical network boundaries.
Next up comes the Internet Protocol (IPv4 or IPv6), which is carried inside the Ethernet frame. This is the higher-level protocol that controls the broader sense of routing, so it knows about the internet at large, but less about the individual steps needed to get there.
Finally, you have TCP or possibly another transport layer protocol, which sits on top of IP. TCP versus UDP versus X is about how you manage the packets: for example, TCP delivers data reliably and reassembles it in order, whereas UDP is a connectionless "fire and forget" protocol with no ordering or delivery guarantees.
A great explanation of this is the handy diagram on this page. To complete the picture, BGP is how routers know how to move stuff around.
TCP and UDP fit into this picture by being a part of (encapsulated in) the protocol in question (IPv4, for example).
So Ethernet frames contain other protocols, most notably IPv4, which contains the information routers need to get the packet out across the internet (across multiple physical networks). The internet protocol specifies where you want to go and where you are coming from. So a typical IPv4 packet's body remains unchanged across its whole transit, but every time it crosses into a new physical network it gets wrapped up in a different Ethernet frame.
Now, in the Ethernet header there is a field for finding out what the "Ethernet body" contains. The addrinfo struct has a similar field, ai_family, which says what kind of address that list entry holds, and that is what this line checks:
if (p->ai_family == AF_INET) {
AF_INET is the constant that identifies the IPv4 address family (AF_INET6 is the one for IPv6). So, if the entry holds an IPv4 address, the loop then goes on to read it as a struct sockaddr_in.
The else clause is technically wrong, because not being IPv4 doesn't automatically make it IPv6. You could change it to test for IPv6 like this:
else if (p->ai_family == AF_INET6) {
Which you might want to do, just in case you pick up something else.
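As a rough sketch of that stricter check, the family test could be pulled out into a small function that returns NULL for anything it doesn't recognise, so the loop can skip that entry. The name family_label is invented here; AF_INET and AF_INET6 are the real constants.

```c
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: return a label for the two address families the
   example handles, or NULL for any other family so the caller can skip it. */
static const char *family_label(int family)
{
    if (family == AF_INET)
        return "IPv4";
    if (family == AF_INET6)
        return "IPv6";
    return NULL;
}
```

Inside the loop you would then do something like: if (family_label(p->ai_family) == NULL) continue;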
Now it's worth explaining this bit of magic:
struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
This takes the generic struct sockaddr pointer stored in ai_addr and casts it to a pointer to the family-specific struct. Nothing is actually converted: the same bytes are simply viewed through a struct whose fields match their layout. Because you know where the fields are and how big they are, this is a very quick and easy way to extract what you need.
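Here is a small sketch of that cast in isolation, assuming you only care about the IPv4 and IPv6 cases. The function name raw_address is made up; the struct and field names are the standard ones from netinet/in.h.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: given a generic sockaddr, return a pointer to the
   binary address stored inside it, or NULL for an unsupported family.
   The cast does not copy or convert anything; it only tells the compiler
   which struct layout to use when reading the same bytes. */
static const void *raw_address(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return &((const struct sockaddr_in *)sa)->sin_addr;
    case AF_INET6:
        return &((const struct sockaddr_in6 *)sa)->sin6_addr;
    default:
        return NULL;
    }
}
```

The pointer it returns is the same kind of thing the original loop stores in addr and later hands to inet_ntop.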
The last thing that needs explanation is this:
inet_ntop(p->ai_family, addr, ipstr, sizeof ipstr);
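This converts the binary address into a human-readable string ("ntop" stands for "network to presentation"). It is related to, but not the same as, functions such as ntohs(), which convert integer fields like port numbers from network byte order to your host's byte order.
Basically, network data is transmitted in big-endian encoding, and in order to read it you (potentially) need to convert the data to the encoding of your system. That could be big endian or little endian, depending for the most part on your hardware. Have a read of the Wikipedia article on endianness.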
Summary: what you're looking at here is a combination of computer science structures, how networks work and C code.
Well, it's not that complicated. getaddrinfo returns a linked list of addrinfo structs (struct addrinfo **res in the manpage), where each of these structs contains information about one address available for the given host (const char *node in the manpage).
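If the double pointer is the confusing part, this sketch may help: you pass the address of your own struct addrinfo * and getaddrinfo stores the head of the list in it. The wrapper name lookup is hypothetical, and the hints simply mirror the ones in showip.c.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical wrapper: look up `host` and hand the head of the result
   list back through `out`. Returns getaddrinfo's status (0 on success).
   On success the caller must release the list with freeaddrinfo(*out). */
static int lookup(const char *host, struct addrinfo **out)
{
    struct addrinfo hints;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    return getaddrinfo(host, NULL, &hints, out);
}
```

A caller would write struct addrinfo *res; lookup("example.com", &res); and then walk res exactly as the for loop in the question does.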
Now, every struct is inspected and information about it is printed out. To print out either IPv4 or IPv6, the variable ipver is set accordingly. Before printing, the address has to be converted from its binary form to a string. This is done by inet_ntop (*n*etwork to *p*resentation).
The resulting string from inet_ntop (ipstr) and ipver are then printed to the console. Printing ipver, however, is not necessary, since you would recognize the address type from ipstr: an IPv4 address (as we all know) is written like 192.168.1.10, whereas IPv6 addresses use colons to separate the address elements: 2001:0db8:85a3:0000:0000:8a2e:0370:7334.
Yes, res points to a linked list of addrinfo structures that represent the different IP addresses of a host. The MSDN documentation on the getaddrinfo function is pretty good. I don't know what platform you're running on, but it shouldn't be much different on other platforms.