How does this code from "Network programming" examples work?

2023-02-11 16:38 Q&A author:

I am reading Beej's "Guide to network programming".

In one of his intro examples he talks about getting the IP address for a hostname (like google.com or yahoo.com for instance). Here is the code.

/*
** showip.c -- show IP addresses for a host given on the command line
*/

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <arpa/inet.h>

int main(int argc, char *argv[])
{
    struct addrinfo hints, *res, *p;
    int status;
    char ipstr[INET6_ADDRSTRLEN];

    if (argc != 2) {
        fprintf(stderr,"usage: showip hostname\n");
        return 1;
    }

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC; // AF_INET or AF_INET6 to force version
    hints.ai_socktype = SOCK_STREAM;

    if ((status = getaddrinfo(argv[1], NULL, &hints, &res)) != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
        return 2;
    }

    printf("IP addresses for %s:\n\n", argv[1]);

    for(p = res; p != NULL; p = p->ai_next) {
        void *addr;
        char *ipver;

        // get the pointer to the address itself,
        // different fields in IPv4 and IPv6:
        if (p->ai_family == AF_INET) { // IPv4
            struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
            addr = &(ipv4->sin_addr);
            ipver = "IPv4";
        } else { // IPv6
            struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
            addr = &(ipv6->sin6_addr);
            ipver = "IPv6";
        }

        // convert the IP to a string and print it:
        inet_ntop(p->ai_family, addr, ipstr, sizeof ipstr);
        printf("  %s: %s\n", ipver, ipstr);
    }

    freeaddrinfo(res); // free the linked list

    return 0;
}

The part that confuses me is the for loop.

for(p = res; p != NULL; p = p->ai_next) {
    void *addr;
    char *ipver;

    // get the pointer to the address itself,
    // different fields in IPv4 and IPv6:
    if (p->ai_family == AF_INET) { // IPv4
        struct sockaddr_in *ipv4 = (struct sockaddr_in *)p->ai_addr;
        addr = &(ipv4->sin_addr);
        ipver = "IPv4";
    } else { // IPv6
        struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;
        addr = &(ipv6->sin6_addr);
        ipver = "IPv6";
    }

    // convert the IP to a string and print it:
    inet_ntop(p->ai_family, addr, ipstr, sizeof ipstr);
    printf("  %s: %s\n", ipver, ipstr);
}

Would anyone mind going through, pseudo-step-by-step, what's going on and what these things are? Is it iterating through a linked list? I have a general idea of what struct addrinfo is, but what the heck are struct addrinfo *res and *p, or void *addr and char *ipver?

First things first: do you know what a linked list is? If you understand that, you'll recognise what that for loop is doing. p is a pointer to a structure that also references (links to) the next structure in the list. So you're looping through a list of those structures, which are addrinfo structs.
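To make the loop shape concrete, here is a minimal sketch of the same traversal pattern on a made-up list. The struct node type and count_nodes function are hypothetical stand-ins (they are not part of the sockets API); next plays the role that ai_next plays in struct addrinfo.

```c
#include <stddef.h>

/* Hypothetical node type, standing in for struct addrinfo. */
struct node {
    int value;
    struct node *next;   /* plays the role of ai_next */
};

/* Walk the list from head until the NULL terminator, counting nodes.
 * This is the same loop shape as for(p = res; p != NULL; p = p->ai_next). */
size_t count_nodes(const struct node *head)
{
    size_t n = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        n++;
    return n;
}
```

The last node's next pointer is NULL, which is what ends the loop; getaddrinfo terminates its result list the same way.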

Now, the thing you need to know about network packets is that they're made up of layered headers. The outermost one is typically the Ethernet frame. This is the hardware-to-hardware (link layer) protocol. It lets you get things around on a physical, bounded network but knows nothing about routing across physical network boundaries.

Then there is TCP or possibly another transport layer protocol, which is carried inside IP packets. TCP versus UDP versus X is about how you manage the packets: for example, TCP delivers data reliably and in order, whereas UDP is a connectionless datagram protocol with no ordering or delivery guarantees.

Finally, you have the Internet Protocol (IPv4, IPv6). This is the network layer protocol, sitting between Ethernet and TCP/UDP. It handles addressing and the broader sense of routing, so it knows about the internet at large, but less about the individual hops needed to get there.

A great explanation of this is the handy diagram on this page. To complete the picture, BGP is how routers know how to move stuff around.

TCP/UDP fit into this picture by being a part of (encapsulated in) the protocol in question (IPv4, for example).

So Ethernet frames contain other protocols, most notably IPv4, which contains the information routers need to get the data out across the internet (across multiple physical networks). The Internet Protocol specifies where you want to go, and where you are coming from. So a typical IPv4 packet stays essentially unchanged across its whole transit (apart from header fields such as the TTL), but every time it traverses a physical network it gets wrapped up in a different Ethernet frame.

Now, just as the Ethernet header has a field for finding out what the "Ethernet body" contains, each addrinfo entry has a field that says what kind of address it holds. This line:

 if (p->ai_family == AF_INET) {

checks it. AF_INET is the constant for the IPv4 address family. Note that no packet is being inspected here: getaddrinfo has only resolved the name to a list of addresses. So, if the entry holds an IPv4 address, the loop goes on to read it as a struct sockaddr_in.

The else clause is technically wrong, because not being IPv4 doesn't automatically make it IPv6. You could change it to test for IPv6 like this:

 else if (p->ai_family == AF_INET6) {

Which you might want to do, just in case you pick up something else.
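As a rough sketch of that stricter check, the family test can be pulled into a small helper that refuses anything it does not recognise. The name addr_bytes is made up for illustration; it tests sa_family on the sockaddr itself, which for getaddrinfo results corresponds to ai_family.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: return a pointer to the raw address bytes inside sa,
 * or NULL if the address family is neither IPv4 nor IPv6. */
const void *addr_bytes(const struct sockaddr *sa)
{
    if (sa->sa_family == AF_INET)
        return &((const struct sockaddr_in *)sa)->sin_addr;
    if (sa->sa_family == AF_INET6)
        return &((const struct sockaddr_in6 *)sa)->sin6_addr;
    return NULL; /* unknown family: let the caller skip this entry */
}
```

Inside the loop, a call such as addr_bytes(p->ai_addr) could replace the if/else, with a continue when it returns NULL.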

Now it's worth explaining this bit of magic:

struct sockaddr_in6 *ipv6 = (struct sockaddr_in6 *)p->ai_addr;

This basically takes the generic struct sockaddr pointer, which you can think of as pointing at a raw sequence of bytes, and casts it (reinterprets it) as a pointer to a more specific struct with named fields. No data is copied or changed. Because you know how the fields are laid out for that address family, this is a very quick and easy way to extract what you need.

The last thing that needs explanation is this:

inet_ntop(p->ai_family, addr, ipstr, sizeof ipstr);

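This converts the binary address into a human-readable string. It is not a byte-order conversion; functions such as ntohs() handle that, for integers like port numbers.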

Basically, network data is transmitted in big-endian byte order, and in order to read multi-byte integers from it, you need (potentially) to convert them to the byte order of your system. Your system could be big endian or little endian; it depends on the hardware for the most part. Have a read of the Wikipedia article on endianness.
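A small sketch of what that conversion looks like for a 16-bit port number, using the standard htons() and ntohs() calls. The helper names port_to_wire and port_from_wire are invented for this example.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: store port in network (big-endian) byte order in out. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t be = htons(port);      /* host order -> network order */
    memcpy(out, &be, sizeof be);
}

/* Hypothetical helper: read a network-order port back into host order. */
uint16_t port_from_wire(const unsigned char in[2])
{
    uint16_t be;
    memcpy(&be, in, sizeof be);
    return ntohs(be);               /* network order -> host order */
}
```

Whatever the host's own byte order, port 80 (0x0050) ends up on the wire as the bytes 0x00 then 0x50, and the round trip gives back 80.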

Summary: what you're looking at here is a combination of computer science structures, how networks work and C code.

Well, it's not that complicated. getaddrinfo returns a linked list of addrinfo structs (struct addrinfo **res in the manpage), where each of these structs contains information about one address found for the given host (const char *node in the manpage).

Now, every struct is inspected and information about it is printed out. To print either "IPv4" or "IPv6", the variable ipver is set accordingly. Before printing, the address has to be converted from binary form to a string. This is done by inet_ntop (*n*etwork to *p*resentation).

The resulting string from inet_ntop (ipstr) and ipver are then printed to the console. Printing ipver, however, is not necessary, since you would recognize the address type from ipstr: an IPv4 address (as we all know) is written like 192.168.1.10, whereas IPv6 addresses use colons to separate the address elements: 2001:0db8:85a3:0000:0000:8a2e:0370:7334.
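To see the binary-to-string step in isolation, here is a minimal sketch that parses a textual address into binary with inet_pton and formats it back with inet_ntop. The wrapper normalize_ip is a made-up name for this example; both library calls are the standard ones from <arpa/inet.h>.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: text -> binary -> text round trip.
 * family must be AF_INET or AF_INET6. Returns out on success, NULL on error. */
const char *normalize_ip(int family, const char *text, char *out, socklen_t outlen)
{
    struct in6_addr bin;  /* large enough to hold an IPv4 address as well */

    if (inet_pton(family, text, &bin) != 1)
        return NULL;      /* not a valid address for this family */
    return inet_ntop(family, &bin, out, outlen);
}
```

With a buffer of INET6_ADDRSTRLEN bytes, as in showip.c, either family fits. Note that inet_ntop may print a shortened form of an IPv6 address, so the output is not guaranteed to match the input text character for character.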

Yes, res points to a linked list of addrinfo structures that represent the different IP addresses of a host. The MSDN documentation on the getaddrinfo function is pretty good. I don't know what platform you're running on, but it shouldn't be much different on other platforms.
