Question about port numbers in computer networks
Based on my understanding, port numbers are just like telephone extensions. Just as a business telephone switchboard can use a main phone number and assign each employee an extension number (like x100, x101, etc.), so a computer has a main address and a set of port numbers to handle incoming and outgoing connections.
But the question is:
- On what basis is a port number assigned? A process or an a开发者_StackOverflow社区pplication?
Based on my experience with firewall, I usually open a port for a specific application. So port number should be assigned on an application's basis. But what if there're multiple instances of the same application running on a single machine. Each of the instances uses the same port number. So if a message is arrived at that port number, how could the system tell which instance should the message go?
And another question also related to port.
If a web server is setup to listen on port 80, client browser should always contact the 80 port. I am not sure if the following illustration of the communication between a web browser and the web server is correct.
Client Browser sent request to Server, the message should contain info like this:
To: < ServerAddress:80 >
From: < ClientAddress:XXX >
Server sent reponse to Client Browser like this:
To: < ClientAddress:XXX >
From: < ServerAddress:80 >
So the question is, will the server pick other port numbers for sending messages to client? Because I think a single 80 port doesn't look enough.
Add - 1 - 21:16 2010/12/19
In my above post, the word "application" represents a static program file that the system knows. Multiple instnaces of this application could be launched, which are multiple "processes"
Each client connection will be represented by a socket on the server. Sockets are uniquely represented by the combination of the following 4 pieces of information:
- Peer IP address
- Peer port
- Local IP address
- Local port
The client chooses a random port, so if there are multiple connections from one client to the same server/port, the connections will still differ by the client's port.
If there are multiple web server applications running on the same server, they will have to listen on different ports or the server will need to have multiple IP addresses.
On a computer, only one process can be listening on a specific port number. For example, if an Apache process is listening on port 80, no other application can also listen on port 80.
Apache usually pre-forks several processes, only one of those is listening on port 80. The job of that process is to hand over the processing for any connection to one of the pool of other Apache processes as quickly and efficiently as it can.
Each of many concurrent connections to port 80 is distinguished by it's source IP-address and by the source TCP port number (which the client computer chooses randomly from the set not in use).
(Edit)
I was pretty sure that webservers have one process (or thread) listening which accepts incoming connections and passes corresponding filehandles to the worker processes (or threads). EJP advises that this is not so.
Apache seems to have several different multi-processing modules that affect how it spreads the load of responding to multiple concurrent requests. For example: MPM Prefork and MPM Worker
Jeff Pozkaner wrote an overview of HTTP server design that I found interesting:
The basic operation of a web server is to accept a request and send back a response. The first web servers were probably written to do exactly that. Their users no doubt noticed very quickly that while the server was sending a response to someone else, they couldn't get their own requests serviced. There would have been long annoying pauses.
The second generation of web servers addressed this problem by forking off a child process for each request. …
A slight variant of this type of server uses "lightweight processes" or "threads" instead of full-blown Unix processes. …
The third generation of servers is called "pre-forking". Instead of starting a new subprocess for each request, they have a pool of subprocesses that they keep around and re-use. …
The fourth generation. One process only. No non-portable threads/LWPs. Sends multiple files concurrently using non-blocking I/O, calling select()/poll()/kqueue() to tell which ones are ready for more data. …
Network stack distinguishes TCP connections by triple <source IP,source port,destination port>, so knowing client address and port is enough to work correctly.
What is the application, if it is not a process? In firewalls you open ports for executables. It may be considered as an application, and it is a process when it is running.
Multiple listeners cannot listen to the same port. The same process can listen to multiple ports.
Ports are assigned to the listeners. Depending on the firewall (and its configuration) you can allow the process (executable) to listen several ports, or to create several exceptions for the same process listening to multiple ports.
I'm not sure what you mean by the difference between a "process" and an "application". Everything is just code executing on your box.
Anyway, a process/application will listen/bind to whatever port number the authors of the application have configured. By convention, many port numbers are reserved for particular types of application - that is applications which communicate using a particular protocol. So for example web servers which use HTTP typically run on port 80. SMTP servers run on port 22. HTTPS is 443 and so on.
Of course you can configure your web server (e.g apache httpd) to run on whatever port you like - but your client needs to know else it will assume port 80.
Two processes/applications may not bind to the same port. If you try to start another process/application on a port already in use you'll get an error: cannot bind to port or something to that effect.
will the server pick other port numbers for sending messages to client?
No. All the accepted sockets use the same server-side port number as the original listening socket. The identifying tuple mentioned above disambiguates this so as to make each connection unique.
精彩评论