I can't understand polling/select in python
I'm doing some threaded, asynchronous networking experiments in Python, using UDP.
I'd like to understand polling and Python's select module; I've never used them in C/C++.
What are those for? I kind of understand select a little, but does it block while watching a resource? What is the purpose of polling?
Okay, one question at a time.
What are those for?
Here is a simple socket server skeleton:
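import socket

# HOST and PORT are placeholders for the address to listen on.
s_sock = socket.socket()
s_sock.bind((HOST, PORT))
s_sock.listen()
while True:
    # accept() blocks until a client connects.
    c_sock, c_addr = s_sock.accept()
    process_client_sock(c_sock, c_addr)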
The server loops, accepting a connection from a client and then calling process_client_sock to communicate with the client socket. There is a problem here: process_client_sock might take a long time, or even contain a loop (which is often the case).
def process_client_sock(c_sock, c_addr):
    while True:
        receive_or_send_data(c_sock)
In that case, the server cannot accept any more connections until the function returns.
A simple solution is to use multiple processes or threads: create a new thread to deal with each request, while the main loop keeps listening for new connections.
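import socket
from threading import Thread

# HOST and PORT are placeholders for the address to listen on.
s_sock = socket.socket()
s_sock.bind((HOST, PORT))
s_sock.listen()
while True:
    c_sock, c_addr = s_sock.accept()
    # Hand each client to its own thread so the loop can accept the next one.
    thread = Thread(target=process_client_sock, args=(c_sock, c_addr))
    thread.start()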
This works, of course, but it does not perform well enough. Each new process or thread costs extra CPU and memory, which is not ideal for servers that might handle thousands of connections.
The select and poll system calls try to solve this problem. You give select a set of file descriptors and ask it to notify you when any fd is ready to read or write, or when an exceptional condition occurs on it.
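As a rough sketch of that idea in Python, here is a single-threaded echo server built on select.select. The helper names make_listener and serve_once are made up for this example, and the loopback address with port 0 (let the OS pick a free port) is just an assumption to keep it self-contained.

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    return listener


def serve_once(listener, clients, timeout=1.0):
    """Run one pass of the select loop: accept new clients, echo ready data.

    Returns the number of sockets that were reported readable.
    """
    readable, _, _ = select.select([listener] + clients, [], [], timeout)
    for sock in readable:
        if sock is listener:
            # A pending connection makes the listening socket readable.
            conn, _addr = listener.accept()
            clients.append(conn)
        else:
            data = sock.recv(4096)
            if data:
                # Echo back; fine for small messages, though sendall()
                # could stall if the peer stops reading.
                sock.sendall(data)
            else:
                # An empty read means the peer closed the connection.
                clients.remove(sock)
                sock.close()
    return len(readable)
```

A real server would call it in a loop, something like `clients = []` followed by `while True: serve_once(listener, clients)`. No threads are created, however many clients connect.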
Does it (select) block while watching a resource?
Yes or no, depending on the parameter you pass to it.
As the select man page says, it takes a struct timeval parameter:
int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

struct timeval {
    long tv_sec;   /* seconds */
    long tv_usec;  /* microseconds */
};
There are three cases:
- timeout.tv_sec == 0 and timeout.tv_usec == 0: non-blocking, returns immediately.
- timeout == NULL: blocks until a file descriptor is ready.
- timeout is a normal (non-zero) value: waits for up to that long; if no file descriptor becomes ready, select times out and returns.
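Python's select.select has the same three behaviours through its optional fourth argument, a timeout in seconds. A small sketch, using socket.socketpair() only as a convenient way to get two connected sockets for the demonstration:

```python
import select
import socket
import time

a, b = socket.socketpair()  # two connected sockets; nothing has been sent yet

# timeout=0: poll and return immediately, even though nothing is ready.
poll_result, _, _ = select.select([a], [], [], 0)

# timeout=0.2: wait up to 0.2 seconds, then give up with empty lists.
start = time.monotonic()
timed_result, _, _ = select.select([a], [], [], 0.2)
elapsed = time.monotonic() - start

# timeout omitted (or None): block until something is ready.
# Data is sent first here so the call returns right away instead of hanging.
b.sendall(b"x")
blocking_result, _, _ = select.select([a], [], [])

print(poll_result == [], timed_result == [], blocking_result == [a])

a.close()
b.close()
```

Without the sendall() call, the last select() would wait forever, which is the NULL timeout case from the man page.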
What is the purpose of polling?
Put simply: polling frees the CPU for other work while waiting for IO.
This is based on two simple facts:
- The CPU is far faster than IO.
- Waiting for IO is a waste of time, because the CPU would be idle for most of it.
Hope it helps.
If you call read or recv, you're waiting on only one connection. If you have multiple connections, you have to create multiple processes or threads, which wastes system resources.
With select, poll or epoll, you can monitor multiple connections with only one thread and get notified when any of them has data available; you then call read or recv on the corresponding connection.
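Since the question mentions UDP, here is a minimal sketch of that pattern with select.poll watching several UDP sockets from one thread. The helper names make_udp_socket and receive_ready are invented for illustration, and select.poll is not available on every platform (notably Windows); the standard selectors module is the portable alternative.

```python
import select
import socket


def make_udp_socket(host="127.0.0.1"):
    """Create a UDP socket bound to an OS-chosen port on the given host."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, 0))
    return sock


def receive_ready(socks, timeout_ms=1000):
    """Wait up to timeout_ms, then read one datagram from each ready socket.

    Returns a list of (socket, data, sender_address) tuples.
    """
    poller = select.poll()
    by_fd = {}
    for sock in socks:
        # poll() works with file descriptors, so keep a map back to sockets.
        poller.register(sock.fileno(), select.POLLIN)
        by_fd[sock.fileno()] = sock
    received = []
    for fd, event in poller.poll(timeout_ms):
        if event & select.POLLIN:
            # recvfrom() will not block here: poll reported data waiting.
            data, addr = by_fd[fd].recvfrom(2048)
            received.append((by_fd[fd], data, addr))
    return received
```

In a long-running program you would create the poll object and register the sockets once, then call poll() repeatedly; it is rebuilt on each call here only to keep the sketch short.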
It may block indefinitely, block for a given time, or not block at all, depending on the arguments.
select() takes three lists of sockets to check for three conditions (read, write, error), then returns (usually shorter, often empty) lists of the sockets that are actually ready to be processed for those conditions.
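import select
import socket

# Local_IP, Port1, Port2 and timeout are placeholders to be defined by the caller.
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind((Local_IP, Port1))
s1.listen(5)
s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s2.bind((Local_IP, Port2))
s2.listen(5)
sockets_that_might_be_ready_to_read = [s1, s2]
sockets_that_might_be_ready_to_write_to = [s1, s2]
sockets_that_might_have_errors = [s1, s2]
# select() takes the three lists directly and returns three (possibly empty) lists.
ready_to_read, ready_to_write, has_errors = select.select(
    sockets_that_might_be_ready_to_read,
    sockets_that_might_be_ready_to_write_to,
    sockets_that_might_have_errors,
    timeout)
for sock in ready_to_read:
    # A listening socket is "readable" when a connection is waiting to be accepted.
    c, a = sock.accept()
    # Receive on the accepted connection, not on the listening socket. This call
    # can still wait for the client; a fuller version would add c to the read list.
    data = c.recv(128)
    ...
for sock in ready_to_write:
    # process writes
    ...
for sock in has_errors:
    # process errors
    ...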
So if no socket has an attempted connection after waiting timeout seconds, the list ready_to_read will be empty, at which point it doesn't matter whether accept() and recv() would block: they won't get called for the empty list.
If a socket is reported as ready to read, the corresponding call won't block either: a listening socket has a pending connection for accept(), and a connected socket has data for recv().
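To see that behaviour for yourself, here is a small self-contained check. It assumes a loopback listener on an OS-chosen port and plays both the server and the client role in one script, which is only for demonstration.

```python
import select
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen()

# No client yet: after the timeout the ready list is empty,
# so a loop over it would never reach accept().
ready_before, _, _ = select.select([listener], [], [], 0.1)

client = socket.create_connection(listener.getsockname())
# A pending connection makes the listener readable, so accept() returns at once.
ready_after, _, _ = select.select([listener], [], [], 1.0)
conn, addr = listener.accept()

# The new connection has no data yet, so it is not reported as readable.
ready_conn_before, _, _ = select.select([conn], [], [], 0.1)

client.sendall(b"hello")
# Now the connection is readable and recv() returns without blocking.
ready_conn_after, _, _ = select.select([conn], [], [], 1.0)
data = conn.recv(128)

print(ready_before == [], ready_after == [listener],
      ready_conn_before == [], ready_conn_after == [conn], data)

for s in (client, conn, listener):
    s.close()
```

Note the third select() call: a freshly accepted connection is not readable until the client actually sends something, which is why accepted sockets are normally added to the read list rather than read from straight away.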