I can't understand polling/select in Python

2023-04-05 10:42

I'm doing some threaded, asynchronous networking experiments in Python, using UDP.

I'd like to understand polling and Python's select module; I've never used them in C/C++.

What are they for? I kind of understand select a little, but does it block while watching a resource? What is the purpose of polling?

Okay, one question at a time.

What are those for?

Here is a simple socket server skeleton:

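import socket

s_sock = socket.socket()
s_sock.bind(("0.0.0.0", 8000))  # example address and port
s_sock.listen()

while True:
    # accept() blocks until a client connects
    c_sock, c_addr = s_sock.accept()
    process_client_sock(c_sock, c_addr)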

The server loops, accepting a connection from a client and then calling process_client_sock to communicate over the client socket. There is a problem here: process_client_sock might take a long time, or even contain a loop (which is often the case).

def process_client_sock(c_sock, c_addr):
    while True:
        receive_or_send_data(c_sock)

In that case, the server cannot accept any more connections.

A simple solution is to use multiple processes or threads: create a new thread to handle each client, while the main loop keeps listening for new connections.

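import socket
from threading import Thread

s_sock = socket.socket()
s_sock.bind(("0.0.0.0", 8000))  # example address and port
s_sock.listen()

while True:
    c_sock, c_addr = s_sock.accept()
    # Handle each client in its own thread so accept() can run again at once
    thread = Thread(target=process_client_sock, args=(c_sock, c_addr))
    thread.start()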

This works, of course, but it does not perform well enough. Each new process or thread costs extra CPU and memory, which is not ideal for servers that might get thousands of connections.

The select and poll system calls try to solve this problem. You give select a set of file descriptors and ask it to notify you when any of them is ready to read, ready to write, or has an exceptional condition.
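To make this concrete, here is a minimal sketch of one pass of a single-threaded server built on select.select, replacing the thread-per-client loop. The helper name serve_once, the echo behavior, and the 1024-byte read size are assumptions made for illustration; they are not from the original answer.

```python
import select
import socket


def serve_once(s_sock, clients, timeout=1.0):
    """Run one pass of a select-based loop (hypothetical helper).

    s_sock is a listening TCP socket; clients is a list of connected
    sockets that this function updates in place.
    """
    # Wait until the listener or any client is readable, or the timeout expires.
    readable, _, _ = select.select([s_sock] + clients, [], [], timeout)
    for sock in readable:
        if sock is s_sock:
            # A readable listening socket has a pending connection.
            c_sock, _c_addr = s_sock.accept()
            clients.append(c_sock)
        else:
            data = sock.recv(1024)
            if data:
                sock.sendall(data)  # echo back, purely as an example
            else:
                # Empty bytes means the peer closed the connection.
                clients.remove(sock)
                sock.close()
    return len(readable)
```

A real server would call serve_once(s_sock, clients) inside a while True loop. No client can starve the others here, because the thread only touches sockets that select has reported as ready.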

does it (select) block while watching a resource?

Yes or no, depending on the timeout parameter you pass to it.

As the select man page says, it takes a struct timeval parameter:

int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

struct timeval {
    long    tv_sec;         /* seconds */
    long    tv_usec;        /* microseconds */
};

There are three cases:

timeout.tv_sec == 0 and timeout.tv_usec == 0
    Non-blocking: select returns immediately.

timeout == NULL
    Blocks indefinitely until a file descriptor is ready.

timeout holds a non-zero value
    Waits up to that long; if no file descriptor is ready by then, select times out and returns.
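Python's select.select exposes the same three behaviors through its optional fourth argument, given in seconds. The following is a small sketch that uses a socket pair so that it can run on its own; the variable names and the 0.2 second wait are assumptions chosen for illustration.

```python
import select
import socket
import time

# Two connected sockets; nothing has been sent yet, so `a` is not readable.
a, b = socket.socketpair()

# timeout of 0: poll and return immediately.
ready_now, _, _ = select.select([a], [], [], 0)

# Positive timeout: wait up to that many seconds, then give up.
start = time.monotonic()
ready_later, _, _ = select.select([a], [], [], 0.2)
waited = time.monotonic() - start

# timeout of None (or omitted): block until something is ready.
# Send a message first so that this call is able to return.
b.send(b"ping")
ready_blocking, _, _ = select.select([a], [], [], None)

a.close()
b.close()
```

Here ready_now and ready_later come back empty because nothing was readable, while ready_blocking contains the socket that received the message.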

What is the purpose of polling?

Put simply: polling frees the CPU for other work while waiting for I/O.

This is based on two simple facts:

The CPU is far faster than I/O.
Waiting for I/O is a waste of time, because the CPU sits idle for most of it.

Hope it helps.

If you call read or recv, you're waiting on only one connection. If you have multiple connections, you have to create multiple processes or threads, which wastes system resources.

With select or poll or epoll, you can monitor multiple connections with only one thread, and get notified when any of them has data available, and then you call read or recv on the corresponding connection.
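Since the question is about UDP, here is a rough sketch of one thread watching two UDP sockets at once. It uses the standard library's selectors module, which wraps select, poll, or epoll depending on what the platform offers. The helper names make_udp_socket and poll_udp, the labels "a" and "b", and the 2048-byte buffer size are hypothetical choices for the example.

```python
import selectors
import socket


def make_udp_socket(port=0):
    """Create a non-blocking UDP socket on localhost (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", port))  # port 0 lets the OS pick a free port
    sock.setblocking(False)
    return sock


def poll_udp(sel, timeout=1.0):
    """Return (label, payload) for each registered socket with a datagram waiting."""
    received = []
    # select() blocks for at most `timeout` seconds; 0 means do not block.
    for key, _events in sel.select(timeout):
        payload, _sender = key.fileobj.recvfrom(2048)
        received.append((key.data, payload))
    return received


sel = selectors.DefaultSelector()
sock_a = make_udp_socket()
sock_b = make_udp_socket()
# Ask to be told when either socket becomes readable.
sel.register(sock_a, selectors.EVENT_READ, data="a")
sel.register(sock_b, selectors.EVENT_READ, data="b")
```

Calling poll_udp(sel) in a loop gives one thread a view of both sockets: it only calls recvfrom on sockets that already have a datagram queued.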

It may block indefinitely, block for a given time, or not block at all, depending on the arguments.

select() takes in 3 lists of sockets to check for three conditions (read, write, error), then returns (usually shorter, often empty) lists of sockets that actually are ready to be processed for those conditions.

import select
import socket

# Local_IP, Port1, Port2 and timeout are placeholders; define them for your setup.
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind((Local_IP, Port1))
s1.listen(5)

s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s2.bind((Local_IP, Port2))
s2.listen(5)

sockets_that_might_be_ready_to_read = [s1, s2]
sockets_that_might_be_ready_to_write_to = [s1, s2]
sockets_that_might_have_errors = [s1, s2]

# Pass the lists themselves (not wrapped in another list); select returns three lists.
ready_to_read, ready_to_write, has_errors = select.select(
    sockets_that_might_be_ready_to_read,
    sockets_that_might_be_ready_to_write_to,
    sockets_that_might_have_errors,
    timeout)

for sock in ready_to_read:
    # A readable listening socket has a pending connection, so accept() won't block
    c, a = sock.accept()
    # Read from the new connection, not from the listening socket
    data = c.recv(128)
    ...
for sock in ready_to_write:
    # process writes
    ...
for sock in has_errors:
    # process errors
    ...

So if no socket has a pending connection after waiting timeout seconds, ready_to_read will be empty, and it doesn't matter whether accept() and recv() would block: they are never called for an empty list.

If a socket is ready to read, then it has something waiting (for a listening socket, a pending connection), so the call won't block then, either. A recv() on a connection that select has not reported as readable can still block.
