
How can I determine what my database's connection limits should be?

At my organization, PostgreSQL databases are created with a 20-connection limit as a matter of policy. This interacts poorly with connection pooling: once several pooled applications are talking to the same database, many of them open their full complement of connections up front and hold them idle.

As soon as there are more than a couple of applications in contact with the DB, we run out of connections, as you'd expect.
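To put numbers on it (the figures here are hypothetical; I'm using SQLAlchemy's QueuePool defaults of pool_size=5 and max_overflow=10 for illustration):

    # Back-of-envelope check with hypothetical figures.
    # SQLAlchemy's QueuePool defaults: pool_size=5, max_overflow=10.
    pool_size = 5
    max_overflow = 10
    apps = 3  # say, two web server processes plus the celery task runner

    peak_per_app = pool_size + max_overflow  # 15 connections at peak
    total_peak = apps * peak_per_app         # 45 connections

    db_limit = 20
    print("peak demand: %d, limit: %d" % (total_peak, db_limit))
    # peak demand: 45, limit: 20 -- exhausted before any single pool is full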

Pooling behaviour is a new thing here; until now we've managed database connections either by serializing access to them through a web-based DB gateway (?!) or by not pooling at all. As a consequence, I'm having to explain over and over again how the pooling works (literally, 5 trouble tickets from one person over the course of the project).

What I want is one of the following:

  1. A solid, inarguable rationale for increasing the number of available connections to the database in order to play nice with pools.

    If so, what's a safe limit? Is there any reason to keep the limit at 20?

  2. A reason why I'm wrong and we should cut the size of the pools down or eliminate them altogether.

For what it's worth, here are the components in play. If it's relevant how one of these is configured, please weigh in:

DB: PostgreSQL 8.2. No, we won't be upgrading it as part of this.

Web server: Python 2.7, Pylons 1.0, SQLAlchemy 0.6.5, psycopg2

  • This is complicated by the fact that some parts of the system access data through the SQLAlchemy ORM using a manually configured engine, while others use a different engine factory (still SQLAlchemy) written by one of my associates, which wraps the connection in an object that matches an old PHP API. (A sketch of how such an engine's pool can be bounded follows this list.)

Task runner: Python 2.7, celery 2.1.4, SQLAlchemy 0.6.5, psycopg2
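
For reference, a minimal sketch of how an engine's pool can be bounded in SQLAlchemy 0.6 with psycopg2 (the pool numbers and the URL are illustrative placeholders, not our real values or a recommendation):

    from sqlalchemy import create_engine

    # Minimal sketch (SQLAlchemy 0.6 / psycopg2). Pool numbers and the
    # URL are illustrative placeholders, not recommended values.
    engine = create_engine(
        'postgresql://user:password@dbhost/dbname',
        pool_size=5,        # connections the pool keeps open once used
        max_overflow=2,     # extra connections allowed under burst load
        pool_recycle=3600,  # close and reopen connections after an hour
    )

Each process that creates an engine like this can hold up to pool_size + max_overflow connections, so that total, summed across processes, has to fit inside the server-side limit.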


I think it's reasonable to require one connection per concurrent activity, and reasonable to assume that HTTP requests arriving concurrently are executed concurrently.

Now, the number of concurrent HTTP requests you want to process should scale with a) the load on your server, and b) the number of CPUs available. If all goes well, each request consumes CPU time somewhere (in the web server, the application server, or the database server), which means you can't usefully process more requests concurrently than you have CPUs. In practice, not everything goes that well: some requests wait on IO at some point and consume no CPU while doing so, so it's fine to process somewhat more concurrent requests than you have CPUs.

Still, assuming you have, say, 4 CPUs, allowing 20 concurrent requests is already quite a load. I'd rather throttle HTTP requests than increase the number of requests that can be processed concurrently. If you find that a single request needs more than one connection, you have a flaw in your application.
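
In a Pylons setup served by Paste's HTTP server, one way to throttle is to cap the worker thread pool in the deployment ini (a sketch; the host, port, and worker count are illustrative):

    [server:main]
    use = egg:Paste#http
    host = 0.0.0.0
    port = 5000
    use_threadpool = true
    # at most 8 requests in flight per process
    threadpool_workers = 8

With N worker threads per process, N is also an upper bound on the connections that process needs, which makes the budget arithmetic straightforward.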

So my recommendation is to cope with the limit, and make sure that there are not too many idle connections (compared to the number of requests that you are actually processing concurrently).
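
To see how many connections are actually sitting idle, you can ask PostgreSQL itself. On 8.2 there is no "state" column in pg_stat_activity; idle backends report current_query = '<IDLE>' instead (the connection parameters below are placeholders):

    import psycopg2

    # Count idle vs. total backends on PostgreSQL 8.2, where idle
    # sessions show current_query = '<IDLE>'. (You may need superuser
    # rights to see other users' queries.)
    conn = psycopg2.connect(host='dbhost', dbname='dbname', user='user')
    cur = conn.cursor()
    cur.execute("""
        SELECT sum(CASE WHEN current_query = '<IDLE>' THEN 1 ELSE 0 END),
               count(*)
        FROM pg_stat_activity
    """)
    idle, total = cur.fetchone()
    print("idle connections: %s of %s" % (idle, total))
    cur.close()
    conn.close()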
