In this article I would like to share the newest addition to our technology stack “PG Bouncer” Background We at housing.com, have a service oriented architecture with most of our services backed by a RoR stack, deployed on a Unicorn web server, and an array of data stores like PostgreSQL(primary), Elasticsearch, Aerospike, Redis, Mongo etc. Why Unicorn? […]
In this article I would like to share the newest addition to our technology stack “PG Bouncer”
We at housing.com, have a service oriented architecture with most of our services backed by a RoR stack, deployed on a Unicorn web server, and an array of data stores like PostgreSQL(primary), Elasticsearch, Aerospike, Redis, Mongo etc.
The limiting factor of the Ruby concurrency model is the Global Interpreter Lock. This feature of Ruby prevents more than 1 thread of Ruby from executing at any given time per process. Unicorn is a Rack HTTP server that uses forked processes to handle multiple incoming requests concurrently.Unicorn forks multiple OS processes within each dyno to allow a Rails app to support multiple concurrent requests without requiring them to be thread-safe.
Exactly one persistent connection to the database is established per worker process (since every worker process serves one request at a time). This design comes with following flaws:
PgBouncer is an open-source, lightweight, single-binary connection pooler for PostgreSQL. It can pool connections to one or more databases (on possibly different servers) and serve clients over TCP and Unix domain sockets. PgBouncer maintains a pool of connections for each unique user, database pair.
We had 3 options for where to deploy this layer:
Adding PgBouncer on the database instance itself led to performance issues since load on database would impact the connection pool as well. Adding an intermediate layer would add an additional hop thereby adding latency and a maintenance overhead. So we went ahead with option 3 and since PgBouncer is a very lightweight process, it has minimal impact on the RoR application.
Steps to install pgbouncer can be found here. The basic configuration file for PgBouncer is the pgbouncer.ini . A sample configuration file looks like this
There are two sections in the config:
The detailed list for all the settings can be found here. However, lets dive into some of the key configurations.
auth_type : Pgbouncer supports multiple auth types for client authecntication. Some of them are plain, md5, trust, cert, hba. We used host based authentication. A sample hba.conf file would look like :
default_pool_size: How many server connections to allow per user/database pair. Can be overridden in the per-database configuration with pool_size property
query_wait_timeout: Maximum time queries are allowed to spend waiting for execution. If the query is not assigned to a server during that time, the client is disconnected. It also helps when the server is down or database rejects connections for any reason.
pool_mode: PgBouncer can be configured with three different types of pooling:
Transaction pooling suited our needs but it comes with its own issues like no support for prepared statements. The entire list of features supported for various pooling modes can be found here
The two properties pool_size and query_wait_timeout in conjunction resolve our two problems mentioned earlier. For one of our applications running 15 worker process on a unicorn server, we configured the pool size of 8 and a query_wait_timeout of 50ms. This has helped us reducing the number of idle connections to half and in rare cases of latent or unresponsive db, 7 of our worker process are free to process requests that interact with ES or cache and are not dependent on Postgres.
Finally we make the above changes to the application’s database.yml and voila!!