Joseph Taiwo

How to Handle Database Connection Pooling

In this article, we will discuss the concept of Connection Pooling, an effective strategy for optimizing server-to-database communication. Implementing connection pooling in an application will reduce the delays in communicating with the database and maximize the usage of processing and storage resources.

To understand why pooling helps, we must first look at how connections normally function in an application.

When a request is made to an application, the server processes the request and sends a query to the database to execute the required database function. The pathway through which this query accesses the database is known as a connection. The normal connection procedure involves opening a new connection for each query that is made to the database; this connection is then closed once the query is executed.

Opening and closing a connection consumes a relatively small amount of resources. In large-scale applications that handle hundreds or thousands of requests, however, this cost balloons to significant amounts, draining system resources and causing lags that degrade the user experience.

Flow diagram demonstrating no connection pool

With the introduction of connection pooling, the server maintains a pool of reusable connections that it uses to access the database. This way, requests made to the application make use of readily available connections that are released back into the pool after the specified function is completed.

Connection pooling eliminates the need to constantly open and close connections, instead using a small amount of resources to maintain the idle connections in the pool. Queries also become noticeably faster, as a connection does not need to be opened each time a query is made to the database.
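
To make the lifecycle concrete, here is a toy sketch of the pooling idea in plain JavaScript. This is purely an illustration, not the pg library: connections are acquired from a list and released back, rather than created and destroyed per query.

```javascript
// Toy illustration of the pooling idea (not the pg library):
// connections live in an idle list and are reused, never destroyed.
class ToyPool {
	constructor(createConn, size) {
		// Open all connections up front
		this.idle = Array.from({ length: size }, createConn);
	}
	acquire() {
		return this.idle.pop(); // hand out an existing connection
	}
	release(conn) {
		this.idle.push(conn); // return it for the next query
	}
}

const pool = new ToyPool(() => ({ id: Math.random() }), 2);
const a = pool.acquire();
pool.release(a);
const b = pool.acquire();
console.log(a === b); // prints true: the same connection object is reused
```

The real pool adds timeouts, queuing, and error handling on top of this basic acquire/release cycle, but the core idea is the same.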

Flow diagram demonstrating connection pool

If the textbook-like definition of connection pooling didn’t work for you, a taxi-system analogy is another way to grasp the concept. Assume the application is a city, with a bunch of people (users) that need to move around (make requests).

In the normal connection procedure, each individual has their own vehicle (each query has its own connection to the database). When a person drives to their destination (executes a query), the vehicle is parked (connection is closed) and the next person (query) that needs to go somewhere (execute) has to find another vehicle (connection) to use.

With this same analogy, you can liken connection pooling to the introduction of a fleet of taxis (database connections) that can move people around (process queries). Even after a person arrives at their destination (a query has executed), the taxi remains (the connection returns to the pool) to transport the next available passenger (to handle other queries made to the database).

With the explanations for connection pooling out of the way, let’s get to actually seeing it in action. Below is the list of technologies used in this demonstration.

The application will run on a Node.js Express server, so some familiarity with JavaScript is expected.

PostgreSQL is a free and open-source database commonly used for applications in production due to its robustness and scalability. While Postgres does not have an in-built connection pooling mechanism, some of its ORMs and clients provide support for the feature.

Install Postgres locally

Start by installing Postgres on your computer.

  • Go to the Postgres website and download the installer for the current version of PostgreSQL.
  • Launch the installer and follow the prompts to set up the application.
  • Select the default options provided, such as the installation directory and the programs to be installed.
  • When prompted, set a password, which will be used to access Postgres databases later on.

With Postgres installed, you can now create a local Postgres database instance on your computer.

Deploy a Postgres instance on Railway

You can also spin up a Postgres database instance on Railway by following these steps:

  • Visit the Railway website and follow the prompts to sign in or create an account.
  • Click on the “+ New” button to start a new project and select “Deploy PostgreSQL.”

Wait a few seconds for the service to get created, and voila, an easy-to-manage Postgres database instance without the need to install a program.

With the Postgres instance up and running, the next step is to create a basic Express server.

Install the express package by running npm install express in the terminal of your project folder, then create an index.js file with the following code.

// Import the express package
const express = require("express");

// Instantiate an app
const app = express();
const port = process.env.PORT || 3000;

// Create a basic endpoint
app.get("/", (req, res) => {
	res.send("Hello World!");
});

// Listen for the server
app.listen(port, () => {
	console.log("Server running on port: ", port);
});

Run node index.js in the terminal to start the server. Confirm that the server is running by visiting localhost:3000 in the browser.

The node-postgres (pg) library supports two types of database connections: normal (client) connections and pooled connections.

Run npm install pg to bring the package into the application, then create configurations for the client and pool connections.

// Destructure Client and Pool from the pg library
const { Client, Pool } = require("pg");

// Variables for configuring our database
const dbConfig = {
	user: "postgres",
	host: "localhost",
	database: "postgres",
	// Password you used when installing postgres
	password: "<password>",
	port: 5432,
};

// // Use this set of variables if you are connecting to Railway
// const dbConfig = {
// 	user: "<PGUSER>",
// 	host: "<PGHOST>",
// 	database: "<PGDATABASE>",
// 	password: "<PGPASSWORD>",
// 	connectionString: "<DATABASE_URL>",
// 	port: <PGPORT>,
// };

// Creating a connection pool requires additional configuration
const poolConfig = {
	max: 20, // maximum number of connections in the pool
	connectionTimeoutMillis: 5000, // how long to wait when acquiring a connection before timing out
	idleTimeoutMillis: 10000 // how long a connection may sit idle in the pool before being closed
};

Obtain the variables for the Railway Postgres instance by clicking on the deployment and navigating to the “Variables” tab.

💡
Visit the official node-postgres site for more information on the variables needed to create clients and pools.
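
Rather than pasting the Railway values directly into the file, you could also read them from environment variables. Here is a hedged sketch of that approach; the variable names assume Railway's defaults (PGUSER, PGHOST, and so on), and the fallback port is the Postgres default.

```javascript
// Sketch: build dbConfig from environment variables instead of
// hard-coding credentials (names assume Railway's default variables).
const dbConfig = {
	user: process.env.PGUSER,
	host: process.env.PGHOST,
	database: process.env.PGDATABASE,
	password: process.env.PGPASSWORD,
	port: Number(process.env.PGPORT || 5432), // default Postgres port
};

module.exports = dbConfig;
```

Keeping credentials out of source code also means the same file works unchanged against both the local and the Railway instance.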

Next, we’ll write functions that will create each of these connections and make a simple query with them.

const pool = new Pool({
	...dbConfig,
	...poolConfig,
});

// Function for normal connection
const normConnection = async () => {
	const client = new Client(dbConfig);
	// Open a connection that allows the client to access the database
	await client.connect();
	try {
		// Makes a query with the client
		await client.query("SELECT * FROM users");
	} catch (err) {
		console.error(err);
	} finally {
		// Close the client connection after making query
		await client.end();
	}
};

// Function for pooled connection
const poolConnection = async () => {
	// Checks out an idle connection in the pool to access the database
	const client = await pool.connect();
	try {
		// Makes a query with the client from the pool
		await client.query("SELECT * FROM users");
	} catch (err) {
		console.error(err);
	} finally {
		// Return the connection to the pool
		client.release();
	}
};

Notice how the pool instance is created outside the poolConnection function, while the client instance is created within the normConnection function. The pool must be created once and shared so its connections can be reused across calls, whereas a new client is deliberately opened and closed on every call to mimic the normal connection procedure.

// Accessing this endpoint makes 100 requests with normal connections
app.get("/norm", async (req, res) => {
	for (let i = 0; i < 100; i++) {
		normConnection();
	}
	res.send("Made 100 database requests with normal connections");
});

// Accessing this endpoint makes 100 requests with pooled connections
app.get("/pool", async (req, res) => {
	for (let i = 0; i < 100; i++) {
		poolConnection();
	}
	res.send("Made 100 database requests with pooled connections");
});

The endpoints above call each connection function 100 times to simulate an application under production load.

The query used in the previous section (SELECT * FROM users) retrieves all the rows in a users table, which we have not yet created. Now, we’ll create a function called dbSetup that will set up the users table and add dummy users.

const dbSetup = async () => {
	// Client connection is used to make these requests
	const client = new Client(dbConfig);
	await client.connect();
	try {
		// await client.query("DROP TABLE users"); // Uncomment after running the first time
		// Creates a table of users with name and age columns
		await client.query(`
			CREATE TABLE users (
				name VARCHAR(100) NOT NULL,
				age INTEGER NOT NULL
			)
		`);
		// Inserts 100 dummy users into the database
		for (let i = 0; i < 100; i++) {
			await client.query("INSERT INTO users (name, age) VALUES ($1, $2)", [
				`person${i}`,
				i,
			]);
		}
		// Returns the number of users in the database, for confirmation
		const userCount = await client.query("SELECT * FROM users");
		return userCount.rowCount;
	} catch (err) {
		console.error(err);
	} finally {
		// Close the client connection once setup is done
		await client.end();
	}
};

app.listen(port, async () => {
	// Calls the dbSetup function once server is running
	console.log(await dbSetup())
	console.log("Server running on port: ", port);
});

Notice the commented-out line for dropping the users table. If that line isn’t commented out on the first run, the function will error because the users table doesn’t exist yet. Likewise, if it stays commented out on subsequent runs, the function will error when it attempts to create a table that already exists.

The code is set up this way so that each run of the application works with a fresh database table, ensuring consistency in the results.
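
If the manual commenting feels brittle, one alternative (a sketch, not the approach used in this article) is to make the setup idempotent with Postgres’s IF EXISTS clause, so the same code works on every run without editing:

```javascript
// Sketch: an idempotent version of the setup query. DROP ... IF EXISTS
// is a no-op when the table is missing, so no commenting in and out
// is needed between runs.
const setupSQL = `
	DROP TABLE IF EXISTS users;
	CREATE TABLE users (
		name VARCHAR(100) NOT NULL,
		age INTEGER NOT NULL
	);
`;

// Inside dbSetup, this single call would replace the separate
// DROP and CREATE queries:
// await client.query(setupSQL);
```

The hard-coded version above is kept in the article because it makes the first-run/subsequent-run distinction explicit.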

Prom-client is a Prometheus client for Node.js with support for the basic metric types. The metrics we’ll get using this library are more in-depth than what we’d get from the Node.js performance API alone.

Run npm install prom-client in the terminal and add the following modifications to the code in the index.js file.

// Import the Registry and the Histogram metric type
const { Registry, Histogram } = require("prom-client")

// Instantiate a new Registry in the register variable
const register = new Registry();

// Histogram for normal connections
const normalTimer = new Histogram({
	name: "normal_connection_duration",
	help: "Duration of queries to the database with normal connections in seconds",
	buckets: [
		0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01,
	],
});
// Histogram for pooled connections
const pooledTimer = new Histogram({
	name: "pooled_connection_duration",
	help: "Duration of queries to the database with pooled connections in seconds",
	buckets: [
		0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01,
	],
});

// Register our histogram metrics to the registry instance
// So our register can collect metrics from them
register.registerMetric(normalTimer);
register.registerMetric(pooledTimer);

A registry serves as the base storage for all the metrics collected from an application. In the code above, we created two Histogram metrics, one for each connection type, and added them to the application’s register.

💡
Visit the Prom-client documentation to learn more about registers and metrics.

The Histogram is a metric that collects the time taken to complete events and puts them into groups called buckets. Database queries in this setting usually take about 1 to 10 milliseconds, but since buckets are measured in seconds, we used 0.001 to 0.010.

Next, we will modify the connection functions to collect metrics and add a “/metrics” endpoint for accessing those metrics.

const normConnection = async () => {
	const client = new Client(dbConfig);
	await client.connect();
	try {
		// Start timer right before query is made
		const end = normalTimer.startTimer();
		await client.query("SELECT * FROM users");
		// End timer immediately after
		end();

		// Return the total time recorded for this connection type so far
		return normalTimer.hashMap[""].sum;
	} catch (err) {
		console.error(err);
	} finally {
		await client.end();
	}
};

const poolConnection = async () => {
	const client = await pool.connect();
	try {
		// Start timer right before query is made
		const end = pooledTimer.startTimer();
		await client.query("SELECT * FROM users");
		// End timer immediately after
		end();
		
		// Return the total time recorded for this connection type so far
		return pooledTimer.hashMap[""].sum;
	} catch (err) {
		console.error(err);
	} finally {
		await client.release();
	}
};

// "/norm" and "/pool" routes come before the "/metrics" route

Calling the startTimer method returns a function that stops the timer for the metric it is called on. Only the actual database query is placed between startTimer and end, ensuring that the collected metrics cover the queries alone.

💡
Try calling any of the connection functions to see how long a single query with that connection type takes.

Lastly, create a “/metrics” endpoint that collects the metrics from the register and logs them to the console.
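
A minimal sketch of what that endpoint could look like, placed in index.js alongside the other routes (register.metrics() is the prom-client method that resolves to the Prometheus text exposition format):

```javascript
// Sketch: expose and log the collected metrics.
app.get("/metrics", async (req, res) => {
	const metrics = await register.metrics();
	// Log to the console for quick inspection
	console.log(metrics);
	res.set("Content-Type", register.contentType);
	res.send(metrics);
});
```

Setting the content type from register.contentType also lets a real Prometheus server scrape this endpoint directly.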

Start the application server by running node index.js in the terminal. Once the database is connected and the server is running, make GET requests to the “/norm”, “/pool”, and finally, the “/metrics” endpoint. The following results should be logged to the console:

Observing data from Prometheus metrics endpoint

💡
Note: The numbers displayed above will likely be higher if you use the PostgreSQL server on Railway. This disparity is because of the network latency between your local application and the database instance on Railway.

First observation

The metrics labelled normal_connection_duration_sum and pooled_connection_duration_sum represent the total time taken to complete the normal and pooled queries, respectively. As can be seen, the pooled connection queries took significantly less time (almost 3 times faster) than the normal connection queries.

Buckets in a histogram operate with a “less than or equal” order. This means the queries counted in a bucket are those that took less than or equal to the bucket’s upper limit. This feature makes more sense when visualizing the metrics with an actual histogram.
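
A quick way to see this cumulative counting in action, in plain JavaScript rather than prom-client itself:

```javascript
// Each bucket counts every observation less than or equal to its
// upper bound, so counts only ever grow from one bucket to the next.
const buckets = [0.001, 0.002, 0.005, 0.01]; // upper bounds, in seconds
const durations = [0.0008, 0.0015, 0.003, 0.02]; // observed query times

const bucketCounts = buckets.map((le) => ({
	le,
	count: durations.filter((d) => d <= le).length,
}));
console.log(bucketCounts);
// counts come out as 1, 2, 3, 3: the 0.0008s query appears in every
// bucket, while the 0.02s query exceeds them all and would only be
// counted in the implicit "+Inf" bucket that Prometheus always adds.
```

This is why the per-bucket numbers in the metrics output are cumulative rather than exclusive counts.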

Second observation

Another observation we can make from the metrics is in the time taken to complete individual queries.

With normal connections, no queries completed in 2 milliseconds (0.002 seconds) or less, and only 98 queries completed under 10 milliseconds (0.010 seconds).

With pooled connections, 54 queries took less than a millisecond, and all of them completed in 5 milliseconds or less.

With these apparent optimization results, you might wonder why normal database connections are still used at all if connection pooling is so much better. The answer lies in the following drawbacks of connection pooling.

  • Pool exhaustion: If the connection pool is smaller than the application workload, it will eventually run out of connections, leading to a standstill in the application until one connection becomes idle, or more connections are added to the pool.
  • Idle connections: If the connection pool is bigger than the application workload, there will be too many idle connections sitting unused, wasting resources. For these reasons, it is important to choose the pool size that best fits your application.
  • Vulnerability: Connection pools could potentially be exploited as a security vulnerability, because the connections are shared by different modules and components.
  • Additional component: Since all requests pass through the connection pool, it can serve as a single point of failure that halts the entire application process. This means the connection pool is an additional component that must be maintained, fine-tuned and updated.
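
On the pool-exhaustion point, the connectionTimeoutMillis setting used in poolConfig earlier makes pool.connect() reject when no connection frees up in time, so callers can fail fast instead of queueing forever. Here is a hedged sketch of such a guard; the guardedQuery helper is our own name, not part of pg:

```javascript
// Sketch: fail fast when the pool is exhausted. With
// connectionTimeoutMillis set, pool.connect() rejects instead of
// waiting indefinitely for a free connection.
const guardedQuery = async (pool, sql) => {
	let client;
	try {
		client = await pool.connect(); // rejects on connection timeout
		return await client.query(sql);
	} catch (err) {
		console.error("Query failed or pool exhausted:", err.message);
		return null;
	} finally {
		// Only release if we actually checked out a connection
		if (client) client.release();
	}
};
```

Callers can then treat a null result as back-pressure, returning an error response instead of letting requests pile up behind a full pool.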

While the benefits of connection pooling generally outweigh its potential drawbacks, it may not be worth implementing in small applications that are better off without the extra complexity.

Databases remain an integral part of most applications, and giving users a fast, seamless experience means turning to techniques like connection pooling. Implementing it ensures that your application responds quickly and can handle peak traffic periods without suffering outages.

💡
Visit this GitHub Repo to get the code for the full index.js file.
  • This article used the node-postgres library to demonstrate the power of the connection pooling strategy. For more demanding use cases, you might want to consider external connection poolers like PgBouncer, which provide a more efficient implementation.
  • Check out other application optimization techniques like Caching, a strategy that stores frequently accessed queries in memory to optimize read performance. Caching works well with connection pooling to improve the overall performance of applications with heavy workloads.
  • The metrics obtained from Prometheus are in numeric format and can be a bit difficult to make out. Consider using tools like Grafana to visualize the metrics and get more vivid representations of the performance and optimizations.
  • If you used the Postgres database instance provided by Railway, consider moving your project to the same deployment to ensure seamless connection between the two. And if you didn’t use Railway, check them out and explore their wonderful hosting services!

Thank you for reading, and happy building!