MeshWorld.

PostgreSQL Performance Tuning: Query Plans, Indexes, and Connection Pooling

By Rachel

Most PostgreSQL performance problems have the same handful of root causes. Sequential scans on large tables. N+1 query patterns from ORMs. Connection pool exhaustion under load. Autovacuum falling behind on high-write tables. Once you know how to read query plans and where to look, you can diagnose most slowdowns in under an hour.

:::note[TL;DR]

  • EXPLAIN ANALYZE is your primary diagnostic tool — look for Seq Scans with high row counts and nested loops at scale
  • The most common causes of slow queries: missing indexes, N+1 patterns, poor join order on large tables
  • pg_stat_statements extension identifies the slowest queries across your entire database
  • PgBouncer in transaction-mode pooling reduces connection overhead by 80–90% under high concurrency
  • Autovacuum defaults work for most tables, but high-write tables need tuned thresholds

:::

Prerequisites

  • PostgreSQL 14+ (most features here apply to pg 12+ as well)
  • Superuser or pg_monitor role for system views
  • pg_stat_statements extension available (ships with PostgreSQL, just needs enabling)

How do you find the slow queries?

Before diving into individual queries, find the worst offenders across the whole database.

Enable pg_stat_statements in postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'

Restart PostgreSQL (shared_preload_libraries requires a full restart, not a reload), then create the extension:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

Now query the slowest queries by total time:

SELECT
  round(total_exec_time::numeric, 2) AS total_ms,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct_total,
  query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

total_exec_time shows cumulative time — a query that runs a million times and takes 0.1ms each contributes 100,000ms total. mean_exec_time shows per-call cost. Both matter: high total means it’s consuming overall database capacity; high mean means individual calls are slow.

Reset stats after tuning to measure improvements from a clean baseline:

SELECT pg_stat_statements_reset();

How do you read EXPLAIN ANALYZE?

EXPLAIN shows what PostgreSQL plans to do. EXPLAIN ANALYZE runs the query and shows what it actually did.

EXPLAIN ANALYZE
SELECT o.*, u.name
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending'
ORDER BY o.created_at DESC
LIMIT 50;

What to look for in the output:

Seq Scan on a large table — PostgreSQL read every row. Unless the table is small (under a few thousand rows) or the query returns most rows anyway, this usually means a missing index.

Seq Scan on orders  (cost=0.00..18340.00 rows=240 width=120)
                    (actual time=0.03..412.50 rows=240 loops=1)
  Filter: ((status)::text = 'pending'::text)
  Rows Removed by Filter: 847620

847,620 rows scanned to return 240. Add CREATE INDEX ON orders(status) or, better, a partial index that also covers the sort: CREATE INDEX ON orders(created_at DESC) WHERE status = 'pending'.

Hash Join vs Nested Loop — Hash Join is generally good for larger data sets. Nested Loop with an inner Seq Scan is the dangerous combination:

Nested Loop  (cost=0.00..98234.00 rows=240 width=240)
  ->  Seq Scan on orders  (actual rows=240 loops=1)
  ->  Seq Scan on users  (actual rows=1 loops=240)

loops=240 on the inner Seq Scan means PostgreSQL scanned the entire users table 240 times. That’s a missing index on users.id — which shouldn’t happen since id is usually a primary key with an index, but it shows the pattern. If you see loops=N on an inner node with N > 10, that node is your bottleneck.
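Following the article's (admittedly artificial) example, the general fix pattern is: index the inner table's join column, refresh statistics, and re-run the plan to confirm the inner node became an Index Scan.

```sql
-- Sketch of the fix for the plan above: index the inner lookup column
-- (in practice users.id would already be indexed as a primary key).
CREATE INDEX IF NOT EXISTS users_id_idx ON users (id);
ANALYZE users;

EXPLAIN ANALYZE
SELECT o.*, u.name
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending';
```

With the index in place, loops=240 on the inner node is fine — each iteration is now a cheap index lookup instead of a full table scan.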

Actual vs estimated rows — compare rows=X in the cost estimate with rows=Y in the actual timing. A large discrepancy means stale statistics; fix with ANALYZE tablename.

Buffers — add BUFFERS to see memory vs disk reads:

EXPLAIN (ANALYZE, BUFFERS)
SELECT ...

High Buffers: read=N means data came from outside the shared buffer cache (disk or the OS page cache). High Buffers: hit=N means it came from the shared buffer cache. If you’re missing the cache heavily for a frequently-accessed table, consider increasing shared_buffers.


What are the most common slow query patterns?

1. Missing index on a filtered or joined column.

Already covered above. The fix is usually a CREATE INDEX. Use a composite index when the query filters on multiple columns — remember the leftmost-prefix rule, and put equality columns before range columns.
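As a sketch of that ordering rule, assuming a query that filters on status (equality) and created_at (range):

```sql
-- Equality column first, range column second. Serves:
--   WHERE status = 'pending' AND created_at > now() - interval '7 days'
CREATE INDEX idx_orders_status_created ON orders (status, created_at);

-- The leftmost-prefix rule means this index also serves
--   WHERE status = 'pending'
-- on its own, but NOT
--   WHERE created_at > ...
-- on its own, because created_at is not a leading column.
```

Reversing the column order (created_at first) would force PostgreSQL to scan the whole date range before applying the status filter.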

2. N+1 queries from ORMs.

This is the most common performance problem in applications using ORMs (Sequelize, SQLAlchemy, ActiveRecord, Prisma). An N+1 occurs when loading a list of records triggers a separate query for each record’s related data:

-- Query 1: get 100 orders
SELECT * FROM orders LIMIT 100;

-- Queries 2–101: get user for each order (generated by ORM)
SELECT * FROM users WHERE id = 1;
SELECT * FROM users WHERE id = 2;
-- ...×100

The fix is eager loading (JOIN or subquery) instead of lazy loading. In SQL:

SELECT o.*, u.name, u.email
FROM orders o
JOIN users u ON o.user_id = u.id
LIMIT 100;

In ORMs: Prisma’s include, Sequelize’s include, SQLAlchemy’s joinedload. One query instead of 101.

Detect N+1 in pg_stat_statements: look for queries with extremely high calls counts and low mean_exec_time — thousands of tiny queries summing to large total_exec_time.
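One way to surface that signal directly — the thresholds here are arbitrary starting points, tune them for your workload:

```sql
-- Chatty queries: huge call counts, tiny per-call cost,
-- large cumulative time. Classic N+1 fingerprint.
SELECT calls,
       round(mean_exec_time::numeric, 3) AS mean_ms,
       round(total_exec_time::numeric, 1) AS total_ms,
       query
FROM pg_stat_statements
WHERE calls > 10000
  AND mean_exec_time < 1
ORDER BY total_exec_time DESC
LIMIT 20;
```

pg_stat_statements normalizes literals, so the hundred `WHERE id = N` variants collapse into one row with calls in the thousands.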

3. SELECT * in performance-sensitive paths.

SELECT * fetches every column, including large text/JSONB values you might not need. Name only the columns you use: narrower rows mean less data for PostgreSQL to detoast, less to transfer, and less to hold in application memory.

4. Missing LIMIT on potentially large result sets.

A query that returns 10,000 rows when the UI shows 20 wastes query time, network bandwidth, and application memory. Always paginate with LIMIT + OFFSET or, better, keyset pagination for large datasets.
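A keyset pagination sketch for the orders example — instead of OFFSET, remember the last row's sort key and filter past it (the timestamp and id values below are made up):

```sql
-- Page 1
SELECT id, status, created_at
FROM orders
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page: pass in the (created_at, id) of the last row from the
-- previous page. The row-value comparison keeps ordering stable when
-- created_at values tie.
SELECT id, status, created_at
FROM orders
WHERE (created_at, id) < ('2024-01-15 10:30:00', 4821)
ORDER BY created_at DESC, id DESC
LIMIT 20;
```

Unlike OFFSET, which still scans and discards all the skipped rows, this stays fast on page 1,000 — provided an index on (created_at DESC, id DESC) exists.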


How do you set up connection pooling with PgBouncer?

PostgreSQL creates an OS process for each connection. Under high concurrency (hundreds of connections), the overhead of process creation, memory, and context switching becomes significant. PgBouncer maintains a smaller pool of actual database connections and queues application requests.

Install PgBouncer (Ubuntu/Debian):

sudo apt install pgbouncer -y

Minimal /etc/pgbouncer/pgbouncer.ini:

[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
min_pool_size = 5
server_idle_timeout = 600

pool_mode = transaction — the connection is returned to the pool after each transaction, not each session. This is the most efficient mode. The caveat: you can’t use session-level features (advisory locks, SET commands that persist across transactions, prepared statements in some configurations) with transaction mode.

Create /etc/pgbouncer/userlist.txt:

"myuser" "md5<hash>"

Generate the hash from the password concatenated with the username: echo -n "passwordmyuser" | md5sum — then prepend the literal string md5 to the result.
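If you have psql handy, PostgreSQL's built-in md5() function can produce the same value — the format is md5(password || username) with a literal md5 prefix. Shown here with the example user and a hypothetical password:

```sql
-- Emits a ready-to-paste userlist.txt line for user "myuser"
-- with password "password" (substitute your real credentials).
SELECT format('"myuser" "md5%s"', md5('password' || 'myuser')) AS userlist_line;
```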

Your application now connects to PgBouncer on port 6432 instead of PostgreSQL directly on 5432. PgBouncer multiplexes up to 1,000 client connections through 20 actual database connections.
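Once traffic is flowing through it, PgBouncer's admin console (connect with psql -p 6432 pgbouncer) shows whether the pool is sized correctly:

```sql
-- Run against the special "pgbouncer" admin database, not your app database.
SHOW POOLS;  -- cl_active/cl_waiting: clients in use vs queued;
             -- sv_active/sv_idle: backend connections busy vs idle
SHOW STATS;  -- per-database totals: query counts, bytes, average query time
```

A persistently nonzero cl_waiting means clients are queuing — consider raising default_pool_size (or fixing the slow queries holding connections).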


What autovacuum settings matter?

PostgreSQL’s MVCC model keeps dead row versions after updates and deletes until autovacuum reclaims them. Default autovacuum settings are conservative — they work for typical workloads but fall behind on high-write tables.

Check autovacuum activity:

SELECT schemaname, relname,
  n_dead_tup,
  n_live_tup,
  round(n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
  last_autovacuum,
  last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;

Dead percentage above 10–20% on a frequently-queried table means autovacuum is behind. Symptoms: bloated tables, stale statistics, and steadily growing output from pg_relation_size.

Tune autovacuum for a specific high-write table:

ALTER TABLE events SET (
  autovacuum_vacuum_scale_factor = 0.01,   -- vacuum when 1% of rows are dead (default: 0.2 = 20%)
  autovacuum_analyze_scale_factor = 0.005, -- analyze when 0.5% of rows have changed
  autovacuum_vacuum_cost_delay = 1         -- sleep less when the I/O cost limit is hit (default: 2ms on pg 12+)
);

For large tables (millions of rows), the default autovacuum_vacuum_scale_factor = 0.2 means autovacuum doesn’t trigger until 20% of rows are dead — that’s 200,000 dead rows on a 1M-row table before vacuum kicks in. Setting scale_factor = 0.01 makes it trigger at 10,000 dead rows instead.
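The trigger point comes from autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × live tuples (the threshold defaults to 50), so you can estimate each table's trigger directly:

```sql
-- Estimated dead-tuple counts at which autovacuum fires,
-- comparing the default scale factor (0.2) with a tuned one (0.01).
SELECT relname,
       n_live_tup,
       n_dead_tup,
       (50 + 0.2  * n_live_tup)::bigint AS default_trigger,
       (50 + 0.01 * n_live_tup)::bigint AS tuned_trigger
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC
LIMIT 10;
```

If n_dead_tup is routinely close to default_trigger, the table spends most of its life carrying near-maximum bloat — a strong signal to lower the scale factor.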


Summary

  • pg_stat_statements identifies slow queries by total time; high calls + low mean_ms is an N+1 signal
  • EXPLAIN (ANALYZE, BUFFERS) shows the actual execution plan; look for Seq Scans with high filtered row counts and Nested Loop inner scans with high loops
  • Missing indexes and N+1 queries are responsible for the majority of real-world PostgreSQL slowdowns
  • PgBouncer in transaction mode multiplexes hundreds of application connections through a small connection pool
  • Tune autovacuum_vacuum_scale_factor down on high-write tables — the default 20% is too permissive for large tables

FAQ

How do I know if my server is CPU-bound or I/O-bound?

Check EXPLAIN (ANALYZE, BUFFERS) output. High disk reads (Buffers: read=N) means I/O-bound — consider increasing shared_buffers (typically set to 25% of RAM) or adding faster storage. High execution time with cache hits means CPU-bound — look at parallelism (max_parallel_workers_per_gather) or query optimization. On the OS level, iostat -x 1 and top show whether the bottleneck is disk or CPU.
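A database-wide cache hit ratio, computed from the cumulative counters in pg_stat_database, gives a quick first read; as a rough rule of thumb, above ~99% for an OLTP workload usually means reads are not I/O-bound:

```sql
-- Share of block reads served from shared buffers
-- since statistics were last reset.
SELECT round(
         sum(blks_hit)::numeric
           / nullif(sum(blks_hit) + sum(blks_read), 0) * 100,
         2) AS cache_hit_pct
FROM pg_stat_database;
```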

When should I use EXPLAIN vs EXPLAIN ANALYZE?

Use EXPLAIN when you want to see the plan without the cost of running the query — useful for DDL planning or when the query is destructive. Use EXPLAIN ANALYZE for all performance investigations — you need actual row counts and timing, not estimates. (To profile a write query safely, wrap it: BEGIN; EXPLAIN ANALYZE ...; ROLLBACK; — the query runs, but its effects are discarded.) The estimates in EXPLAIN alone can be far off for tables with stale statistics.

Is it safe to run ANALYZE on a production table?

Yes. ANALYZE takes only a lightweight SHARE UPDATE EXCLUSIVE lock, doesn’t block reads or writes, and completes quickly even on large tables (it samples rows rather than scanning everything). VACUUM is also safe and non-blocking in its default form. VACUUM FULL, however, takes an exclusive lock and rebuilds the entire table — avoid it on production during business hours.