Vertically Scaling PostgreSQL

PostgreSQL can scale rather well vertically. The more resources (CPU, memory, disk) that you can make available to your PostgreSQL server, the better it can perform. However, while some parts of Postgres can automatically make use of the increased resources, other parts need configuration changes before improvements can be noticed.

Read on to learn more about how to ensure PostgreSQL makes full use of the system you’re running it on.

CPU

What Scales Automatically

PostgreSQL has a traditional process architecture, consisting of a master process (called the postmaster) that spawns a new process (called a backend) for each new client connection. This means that if there are more CPU cores available, more processes can run simultaneously, and therefore backends do not have to contend as much for CPU availability. CPU-bound queries will complete faster.

You may wish to adjust the maximum allowed simultaneous connections at a system-wide, per-database or per-user level:

-- system level
ALTER SYSTEM SET max_connections = 200;

-- database level
ALTER DATABASE dbname CONNECTION LIMIT 200;

-- user level
ALTER ROLE username CONNECTION LIMIT 20;

so that rogue apps cannot end up hogging too many connections.

What Needs Tweaking

The PostgreSQL server can spawn process to take care of housekeeping tasks, like vacuuming, replication, subscriptions (for logical replication) etc. The number of such workers is not determined dynamically, but just set via configuration, and defaults to 8.

The top-level configuration setting for the number of worker process is:

# typically specified in postgresql.conf
max_worker_processes = 16

Increasing this value can result in speedup of maintenance jobs, parallel queries and index creation.

Parallel Queries

Starting with version 9.6, Postgres can execute queries parallely if the query planner decides it’ll help. Parallel querying involves spawning workers, distributing work amongst them and then collecting (gathering) the results. Subject to the overall limit of max_worker_processes set earlier, Postgres will determine how many workers can be spawned for parallel query depending on the value of two configuration settings:

# the maximum number of workers that the system can
# support for parallel operations
max_parallel_workers = 8

# the maximum number of workers that can be started
# by a single Gather or Gather Merge node
max_parallel_workers_per_gather = 8

If you have idle CPUs and parallelizable queries, increasing these values can speedup such queries.

Parallel Index Creation

In Postgres 11, support for parallel creation of B-Tree indexes was added. If you regularly create B-Tree indexes or REINDEX them, increasing this value can help:

# the maximum number of parallel workers that can be
# started by a single utility command
max_parallel_maintenance_workers = 8

This will allow Postgres to spawn these many workers (subject to the overall limit of max_worker_processes) to speed up the creation of B-Tree indexes.

Logical Replication

Logical replication (available in Postgres 10 and above), relies on worker processes at the subscription side to fetch changes from the publisher. By asking Postgres to spawn more logical replication workers, the changes can be fetched and applied in parallel, especially if there are more tables. This configuration setting increases the total number of replication workers:

# maximum number of logical replication workers
max_logical_replication_workers = 8

In streaming replication, you can start off a sync with a base backup. For logical replication however, changes have to be pulled in via the replication protocol itself, over the network. This can be time consuming. Allowing for more workers during the sync phase can speed up this process:

# basically the number of tables that are synced in
# parallel during initialization of subscription
max_sync_workers_per_subscription = 8

Autovacuum

Periodically, based on a bunch of configuration settings, Postgres will spwan a bunch of workers that will VACUUM the database tables. This is of course, called autovacuum, and the number of workers that the autovacuum launcher spawns each time can be set via the configuration setting:

# the maximum number of autovacuum processes
autovacuum_max_workers = 8

WAL Compression

If you have CPU to spare, you can trade CPU for disk bandwidth by compressing the pages that are written into the WAL files. This reduces the amount of data that needs to be written to disk, at the expense of more CPU cycles to compress the data. It also reduces the size of data that needs to be sent across the wire for streaming replication.

Practically, the benefits of WAL compression are well worth the very reasonable overhead. To turn it on, use:

# compresses full page images written to WAL
wal_compression = on

Memory

What Scales Automatically

The OS automatically manages and uses memory that is unused by any application for caching data read from and written to the disk recently. This greatly speeds up disk-intensive applications, and certainly PostgreSQL.

In Linux, the most popular host for Postgres, the size of the OS disk cache cannot be set by the user. It’s management is internal to Linux. Under memory pressure, it will yield disk cache memory to applications.

What Needs Tweaking

Query Planner

The query planner has to include the amount of disk cache provided by the OS as a factor into it’s estimations. If you’ve managed to increase the OS disk cache significantly (by increasing the available memory), increasing this configuration setting might help in improving the planner’s estimates:

# the planner's assumption about the effective size
# of the disk cache that is available to a single query.
effective_cache_size = 64GB

Shared Memory

PostgreSQL uses a set of buffers that is shared between all workers and backend processes. These are called shared buffers, and the amount of memory allocated for shared buffers is set using the configuration setting:

shared_buffers = 32GB

Temporary Buffers

When temporary tables are accessed by a query, buffers are allocated to cache the contents that are read in. The size of this buffer is set using the configuration setting:

# the maximum number of temporary buffers used
# by each database session
temp_buffers = 100MB

If you have memory to spare and queries that use temporary tables heavily, increasing this value can speed up such queries.

Working Memory

Working memory is allocated locally and privately by backends. It is used to fulfil sorts and joins without having to create into temporary tables. Increasing this from the default of 4MB can let queries complete faster by during temporary table creation:

# the amount of memory to be used by internal sort
# operations and hash tables before writing to temporary disk files
work_mem = 16MB

Maintenance Operations

The memory used by VACUUM, index creation and other such maintenance commands are controlled by the configuration setting maintenance_work_mem. Increasing this amount can speed up these operations, especially on indexes or tables that need to be recreated.

The memory used by the autovacuum workers can be taken from the maintenance work memory (by setting autovacuum_work_mem = -1) or configured independently.

# the maximum amount of memory to be used by
# maintenance operations
maintenance_work_mem = 128MB

# maximum amount of memory to be used by each
# autovacuum worker process
autovacuum_work_mem = -1

Disk

What Scales Automatically

Disks can be made bigger, faster or more concurrent. The size of the disk is the only thing that PostgreSQL doesn’t have to be instructed about. By default, PostgreSQL will not constrain itself from using any available disk space. This is usually just fine.

You can place a limit on the total size of temporary files created though, to provide some amount of protection against queries that try to sort a billion rows and the like:

# the maximum amount of disk space that a process
# can use for temporary files
temp_file_limit = 500GB

What Needs Tweaking

Concurrency

RAID-ed disks and file systems like ZFS can be setup to support more concurrency. That is to say, you can have a few disk reads/writes being serviced concurrently by such file systems because of the way the store or handle data internally.

You can let Postgres issue multiple concurrent disk I/O, by using this configuration setting:

# the number of concurrent disk I/O operations that
# PostgreSQL expects can be executed simultaneously
effective_io_concurrency = 4

This is currently used only by bitmap heap scans though.

Random Page Cost

The Postgres query planner assumes that sequential reads are faster than random reads. Exactly how much faster is a value you can tweak. By default, it assumes random reads are 4 times costlier.

Depending on your disk setup, workload and benchmarking, if you’re sure that random reads are say, only twice as costly as sequential reads, you can tell that to Postgres:

# the planner's estimate of the cost of a disk page
# fetch that is part of a series of sequential fetches
seq_page_cost = 1

# the planner's estimate of the cost of a
# non-sequentially-fetched disk page
random_page_cost = 2

Tablespaces

To take advantage of multiple disks which are not mounted as one big single filesystem, you can use tablespaces. With tablespaces, you can places tables or indexes different filesystems. This can improve concurrency and provides an easy way to handle table growth.

CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/postgres';

Network

Network is usually the least used resource on a PostgreSQL server, and is rarely saturated. If you do need to scale, it is easy enough to add more network interfaces each with it’s own IP and have PostreSQL listen on them all:

listen_addresses = '10.1.0.10,10.1.0.11'

The clients will have to be load balanced across all the IPs that Postgres listens on.

Other

There are a few other configuration settings that can be tweaked, most of which use up more CPU and memory.

Partitionwise Operations

Postgres 10 introduced table partitioning, which was improved upon in Postgres 11. Some query optimizations on partitions are not turn on by default, as they might result in higher CPU and memory consumption. These are:

# allow a join between partitioned tables to be
# performed by joining the matching partitions
enable_partitionwise_join = on

# allow grouping or aggregation on a partitioned
# tables performed separately for each partition
enable_partitionwise_aggregate = on

About pgDash

pgDash is a modern, in-depth monitoring solution designed specifically for PostgreSQL deployments. pgDash shows you information and metrics about every aspect of your PostgreSQL database server, collected using the open-source tool pgmetrics. pgDash provides core reporting and visualization functionality, including collecting and displaying PostgreSQL information and providing time-series graphs, detailed reports, alerting, teams and more.

pgDash can monitor replication, locks, queries and more. It can also provide quick diagnostics and extensive alerts. Learn more here or signup today for a free trial.