The upcoming version of PostgreSQL, v17, is set to include a feature that allows splitting and merging of range-partitioned tables. Read on for a quick teaser on what this looks like in practice.

Disclaimer: Note that this blog post is based on the beta 2 release of PostgreSQL 17. The details of this feature and its usage might change between now and the final release of PostgreSQL 17.

Declarative Partitioning In PostgreSQL

PostgreSQL can be instructed to shard a table into partitions, each partition (a table by itself) storing a disjoint set of rows. Global constraints and indexes are allowed on the parent table with some restrictions. The application remains unaware of the child tables, and interacts with the parent table using regular DML commands. This type of sharding is called declarative partitioning and was introduced in PostgreSQL 10.

For example, consider a table used by an application to record the arrival delay of each flight at all airports in a region:

CREATE TABLE flight_arrival_delays (
	airport_code  text, -- eg "BKK"
	flight_number text, -- eg "TG326"
	arrival_date  date, -- eg "2024-07-25"::date
	delay_minutes int   -- eg 15
);

The table has a row for each flight that lands at any airport in the region. Given that BKK alone averaged 472 arrivals per day in June 2024, we expect the rows in this table to grow steadily each day.

With declarative partitioning, the DBA can instead create a partitioned parent table:

CREATE TABLE flight_arrival_delays (
	airport_code  text, -- eg "BKK"
	flight_number text, -- eg "TG326"
	arrival_date  date, -- eg "2024-07-25"::date
	delay_minutes int   -- eg 15
) PARTITION BY RANGE (arrival_date);

CREATE TABLE flight_arrival_delays_default
	PARTITION OF flight_arrival_delays
	DEFAULT;

which instructs Postgres to store all the rows in a table called flight_arrival_delays_default, but also to present the same table (flight_arrival_delays) to the application.

You can see that the rows are actually stored in the default partition:

demo=# select * from flight_arrival_delays_default;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
 BKK          | TG326         | 2024-07-25   |             7
 BKK          | TG326         | 2024-07-24   |             4
 BKK          | TG326         | 2024-07-23   |             0
 BKK          | TG326         | 2024-07-22   |             8
 BKK          | TG326         | 2024-07-21   |            17
 BKK          | TG326         | 2024-07-20   |            12
 BKK          | TG326         | 2023-10-10   |            10
(7 rows)

But presented to the application as though from the parent table:

demo=# select * from flight_arrival_delays;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
 BKK          | TG326         | 2024-07-25   |             7
 BKK          | TG326         | 2024-07-24   |             4
 BKK          | TG326         | 2024-07-23   |             0
 BKK          | TG326         | 2024-07-22   |             8
 BKK          | TG326         | 2024-07-21   |            17
 BKK          | TG326         | 2024-07-20   |            12
 BKK          | TG326         | 2023-10-10   |            10
(7 rows)

This allows the DBA to split the partition when it grows too big.

Splitting a Partition

The new Postgres v17 command ALTER TABLE .. SPLIT PARTITION can be used to split the partition, in this case the default partition of flight_arrival_delays_default, when it grows too big:

ALTER TABLE flight_arrival_delays
	SPLIT PARTITION flight_arrival_delays_default INTO
	(PARTITION flight_arrival_delays_2023 FOR VALUES FROM ('2023-01-01') TO ('2024-01-01'),
	 PARTITION flight_arrival_delays_2024 FOR VALUES FROM ('2024-01-01') TO ('2025-01-01'),
	 PARTITION flight_arrival_delays_default DEFAULT);

This command creates 2 new partitions (one for 2023, one for 2024) and still retains the _default one to store other values. After the split, the rows get redistributed accordingly:

demo=# select * from flight_arrival_delays_2023;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
 BKK          | TG326         | 2023-10-10   |            10
(1 row)

demo=# select * from flight_arrival_delays_2024;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
 BKK          | TG326         | 2024-07-25   |             7
 BKK          | TG326         | 2024-07-24   |             4
 BKK          | TG326         | 2024-07-23   |             0
 BKK          | TG326         | 2024-07-22   |             8
 BKK          | TG326         | 2024-07-21   |            17
 BKK          | TG326         | 2024-07-20   |            12
(6 rows)

demo=# select * from flight_arrival_delays_default;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
(0 rows)

New rows get stored using the new partition ranges:

demo=# insert into flight_arrival_delays values ('BKK', 'TG326', '11-11-2022', 10), ('BKK', 'TG326', '11-11-2023', 10);
INSERT 0 2
demo=# select * from flight_arrival_delays_2023;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
 BKK          | TG326         | 2023-10-10   |            10
 BKK          | TG326         | 2023-11-11   |            10
(2 rows)

demo=# select * from flight_arrival_delays_2024;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
 BKK          | TG326         | 2024-07-25   |             7
 BKK          | TG326         | 2024-07-24   |             4
 BKK          | TG326         | 2024-07-23   |             0
 BKK          | TG326         | 2024-07-22   |             8
 BKK          | TG326         | 2024-07-21   |            17
 BKK          | TG326         | 2024-07-20   |            12
(6 rows)

demo=# select * from flight_arrival_delays_default;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
 BKK          | TG326         | 2022-11-11   |            10
(1 row)

The 2023 row went into the 2023 table, and the default table captured the 2022 row.

It is possible to split any partition further. If for example, there are too many rows in 2024, that partition alone can be split into say q1_2024, q2_2024 and so on, without touching the 2023 and default partitions.

Efficiency

During its execution, the ALTER TABLE .. SPLIT PARTITION command will acquire an access exclusive lock on the parent (flight_arrival_delays) and the newly created partitions. This lock is held until the end of the transaction that the command executes in.

This means that any concurrent operation on the table, including reads, will wait until the SPLIT PARTITION transaction completes. This may be an unacceptable overhead for OLTP databases. The alternative however, is to detach partitions, manipulate tables and rows “manually” and then reattach, which would be more error-prone.

Merging Partitions

The reverse operation is also available in PostgreSQL v17. Let’s try to merge the partitions we split above:

demo=# ALTER TABLE flight_arrival_delays
        MERGE PARTITIONS (
                flight_arrival_delays_2023,
                flight_arrival_delays_2024,
                flight_arrival_delays_default )
        INTO flight_arrival_delays_default;
ALTER TABLE

There are only 2 tables now, the parent one and a default partition:

demo=# \d
                          List of relations
 Schema |             Name              |       Type        | Owner
--------+-------------------------------+-------------------+--------
 public | flight_arrival_delays         | partitioned table | mdevan
 public | flight_arrival_delays_default | table             | mdevan
(2 rows)

And the default partition contains all the rows:

demo=# select * from flight_arrival_delays_default;
 airport_code | flight_number | arrival_date | delay_minutes
--------------+---------------+--------------+---------------
 BKK          | TG326         | 2023-10-10   |            10
 BKK          | TG326         | 2023-11-11   |            10
 BKK          | TG326         | 2024-07-25   |             7
 BKK          | TG326         | 2024-07-24   |             4
 BKK          | TG326         | 2024-07-23   |             0
 BKK          | TG326         | 2024-07-22   |             8
 BKK          | TG326         | 2024-07-21   |            17
 BKK          | TG326         | 2024-07-20   |            12
 BKK          | TG326         | 2022-11-11   |            10
(9 rows)

Efficiency

Again, the partitions to be merged and the new partition are all access-exclusive locked until the end of the transaction containing the MERGE PARTITION command.

Migration from Other DBMS

Other commercial DBMS have supported SPLIT and MERGE of partitions for a while, and the inclusion of this feature in PostgreSQL 17 should make it easier to migrate to PostgreSQL. In previous versions of PostgreSQL, migrating a SPLIT/MERGE would have involved translating the command into a gnarly sequence of ALTER TABLE .. DETACH PARTITION, COPY, DELETE and so on.

This is the first appearance of the feature in Postgres, and no doubt we’ll see further optimizations and concurrency improvements around this in the future.

About pgDash

pgDash is a modern, in-depth monitoring solution designed specifically for PostgreSQL deployments. pgDash shows you information and metrics about every aspect of your PostgreSQL database server, collected using the open-source tool pgmetrics.

pgDash Queries Dashboard

pgDash provides core reporting and visualization functionality, including collecting and displaying PostgreSQL information and providing time-series graphs, detailed reports, diagnostics, alerting, teams and more. Checkout the features here or signup today for a free trial.