PostgreSQL 12 comes with a new feature called generated columns. Other popular RDBMSes already support generated columns as “computed columns” or “virtual columns.” With Postgres 12, you can now use it in PostgreSQL as well. Read on to learn more.

What is a generated column?

A generated column is sort of like a view, but for columns. Here’s a basic example:

db=# CREATE TABLE t (w real, h real, area real GENERATED ALWAYS AS (w*h) STORED);
db=# INSERT INTO t (w, h) VALUES (10, 20);
db=# SELECT * FROM t;
 w  | h  | area
 10 | 20 |  200
(1 row)


We created a table t with two regular columns called w and h, and a generated column called area. The value of area is computed at row creation time, and is persisted onto the disk.

The value of generated columns are recomputed when the row is updated:

db=# UPDATE t SET w=40;
db=# SELECT * FROM t;
 w  | h  | area
 40 | 20 |  800
(1 row)


Such functionality was earlier usually achieved with triggers, but with generated columns this becomes much more elegant and cleaner.

A few points you should know about generated columns:

  • Persistence: Currently, the value of generated columns have to be persisted, and cannot be computed on the fly at query time. The “STORED” keyword must be present in the column definition.
  • The Expression: The expression used to compute the value has to be immutable, that is, it has to be deterministic. It can depend on other columns, but not other generated columns, of the table.
  • Indexes: Generated columns can be used in indexes, but cannot be used as a partition key for partitioned tables.
  • Copy and pg_dump: The values of generated columns are omitted in the output of “pg_dump” and “COPY table” commands, as it is unnecessary. You can explicitly include them in COPY using COPY (SELECT * FROM t) TO STDOUT rather than COPY t TO STDOUT.

A Practical Example

Let’s add full text search support to a table using generated columns. Here’s a table that stores the entire text of all of Shakespeare’s plays:

    workid text,       -- denotes the name of the play (like "macbeth")
    act integer,       -- the act (like 1)
    scene integer,     -- the scene within the act (like 7)
    description text,  -- short desc of the scene (like "Macbeth's castle.")
    body text          -- full text of the scene

Here’s how the data looks:

shakespeare=# SELECT workid, act, scene, description, left(body, 200) AS body_start
shakespeare-# FROM scenes WHERE workid='macbeth' AND act=1 AND scene=1;
 workid  | act | scene |   description   |                  body_start
 macbeth |   1 |     1 | A desert place. | [Thunder and lightning. Enter three Witches]+
         |     |       |                 |                                             +
         |     |       |                 | First Witch: When shall we three meet again +
         |     |       |                 | In thunder, lightning, or in rain?          +
         |     |       |                 |                                             +
         |     |       |                 | Second Witch: When the hurlyburly's done,   +
         |     |       |                 | When the battle's lost and won.             +
         |     |       |                 |
(1 row)

We’ll add a column that will contain the lexemes in the value of “body”. The function to_tsvector returns the lexemes we need:

shakespeare=# SELECT to_tsvector('english', 'move moving moved movable mover movability');
 'movabl':4,6 'move':1,2,3 'mover':5
(1 row)

The type of the value returned by to_tsvector is tsvector.

Let’s alter the table to add a generated column:

  ADD tsv tsvector
    GENERATED ALWAYS AS (to_tsvector('english', body)) STORED;

You can see the change with \d:

shakespeare=# \d scenes
                                                Table "public.scenes"
   Column    |   Type   | Collation | Nullable |                               Default
 workid      | text     |           | not null |
 act         | integer  |           | not null |
 scene       | integer  |           | not null |
 description | text     |           |          |
 body        | text     |           |          |
 tsv         | tsvector |           |          | generated always as (to_tsvector('english'::regconfig, body)) stored
    "scenes_pkey" PRIMARY KEY, btree (workid, act, scene)

And just like that, you can now do full text searches:

shakespeare=# SELECT
  workid, act, scene, ts_headline(body, q)
    workid, act, scene, body, ts_rank(tsv, q) as rank, q
    scenes, plainto_tsquery('uneasy head') q
    tsv @@ q
    rank DESC
) p
  rank DESC;
  workid  | act | scene |                        ts_headline
 henry4p2 |   3 |     1 | <b>Uneasy</b> lies the <b>head</b> that wears a crown.   +
          |     |       |                                                          +
          |     |       |    Enter WARWICK and Surrey                              +
          |     |       |                                                          +
          |     |       | Earl of Warwick
 henry5   |   2 |     2 | <b>head</b> assembled them?                              +
          |     |       |                                                          +
          |     |       | Lord Scroop: No doubt, my liege, if each man do his best.+
          |     |       |                                                          +
          |     |       | Henry V: I doubt not that; since we are well persuaded   +
          |     |       | We carry not a heart with us from hence
(2 rows)


Read More

If you have a need for pre-computed / “cached” data, especially with a workload of few writes and lots of reads, generated columns should help simplify your application / server-side code a lot.

You can read the v12 documentation of CREATE TABLE and ALTER TABLE to see the updated syntax.

About pgDash

pgDash is a modern, in-depth monitoring solution designed specifically for PostgreSQL deployments. pgDash shows you information and metrics about every aspect of your PostgreSQL database server, collected using the open-source tool pgmetrics. pgDash provides core reporting and visualization functionality, including collecting and displaying PostgreSQL information and providing time-series graphs, detailed reports, alerting, teams and more.

pgDash Replication Monitoring

pgDash can monitor replication, locks, queries and more. It can also provide quick diagnostics and extensive alerts. Learn more here or signup today for a free trial.