There are a lot of ways in which you can get the Postgres server to run pre-defined code. Below is a comprehensive list, with examples, of ways in which you can let the Postgres server store pre-defined logic, which you can use later from your application.
Postgres lets you create “user-defined functions”, where the function body can be written in a supported language. “SQL Functions” are user-defined functions written in regular SQL, which is the simplest way to encapsulate complex queries and sequences of SQL statements.
Here are a couple of examples:
SQL functions can accept and return base types, composite types and rows. They also support variable numbers of arguments, default values for arguments and polymorphic arguments. They can even return multiple rows, mimicking a SELECT from a table. It is also not necessary for them to return anything at all.
The function body can only contain SQL statements however. This means there are no flow control statements (if, while, …), variables and the like.
The CREATE FUNCTION command is used to create the function. As usual, you can ALTER and DROP them.
This is a great place to start digging in further: https://www.postgresql.org/docs/current/xfunc-sql.html
While SQL functions are the easiest to write and least powerful, at the other end of the spectrum, functions can be written in C and can pretty much do anything. Such functions need to be coded in C and built as a shared library that can be dynamically loaded by Postgres.
You need to tell Postgres where to load the shared library, the name and signature of the function:
This says that the shared library
myfuncs.so, present in a pre-defined search
path, contains entrypoints which can be called by Postgres, with one of the
entrypoints being ‘sum’ which can be invoked as a function.
The actual code in C would be too long to include here, but you can read all about it in the docs. Combined with the Server Programming Interface (SPI), it is possible to do almost any operation that you can do in any other way.
For example, with the C functions defined here, you can perform HTTP requests:
It is also possible to write such shared libraries in other languages like C++ or Go, which can build shared libraries with “C” linkages.
Other than SQL and C, you can write functions in procedural languages. Four such languages are supported by core PostgreSQL – pgSQL, Python, Perl and Tcl. Support for any procedural language itself comes from a C shared library, and operates much like mod_perl or mod_python from the Apache era.
pgSQL is the canonical, most-used, SQL-like language in which stored functions
for PostgreSQL are written. It is available by default, courtesy of it
being installed in
PL/pgSQL is a full-fledged language with variables, expressions and control statements; and includes features like cursors to work with SQL data in particular. It is extensively documented here.
Here is an example:
Other Core Procedural Languages
The other procedural languages – Python, Perl, Tcl – allow developers to use a language they are already comfortable with. Although support for these languages live in the Postgres source tree, distributions don’t usually install the binaries by default. For example, in Debian you may have to do:
to install the PL/Python support for PostgreSQL 11.
Whatever language you are using to write a function in, the caller does not perceive any differences in it’s usage.
The PL/Python extension supports writing functions in Python 2 and Python 3. To install it, do:
Here is a function written in PL/Python:
The Python environment that the function body runs in has a module called
automatically imported into it. This module contains methods that let you
prepare and run queries, handle transactions and work with cursors.
More information can be found in chapter 46 of the Postgres docs.
Well, yes, Perl. Postgres development, test and build processes use Perl extensively, and it is also supported as a procedural language. To start using it, make sure any relevant binary packages for your distro as installed (example “postgresql-plperl-nn” for Debian) and install the extension “plperl”.
Here is a function written in PL/Perl:
Full documentation here.
Tcl is yet another PL supported by core Postgres. Here is an example:
For more information, see the docs here.
Non-Core Procedural Languages
Beyond these languages, there are open source projects that develop and maintain support for others like Java, Lua, R etc.
There is a list here: https://www.postgresql.org/docs/current/external-pl.html
Aggregate functions operate over a set of values, and return a single result. PostgreSQL has a bunch of built-in aggregate functions (see a full list here). For example, to get the population standard deviation of all the values in a column, you can:
You can define your own aggregate functions that behave in a similar fashion. A user-defined aggregate is assembled from a few individual standalone functions that work on the internal state (for example, the internal state of an aggregate that computes average could be “sum” and “count” variables).
Here is a user-defined aggregate that computes the median of a set of values:
which can be invoked as:
For more information, see the docs about aggregates and the CREATE AGGREGATE statement.
The shared libraries written in C that we saw earlier can not just define functions, but also data types. These user-defined types can be used as data types for columns just like the built-in types. You can define functions to work with the values of your user-defined types.
It takes a bit of code to define a new type. See the docs here that walk you through creating a new type to represent complex numbers. The Postgres source also contains tutorial code for this.
Operators make functions easier to use (for example, writing
1 + 2 rather
sum(1, 2)), and you can define operators for user-defined types using
the CREATE OPERATOR
Here is an example to create a
+ operator that maps to the function
complex_add that adds two
More information here and here.
Operator Classes and Operator Families
Operator classes let your data type work with the built-in B-Tree and other indexing methods. For example, if you want to create a B-Tree index on a column of type “complex”, you’ll need to tell Postgres how to compare two values of this type to determine if one is less, equal or greater than the other.
It’d be good for complex types to be compared against integers or floating point values, which is where operator families come in.
You can read all about operator classes and families here.
Triggers are a powerful mechanism for creating side-effects for normal operations, although they can be dangerous if overused or abused. Essentially, triggers connect events to functions. The function refered to can be invoked:
- before or after the insert/update/delete of a row of a table
- on truncate of a table
- instead of insert/update/delete of a row of a view
The function can be invoked for each row affected by a statement, or once per statement. And there are even more things, like cascading of triggers, all of which are explained here.
Trigger functions can be written in C, or in any of the PL functions, but not in SQL. Here is an example to insert a row into an audit table, for every update made to the price of an item.
Read all about triggers here, alongwith the CREATE TRIGGER documentation.
While triggers respond to DML events on a single table, event triggers can respond to DDL events on a particular database. Events include create, alter, drop of a variety of objects, like tables, indexes, schemas, views, functions, types, operators etc.
Here is an event trigger that prevents dropping of objects from the ‘audit’ schema:
More information can be found here and in the CREATE EVENT TRIGGER documentation.
PostgreSQL comes with a feature that lets you rewrite queries before it makes it’s way to the query planner. The operation is somewhat similar to configuring Nginx or Apache to rewrite an incoming URL before processing it.
Here are two examples that affect INSERT statements on a table and make them do something else:
This chapter from the documentation has more information about rules.
Starting with Postgres 11, it is possible to create stored procedures too. Compared with stored functions, there is only one extra thing that procedures can do – transaction control.
Here is an example:
See here and here for more information.
Other Exotic Things
Foreign Data Wrappers
Foreign Data Wrappers (FDWs) let you talk to other sources of data, like another Postgres server, MySQL, Oracle, Cassandra and more. All the logic for accessing the foreign server is written in C, as a shared library.
There is even a columnar store called cstore_fdw based on FDW.
You can find a list of FDW implementation in the Postgres Wiki and more documentation here.
Index Access Methods
PostgreSQL comes with index types like B-Tree, hash, GIN and more. It is possible to write your own index type similar to this, as a C shared library. More details here.
Table Access Methods
With the upcoming PostgreSQL 12, it’ll be possible to create your own data storage structure. By implementing the interface described here, you can store the tuple data physically on-disk in the manner of your choosing.
Logical Replication Plugins
In PostgreSQL, logical replication is implemented by a “decoding” the contents of the write-ahead log (WAL) into an arbitrary format (like SQL text or json) and published to subscribers over replication slots. This decoding is done via logical decoding output plugin, which can be implemented as a C shared library as described here. The “test_decoding” shared library is one such plugin, and you can build your own.
Procedural Language Handler
You can also add support for your favorite programming language as a Postgres PL by creating a handler – again as a C shared library. Get started here to create PL/Go or PL/Rust!
Extensions are Postgres’ way of package management. Say you have a C function that does something useful, and a couple of SQL statements that make the necessary “CREATE FUNCTION” statements to set it up. You can bundle these as an “extension” that Postgres can install (and uninstall) in one single step (by calling “CREATE EXTENSION”). When you put out a new version, you can also include upgrade steps also in the extension.
While not server-side programming per-se, extensions are the standard and very efficient way to package up and distribute your server-side code.
More information about extensions can be found here and here.
pgDash is a modern, in-depth monitoring solution designed specifically for PostgreSQL deployments. pgDash shows you information and metrics about every aspect of your PostgreSQL database server, collected using the open-source tool pgmetrics. pgDash provides core reporting and visualization functionality, including collecting and displaying PostgreSQL information and providing time-series graphs, detailed reports, alerting, teams and more.
pgDash can monitor replication, locks, queries and more. It can also provide quick diagnostics and extensive alerts. Learn more here or signup today for a free trial.