PostgreSQL/Extensions

PostgreSQL offers an extensibility architecture and implements its internal data types, operators, functions, indexes, and more on top of it. This architecture is open for everybody to implement and add his own functionality to the PostgreSQL system. You can define new datatypes with or without special operators and functions as needed by your use case. After you have added them, you have the best of two worlds: the special functionalities you have created plus the standard functionality of a database system like ACID, SQL, security, standard data types, WAL, client APIs, ... . An introduction to extensibility is given in the PostgreSQL documentation.

Over time the community has developed a set of extensions that are useful for their own needs and for a great number of applications - sometimes even for the requirements and definitions given by standardization organizations. Some popular examples are

The lifecycle of such an extension starts with the implementation of its features by a group of persons or a company. After publishing, the extension may be used and further expanded by other persons or companies of the community. Sometimes such extensions keep independent from the PostgreSQL system, e.g.: PostGIS, in other cases they are delivered with the standard download and explicitly listed as an Additional Supplied Module within the documentation with hints how to install them. And in rare cases, extensions are incorporated into the core system so that they become a native part of PostgreSQL.

To activate and use an extension, you must download and install the necessary files (if not delivered with the standard download) and issue the command CREATE EXTENSION <extension_name>; within an SQL client like psql. To control which extensions are already installed use: \dx within psql.

PostGIS edit

PostGIS is a project which extends PostgreSQL with a rich set of 2D and 3D spacial data types plus associated functions, operators and index types as defined by OSGeo and SQL Multimedia and Application Packages Part 3: Spatial. Typically data types are polygon or multipoint, typical functions are st_length() or st_contains(). The appropriated index type for spatial objects is the GiST index.

The PostGIS project has its own representation on the WEB where all its aspects are described, especially the download process and the activation of the extension itself.

Foreign Data Wrappers edit

Foreign Data Wrappers (FDW) are PostgreSQL extensions that offer access to data outside of the actual database and instance. There are different types of data wrappers:

  • One wrapper to other PostgreSQL instances: postgres_fdw
  • A lot of wrappers to other relational database systems like Oracle, MySQL, MS SQL Server, ...
  • A lot of wrappers to NoSQL database systems: CouchDB, MongoDB, Cassandra, ...
  • Generic wrappers to ODBC and JDBC
  • A lot of wrapper to files of different formats: csv, xml, json, tar, zip, ... (file_fdw)
  • LDAP wrapper
  • ... and more.

A comprehensive list gives an overview.

The technique of FDW is defined in the SQL standard Part 9: Management of External Data.

Here is an example of how to access another PostgreSQL instance via FDW.

-- Install the extension to other PostgreSQL instances
CREATE EXTENSION postgres_fdw;

-- Define the connection to a database/instance at a different server
CREATE SERVER remote_geo_server
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host '10.10.10.10', port '5432', dbname 'geo_data');

-- Define a user for the connection (The remote user must have access rights at the remote database)
CREATE USER MAPPING FOR CURRENT_USER
  SERVER remote_geo_server
  OPTIONS (user 'geo_data_user', password 'xxx');

-- Define two foreign tables via an IMPORT command ...
IMPORT FOREIGN SCHEMA geo_schema
  LIMIT TO (city, point_of_interest)
  FROM SERVER remote_geo_server
  INTO my_schema;

-- .. and another foreign table via an explicit definition
CREATE FOREIGN TABLE remote_person (
  id            SERIAL,
  person_name   TEXT         NOT NULL,
  city_id       INT4         NOT NULL
)
SERVER remote_geo_server
OPTIONS(schema_name 'geo_schema', table_name 'person');

After the execution of the above statements you have access to the three tables city, point_of_interest and remote_person with the usual DML commands SELECT, UPDATE, COMMIT, ... . Nevertheless the data keeps at the 'remote' server (10.10.10.10), queries are executed there, and only the results of queries are transfered via network to the actual instance and your client application.

SELECT count(*) FROM city; -- table 'city' resides on a different server

Bidirectional Replication (BDR) edit

BDR is an extension that allows replication in both directions between involved (master-) nodes in parallel to their regular read and writes activities of their client applications. So it realizes a multi-master replication. Actually, the project is a standalone project. But multiple technologies emerging from BDR development have already become an integral part of core PostgreSQL, such as Event Triggers, Logical Decoding, Replication Slots, Background Workers, and more.