PostgreSQL/Architecture Cluster
Overview
editA Server
, which is some hardware, a container, or a VM, contains one or more Database Clusters (Cluster
for short). Every cluster is controlled by exactly one instance. If there are many clusters and instances on the same server, the ports of the instances must differ from each other as well as the root directories of the clusters.
Each newly created cluster contains the three databases
template0, template1, and postgres, each of the three databases contain the schema public as well as the system schemas pg_catalog, information_schema, pg_temp, and some more. Tables, views, and most other SQL objects reside in such schemas.
DBAs can create more clusters, databases, schemas, or SQL objects.
Initialization Phase
editClusters are created with the command initdb
. template0 is the very first database during the creation phase of any cluster. In a second step, database template1 is generated as a copy of template0, and finally database postgres is generated as a copy of template1. Later, the DBA can create more databases within that cluster, e.g.: my_db, with the command createdb
. Just like at the beginning, the new database will be a copy of template1. Due to the unique role of template0 as the pristine original of all other databases, no client is allowed to connect to it and modify it. But the DBA can change template1.
Connections
editClient connections act at the database level and can access data and SQL objects within any schema of the connected database, as far as they are permitted to do so. If they need access to any object of a different database within the same or another cluster, special techniques like foreign-data wrapper
(FDW) or dblink
are required (or they use multiple connections and synchronize them at the client-side).
SQL objects
editWe use the term SQL object for all objects which you can create with the SQL command CREATE ...
, e.g.: database, schema, table, view, materialized view, index, constraint, sequence, function, procedure, trigger, role, data type, domain, operator, tablespace, extension, foreign-data wrapper, and much more. Such SQL objects are arranged in a hierarchical manner:
- Database names, tablespaces, and roles (users) are known at the cluster level. E.g.: As mentioned above, a connection works at the database level. Nevertheless, when you create a new role with such a connection, the role is also known by all other databases of the same cluster.
- Extensions, e.g.: PostGIS, reside at the database level. After installing an extension, all schemas of this database can use it. But within the other databases of the same cluster, the extension is not known.
- Schemas are part of a database. Some of them are predefined.
- pg_catalog is a schema with tables that describe most of the SQL objects of that database, especially all tables and views. They even describe themself. information_schema is a similar schema. It contains several tables and views of pg_catalog in a way that conforms to the SQL standard.
- public acts as the default schema. It should not contain user-defined SQL objects. Instead, it is recommended to create one or more additional schemas to manage application-specific objects like tables or triggers. To access objects in such additional schemas they can be fully qualified, e.g. my_schema.my_table, or by changing the
search_path
. - There are different types of SQL objects within a schema: 'relation'-like objects (table, view, materialized view, index, sequence, foreign-table), function, procedure, trigger, constraint, data type, domain, operator, and more.
- SQL objects in one schema are different from SQL objects in different schemas, even if they use the same name, e.g.: table t1 in my_schema1 is different from t1 in my_schema2.
- The names of 'relation'-like objects, data types, and domains are unique within their schema: e.g.: you cannot have a table employee and a view employee in the same schema.