PostgreSQL/WAL
WAL (write-ahead logging) files are files in which PostgreSQL records changed data values in a binary format. This information is additional to, and in this respect redundant with, the information in the database files. WAL files can be thought of as a specific kind of 'diff' file.
WAL files are used for:
- Recreation of a consistent state of the database after a system crash
- Backup and restore using continuous archiving
- Replication
Usage
Writing to WAL files is very fast because they are always written sequentially. This is particularly true for traditional disks with rotating platters and moving read/write heads. In contrast to WAL files, database files are organized in special structures such as trees, which may have to be reorganized during write operations or which contain pointers to other blocks at distant positions. Writes to database files are therefore much slower.
For these performance reasons, when a client requests a write operation such as UPDATE or DELETE, the modifications to the data are carried out in a specific sequence and, in part, asynchronously to the client request. First, the changes are written and flushed to the WAL files. Second, they are stored in shared buffers in RAM. Finally, they are written from shared buffers to the database files. The client does not wait for all of these operations to finish: after the first two, very fast steps it is informed that its request is complete. The third operation is performed asynchronously at a later (or earlier) point in time.
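The sequence above can be illustrated with a toy key-value store. This is only a sketch of the general write-ahead idea, not PostgreSQL's implementation: the class ToyStore, its file names, and the JSON record format are all hypothetical and chosen for readability.

```python
import json
import os
import tempfile


class ToyStore:
    """Toy key-value store following the write sequence described above."""

    def __init__(self, directory):
        # Hypothetical file names; PostgreSQL uses a very different layout.
        self.wal_path = os.path.join(directory, "toy.wal")
        self.data_path = os.path.join(directory, "toy.data")
        self.buffers = {}  # stands in for the shared buffers in RAM
        self.recover()

    def update(self, key, value):
        # Step 1: append the change to the log and flush it to disk.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps({"key": key, "value": value}) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # Step 2: apply the change in memory.
        self.buffers[key] = value
        # The caller is acknowledged here; the data file is still untouched.
        return "OK"

    def checkpoint(self):
        # Step 3: write the in-memory state to the data file.
        tmp_path = self.data_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as data:
            json.dump(self.buffers, data)
            data.flush()
            os.fsync(data.fileno())
        os.replace(tmp_path, self.data_path)
        # The logged changes are now in the data file, so the log is emptied.
        open(self.wal_path, "w", encoding="utf-8").close()

    def recover(self):
        # Start from the last checkpointed state, then replay the log.
        self.buffers = {}
        if os.path.exists(self.data_path):
            with open(self.data_path, encoding="utf-8") as data:
                self.buffers = json.load(data)
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as wal:
                for line in wal:
                    record = json.loads(line)
                    self.buffers[record["key"]] = record["value"]


def demo():
    with tempfile.TemporaryDirectory() as directory:
        store = ToyStore(directory)
        store.update("a", 1)
        store.checkpoint()
        store.update("b", 2)  # logged, but not yet in the data file
        # Simulate a crash: drop the in-memory state and start again.
        restarted = ToyStore(directory)
        return restarted.buffers
```

Calling demo() returns both keys after the simulated crash: "a" comes from the data file and "b" is recovered by replaying the log. This is why the store can acknowledge a write before the data file has been updated.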
Removal
WAL files are collected in the directory pg_wal (pg_xlog in PostgreSQL versions prior to version 10). Depending on the write activity on the database, the total size of all WAL files may increase dramatically. Therefore the system must delete them when they are no longer needed. WAL files become available for deletion once the changes in the shared buffers that correspond to their content have been flushed to the database files. Because this criterion is guaranteed to be met after a CHECKPOINT, there are some dependencies between WAL file deletion and CHECKPOINTs:
- You can define a limit for the total size of all files in the directory: max_wal_size. If it is reached, PostgreSQL performs an automatic CHECKPOINT operation.
- You can define a checkpoint_timeout in seconds. No later than this number of seconds after the previous checkpoint, PostgreSQL performs an automatic CHECKPOINT operation.
In both cases the shared buffers are written to disk, a checkpoint record is written to the current WAL file, and all older WAL files become eligible for deletion.
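The two triggers can be summarized in a small decision function. This is a simplified model that follows the description above, not PostgreSQL's actual scheduling logic; the class CheckpointPolicy and its field names are hypothetical, and only max_wal_size and checkpoint_timeout correspond to real settings.

```python
from dataclasses import dataclass


@dataclass
class CheckpointPolicy:
    max_wal_size_bytes: int      # models max_wal_size
    checkpoint_timeout_s: float  # models checkpoint_timeout

    def checkpoint_due(self, wal_bytes, seconds_since_last):
        """Return the reason a checkpoint is due, or None if none is due."""
        if wal_bytes >= self.max_wal_size_bytes:
            return "wal size"
        if seconds_since_last >= self.checkpoint_timeout_s:
            return "timeout"
        return None
```

With illustrative values of 1 GiB and 300 seconds, CheckpointPolicy(1024**3, 300).checkpoint_due(1024, 301) reports "timeout": even a nearly idle system checkpoints once the timeout has elapsed.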
The deletion of WAL files may be prevented by other criteria, especially by failing archive commands. max_wal_size is a soft limit and can be silently exceeded by the system in such situations.
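The interaction between checkpoints and archiving can be sketched as a filter over WAL segments. This is a deliberately reduced model, assuming that only two criteria matter (position relative to the last checkpoint, and archive status); the function name and the segment names in the usage note are hypothetical, and other criteria that PostgreSQL applies are ignored.

```python
def removable_segments(segments, checkpoint_segment, archived):
    """Return segments older than the checkpoint that are already archived.

    segments: segment names in the order they were written
    checkpoint_segment: the segment holding the latest checkpoint record
    archived: set of segment names whose archive command succeeded
    """
    cutoff = segments.index(checkpoint_segment)
    # Segments before the checkpoint are candidates; unarchived ones are kept.
    return [name for name in segments[:cutoff] if name in archived]
```

For example, with segments ["seg1", "seg2", "seg3", "seg4"], the checkpoint record in "seg3", and only "seg1" archived, the function returns ["seg1"]. The segment "seg2" stays on disk until its archive command succeeds, which is how a failing archive command lets the directory grow beyond max_wal_size.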