Eliminating Bloat edit
As we have seen in the MVCC chapter, the database tends to occupy more and more disk space caused by bloat: over time more and more logically deleted but physically existing old versions of rows arise within heap and index files. This chapter explains how the SQL command VACUUM and the automatically running Autovacuum processes clean up files and thereby prevent their endless growth.
One process of the Instance is the Autovacuum daemon. It continuously monitors the state of all databases based on values that are collected by the Statistics Collector and starts Autovacuum processes whenever it detects certain situations, e.g.: a huge number of modifications within a table. This leads to the intended dynamic behavior of PostgreSQL: Only when it is necessary, Autovauum cleans up the files. In addition, client processes can issue the SQL command VACUUM at any time. DBAs do this interactively when they recognize critical situations, or they start it in periodically running batch jobs. In most cases, this is not necessary because of the constantly running Autovacuum daemon.
The central value to determine which of the physically existing row versions are no longer needed is
xmax, which shows what transaction has deleted the row. The elimination operation must evaluate it against several criteria which must all apply:
xmaxmust be different from zero because a value of zero indicates that the row version is not deleted.
xmaxmust contain an XID which is older than the oldest XID of all currently running transactions. That guarantees that no existing or upcoming transaction will have read or write access to this row version.
- The transaction of
xmaxmust be committed. If it is still running or was rollback-ed, this row version is treated as valid (not deleted).
- If there is a situation that the row version is part of multiple transactions, more actions must be taken.
When the vacuum operation detects such an outdated row version, it marks its space as free for future use of write actions, optimizes the physical arrangement of the remaining versions on the page, and removes index tuples pointing to the removed row version. But only in rare situations (or in the case of VACUUM FULL), this space is released to the operating system. In most cases, it remains occupied by the database and will be used by future INSERT, UPDATE or DELETE commands. Hence, even with successful running Autovacuum, the size of files does not shrink; but they have more respectively huger 'holes' where coming data can be stored. Only after this 'hole-space' is exhausted (per page), it gets necessary to claim new disc space from the operating system to store new data.
An exception to this conservative behavior is the SQL command VACUUM FULL. It creates a new file at the operating system level, copies all valid row versions to it with no extra space by ignoring the 'holes', and deletes the old file. But it is slower and requires an exclusive lock on the affected tables.
Because vacuum operations typically are I/O intensive, which can hinder other activities, Autovacuum avoids performing many vacuum operations in bulk. Instead, it carries out many small actions with time delays in between. The SQL command VACUUM runs immediately and without any time delay.
More Actions edit
VACUUM, as well as Autovacuum, don't just eliminate bloat. They perform additional tasks for minimizing future I/O activities of themselves as well as of other processes. This extra work can be done in a very efficient way since in most cases the expensive physical access to pages has taken place anyway. The additional operations are:
- Freeze: It marks certain row versions as frozen. This means that they are treated as 'valid' (visible) forever, independent from the wraparound problem (see later).
- Visibility Map and Free Space Map: It logs information about the state of the handled pages in two additional files, the Visibility Map and the Free Space Map.
- Statistics: Similar to the Statistics Collector it collects statistics about the number of rows per table, the distribution of values, and so on, as the basis for decisions of the query planner.