Backup
Overview
13
Copyright
© Postgres Professional, 2015–2022
Authors: Egor Rogov, Pavel Luzanov, Ilya Bashtanov
Translated by Alexander Meleshko
Use of course materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
Disclaimer
Postgres Professional assumes no responsibility for any damages and
losses, including loss of income, caused by direct or indirect, intentional or
accidental use of course materials. Postgres Professional company
specifically disclaims any warranties on course materials. Course materials
are provided “as is,” and Postgres Professional company has no obligations
to provide maintenance, support, updates, enhancements, or modifications.
2
Topics
Logical backup
Physical backup
3
Logical backup
What is logical backup
Table backup
Database backup
Cluster backup
4
Logical backup
SQL commands to restore data from scratch
+ can backup a separate object or database
+ can recover on a cluster running another major PostgreSQL version
+ can recover on a different architecture
− low speed
There are two types of backup: logical and physical.
Logical backup is a set of SQL commands that restores a cluster (or a
database, or a separate object) from scratch.
Such a backup is, in fact, a plain text file, which gives a certain flexibility.
For example, you can make a copy of only those objects that are needed;
you can edit the file by changing the names or data types, etc.
In addition, SQL commands can be executed on a different version of the
DBMS (if there is compatibility at the command level) or on a different
architecture (so binary compatibility is not required).
However, for a large database, this mechanism is inefficient, since executing
all the commands will take a long time. Moreover, it is possible to restore the
system from such a backup only to the moment at which the backup was
made.
5
COPY: table backup
Backup
output a table or a query into a file, console or another program
Recovery
insertion of rows from a file or console into an existing table
Server variant Client variant
SQL COPY command psql \COPY command
the file must be accessible the file must be accessible
to the postgres user to the user who has launched psql
on the server on the client
If you want to save only the contents of one table, you can use the COPY
command.
The command writes a table (or the result of an arbitrary query) either to a
file or to the console, or sends it as input to another program. You can
specify parameters such as format (plain text, csv or binary), field
separators, NULL representation etc.
The alternative variant of the COPY command reads fields from a file or the
console and inserts them into a table. The table isn’t truncated, the new
rows are simply appended to the existing ones.
The COPY command is significantly faster than similar INSERT commands,
because the client does not need to access the server repeatedly, and the
server does not have to parse the commands multiple times.
In psql, there is a client version of the COPY command with a similar syntax.
Unlike the server version, which is an SQL command, the client version is a
psql command.
The file name in the SQL command corresponds to a file on the database
server. The user running PostgreSQL (usually postgres) must have access
to this file. In the client version, the file is accessed on the client, and only
the content is transmitted to the server.
7
pg_dump: database backup
Backup
outputs either an SQL script or an archive in a special format with a TOC
outputs either to the console or to a file
supports parallel execution
can define what objects to backup
(tables, schemas, only DML or only DDL, etc.)
Recovery
SQL script via psql
custom format archive via pg_restore
(can define what objects to recover and supports parallel execution)
the new database must be created from template0 database roles and
tablespaces must be created in advance
The pg_dump utility creates a full-scale database backup. Depending on the
parameters, it provides either an SQL script containing commands that
create the required objects, or an archive in a custom format with a table of
contents.
Restoring using an SQL script is as simple as executing it in psql.
Restoring from an archive is done using the pg_restore tool. It reads the
archive and translates it into regular psql commands. The advantage is that
it allows you to specify what objects to restore at the recovery stage, not just
at the backup stage. Moreover, this type of backup and recovery supports
parallel execution.
The database for recovery must be created from the database template0,
since all changes made in template1 will also be backed up. In addition, the
necessary roles and tablespaces must be created in advance, since these
objects belong to the entire cluster. After recovery, it is recommended to run
the ANALYZE command to collect fresh statistics.
8
pg_dumpall: cluster backup
Backup
makes a backup of the entire cluster, including roles and tablespaces
outputs an SQL script to the console or to a file
parallel execution is not supported, but you can dump only the global objects
and then use pg_dump
Recovery
via psql
pg_dumpall creates a backup of the entire cluster, including roles and
tablespaces.
Since pg_dumpall requires access to all objects of all databases, it should
be ran by the superuser. pg_dumpall connects to each database in the
cluster one by one and makes a backup using pg_dump. In addition, it also
stores data related to the cluster as a whole.
The result of pg_dumpall is a script for psql. Other formats are not
supported. This means that pg_dumpall does not support parallel execution,
which can be a problem for larger clusters. In this case, you can use the
--globals-only key to backup only roles and tablespaces, and then backup all
the databases using pg_dump.
10
Physical backup
What is physical backup
Cold and hot backups
Replication protocol
Standalone backup
Continuous WAL archiving
11
Physical backup
Crash recovery mode: base backup + WAL
+ recovery speed
+ can recover to a certain point in time
− cannot recover a separate database, only the cluster as a whole
− can recover only on the same architecture and major PostgreSQL version
Physical backup uses the crash recovery mechanism. This requires:
a copy of cluster files (base backup),
a set of write-ahead logs needed to restore consistency.
If the file system is already consistent (the backup was made when the
server was stopped correctly), then the WALs are not required.
However, with WALs together with the base backup, you can recover the
system state at any point in time. This way, the database can be recovered
to the state right before the crash (or at any moment before that, if needed).
High recovery speed and the ability to create a backup on the fly without
stopping the server make physical backup the main choice for routine
backup needs.
12
Hot or cold
Cold backup Hot backup
The file system is
backed up when...
the server is off the server was shut
down incorrectly
the server
is running
WALs
not required required after the last
checkpoint
required for the
duration of the
backup creation
special tools
required
stored in the
file system server must not
delete WALs too soon
Physical backup creates a copy of the file system at some point.
If the copy is created while the server is stopped, it’s called a “cold backup”.
A cold backup either contains consistent data (if the server was shut down
correctly), or contains all the logs necessary for recovery (for example, if the
OS has done a data snapshot). This simplifies recovery, but requires that
the server is stopped.
If a copy is created while the server is running (which requires certain
additional actions since you can't copy files just like that), it is called a “hot
backup”. The process is more complicated, but can be performed without
stopping the server.
During a hot backup, the file system will be inconsistent. However, the crash
recovery mechanism can also be successfully applied to backup recovery.
This will require the WALs for at least the time it takes to back up the OS
files.
13
Standalone backup
A standalone backup contains both data files and WALs
Backup via pg_basebackup
connects to the server over the replication protocol
performs a checkpoint
copies the file system into a specified directory
saves all WAL segments generated during the copying process
Recovery
deploy the standalone backup
start the server
Hot backups are created with the pg_basebackup tool.
First, it performs a checkpoint. Then, the cluster file system is copied.
All WAL files generated by the server during the time from the checkpoint to
the end of file copying are also added to the backup. The resulting backup is
called standalone because it contains all the data necessary for recovery.
All you need to restore using a standalone backup is to deploy the backup
and start the server. It will use the WALs to recover consistency on startup if
necessary, and will be ready to go.
14
Replication protocol
Protocol
receiving the WAL stream
backup and recovery control commands
Managed by the wal_sender process
wal_level = replica
Replication slot
a server object that consumes WAL records
remembers which record was read last
WAL segments are not deleted until fully read through the slot
The replication protocol allows processes to connect to the server and
collect all WAL files generated during file copying. Despite the name, the
protocol is used not only for replication (which will be discussed in the next
topic), but also for backup. The protocol can stream WAL entries while data
files are being copied.
To prevent the server from deleting WAL files too early, the replication slot
can be employed.
Establishing a connection over the replication protocol requires a certain
configuration.
First, the initiating role must have the REPLICATION attribute (or be a
superuser). This role must also have the necessary permission in the
pg_hba.conf configuration file.
Second, the max_wal_senders parameter must be set sufficiently high. This
parameter limits the number of simultaneously running wal_sender
processes serving replication protocol connections.
Third, the wal_level parameter, which determines the amount of information
in the WAL, must be set to replica.
Starting from PostgreSQL 10, the default settings already satisfy all these
requirements (for a local connection).
15
Standalone backup
base backup
+
WAL
main server
WAL segments
select, insert
update, delete
p
g
_
b
a
s
e
b
a
c
k
u
p
The image on the left shows the main server. It processes incoming queries.
At the same time, WAL records are formed and the state of the databases
changes (first in the buffer cache, then on disk). WAL segments are
cyclically overwritten (that is, old segments are deleted as new ones are
created, since the file names are unique).
At the bottom of the picture is a backup archive (usually located on another
server). It contains a base copy of the data and a set of WAL files.
17
Recovery
main server
WAL segments
select, insert
update, delete
select, insert
update, delete
backup server
base backup
+
WAL
During recovery, the base backup, including the necessary WAL files, is
deployed, for example, on another server (shown on the right).
After the server starts, it restores consistency and is ready to go.
The system is recovered to the point in time when the backup was made.
Of course, the main server can go far ahead in the meantime.
19
WAL archive
File archive
WAL segments are archived as they are filled
controlled by the server
archiving happens with a delay
Streaming archive
a stream of WAL records is continuously recorded into the archive
external tools required
delays are minimal
The hot backup concept can be improved upon even further. Since we have
a copy of the file system and WALs, then by constantly saving new logs, we
will be able to restore the system not only at the time of copying files, but
also at any point in time after that.
There are two ways to go about it. The first is to archive old WAL files before
reusing them. There's a server setting for that. Unfortunately, with this
option, an incomplete WAL file will not be archived until the server switches
to writing to another WAL file.
The second way is to continuously read WAL entries using the replication
protocol and write them into the same archive. This way, delays are minimal,
but a separate tool is needed to receive the data stream and archive it.
20
WAL file archive
archiver process
Parameters
archive_mode = on
archive_command a shell command to copy a WAL segment
to a separate storage
Algorithm
when switching to a new WAL segment, the archive_command command is
called for the filled segment
if the command terminates with the status 0, the segment is deleted
if the command returns anything else (in particular, if the command is not
specified), the segment remains until the attempt is successful
The WAL files archive is managed by the archiver background process.
An arbitrary shell command to be used for copying can be defined in the
archive_command parameter. The mechanism itself is enabled by the
parameter archive_mode = on.
The algorithm goes as follows. When a WAL segment is filled, the copy
command is called. If it terminates with a 0 status, the segment can be
deleted safely. Otherwise, the segment (and the ones following it) will not be
deleted, and the server will periodically try to execute the command until it
returns 0.
21
WAL file archive
archive_command
WAL archive
+
base backup
main server
WAL segments
select, insert
update, delete
continuous
archiving
This figure shows the main server with continuous archiving set up. Filled
WAL segments are copied to a separate archive using the command defined
by the archive_command parameter. Usually, the archive is located on a
separate server, and it also stores the base backup (or several, from
multiple points in time).
22
Streamed WAL archive
pg_receivewal
connects over the replication protocol (can use a replication slot)
and streams WAL records into segment files
the starting position is the beginning of the segment following the last filled
segment in the directory, or the start of current segment, if the directory is
empty
unlike the file archive, records are added continuously
parameters have to be reconfigured when changing servers
Another solution is to use the pg_receivewal utility to write segments to the
archive using the stream replication protocol.
pg_receivewal usually runs on a separate “archive” server and connects to
the main server with the parameters specified in the command line keys.
It can (and should) use the replication slot in order to ensure that records
are not lost.
pg_receivewal generates files in the same way as the server does, and
writes them to the specified directory. Segments that have not yet been filled
up are written with the .partial prefix.
Archiving always starts from the beginning of the segment following the last
filled archive segment. If the archive is empty (first run), archiving starts
from the beginning of the current segment.
When switching to a new server, pg_receivewal must be stopped and
restarted with new parameters.
Note that the utility itself does not start automatically (as a service) and does
not run as a daemon.
23
Streamed WAL archive
WAL archive
main server
WAL segments
select, insert
update, delete
wal sender
pg_receivewal
pg_receivewal connects to the server over the stream replication protocol.
The connection is handled by a separate wal sender process (this must be
taken into account when setting the max_wal_senders parameter).
pg_receivewal records data without waiting for the entire segment to be
received.
24
Configured continuous WAL archiving
Backup via pg_basebackup
connects to the server over the replication protocol
performs a checkpoint
copies the file system into a specified directory
Recovery
deploy the backup
set configuration parameters
(command to read WAL from the archive, set the target recovery point)
create a recovery.signal file
start the server
Backup + archive
WAL segments
are not required
To create a backup with continuous archiving configured, the same
pg_basebackup tool is used, only with a different set of parameters. The
only difference is that the WAL files are not saved to the backup, since they
are already in the archive.
Recovery is more complicated in this case. In addition to deploying the base
backup, some recovery settings must be specified:
restore_command parameter (inverse of archive_command, copies files
from the archive to the server),
target recovery point.
In addition, a recovery.signal file is needed. If present at server startup, the
file tells the server to enter the managed recovery mode (the contents of the
file are ignored).
25
Recovery
restore_command
backup server
WAL archive
+
base backup
main server
WAL segments
didn’t make it
into the archive
The recovery procedure (for example, after main server crash) is performed
as follows. A base backup is deployed on another (or the same) server and
a recovery.signal file is created. The server starts up and starts reading WAL
segments from the archive using restore_command and applying them.
Note that during file archiving, the last unfilled WAL segment at the main
server will not be archived. However, the segment can be manually added to
the pg_wal directory on the backup server, if necessary. Of course, there
may be several such unarchived segments, but only in case of some kind of
failure during archiving.
26
Recovery
restore_command
backup server
WAL archive
+
base backup
main server
WAL segments
target
recovery point
The backup server reads WAL segments from the pg_wal directory and
applies them (in the absence of a segment, making an attempt to copy it
from the archive), ultimately bringing the state of the databases up to date.
The maximum possible loss is the last unfilled WAL segment that has not
been archived, and only if it cannot be copied manually for some reason.
By default, all available log entries are applied. If a target recovery point is
specified, recovery will stop after reaching it.
27
Recovery
select, insert
update, delete
archive_command
backup server
WAL archive
+
base backup
main server
WAL segments
After that, the backup server goes into normal operation: processing
incoming queries, archiving new WAL segments, and so on.
The recovered server can act as the primary server from now on, but in this
case it should be deployed on sufficiently powerful hardware in the first
place to avoid performance degradation.
28
Takeaways
Logical backup
creates SQL commands to recover the state of database objects
copy command, pg_dump and pg_dumpall utilities
Physical backup
creates a copy of the cluster files + a set of WAL files
pg_basebackup utility
WAL segments archive
file or stream
can restore the system to an arbitrary point in time
29
Practice
1. Create a database and a table in it with several rows.
2. Make a logical backup of the database using pg_dump.
Delete the database and restore it from the backup you made.
3. Make an standalone physical backup of the cluster using
pg_basebackup.
Modify the table.
Recover into a new cluster from the backup you made and verify
that the recovered database does not contain any of the later
changes.
3. A replica cluster has already been created in the training VM on port
5433. Use this cluster to recover into.
The cluster directory is located at /var/lib/postgresql/13/replica.
To connect, specify the port number: psql -p 5433
30
Practice+
1. Set up stream archiving on the main cluster using pg_receivewal.
2. Create an standalone backup of the main cluster (without WAL)
using pg_basebackup.
3. In the main cluster, create a database and a table in it.
4. Restore the replica cluster from the base copy using the archive.
Verify that the database and the table are also restored.
The replica cluster directory is /var/lib/postgresql/13/replica.
The current file recorded by pg_receivewal has the .partial suffix, and is
renamed at the end of recording. When recovering, use the partial file along
with the usual segments.
To connect to the replica cluster, specify the port number: psql -p 5433