Replication
Overview of Physical Replication
PostgreSQL 16
Copyright
© Postgres Professional, 2017–2025
Authors: Egor Rogov, Pavel Luzanov, Ilya Bashtanov, Igor Gnatyuk
Translated by: Liudmila Mantrova, Alexander Meleshko, Elena Sharafutdinova
Photo: Oleg Bartunov (Phu Monastery and Bhrikuti Peak, Nepal)
Use of Course Materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
edu@postgrespro.ru
Disclaimer
Postgres Professional assumes no responsibility for any damages and losses,
including loss of income, caused by direct or indirect, intentional or accidental
use of course materials. Postgres Professional company specifically disclaims
any warranties on course materials. Course materials are provided “as is”, and
Postgres Professional company has no obligations to provide maintenance,
support, updates, enhancements, or modifications.
Topics
Replication Types
Physical Replication
Log Levels
Replication Use Cases
Switching to Standby
Replication Types
Data synchronization between servers
Purposes
fault tolerance, high availability
scalability
Physical replication
synchronization at the level of pages and row versions
Logical replication
synchronization at the level of table rows
A single database server may not meet the requirements placed on the system.
A single server is a potential point of failure. Two (or more) servers allow the
system to remain available after a failure (fault tolerance) or, more broadly, in
any scenario, including scheduled maintenance (high availability).
One server may also be unable to handle the load. Scaling up (upgrading server
resources) can be inefficient or even impossible, but the workload can be
distributed across multiple servers (scaling out).
In both cases, the solution is to have multiple servers managing the same data.
Replication is the process of keeping this data synchronized.
Depending on the level at which synchronization occurs, there are two types
of replication: physical replication synchronizes changes at the level of data
pages and transaction statuses, while logical replication synchronizes changes
at the level of table rows.
Physical Replication
Mechanism
one server transfers WAL records to another server, and the second server
replays the received records
Features
primary-standby: data flow in one direction only
binary server compatibility is required
only the cluster as a whole can be replicated
The idea of physical replication is that one server transfers WAL records to
another, and the second server replays them just as it would during crash
recovery.
During physical replication, servers have assigned roles: primary and
standby. The primary transfers WAL records to the standby (as files or as a
stream of records). The standby applies these records to its data files.
The WAL record application is purely mechanical, without “understanding
the meaning” of the changes, so binary compatibility between servers is
important (the same platform and major PostgreSQL version). Since the
WAL is shared across the entire cluster, only the cluster as a whole can be
replicated.
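On a running standby, the replay process can be observed directly. As a sketch
(standard functions, run on the standby):

SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();
-- the gap between the received and replayed positions is the replay lag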
Physical Replication
[Diagram: streaming replication. Clients run select/insert/update/delete on the
main server (primary), whose wal sender process streams WAL segments to the
wal receiver process of the backup server (standby); the startup process applies
them. A WAL archive can serve as an alternative WAL source.]
To set up replication between two servers, we create a replica from a
physical backup of the primary server. Normally, restoring such a backup
would create a new independent server. However, when replication is
enabled, the standby server operates in continuous recovery mode: it
constantly applies new WAL records received from the primary server
(handled by the startup process). This way, the replica is constantly
maintained in an almost up-to-date state.
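As a minimal sketch of this setup (the host name, user name, and data directory
are placeholders):

pg_basebackup --host=primary.example.com --username=replicator \
  --pgdata=/var/lib/postgresql/standby --write-recovery-conf
pg_ctl start -D /var/lib/postgresql/standby

The --write-recovery-conf (-R) switch stores primary_conninfo in
postgresql.auto.conf and creates the standby.signal file, so the restored server
starts in continuous recovery mode as a standby rather than as an independent
server.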
There are two ways to deliver WAL records from the primary to the standby. The
one used more commonly in production is streaming replication.
In this case, the replica (walreceiver process) connects to the primary
(walsender process) via the replication protocol and receives the WAL
record stream. This minimizes the replica lag and can even eliminate it
entirely (in synchronous mode).
If the system is set for continuous archiving, file-based replication is
possible. In this case, the replica will lag noticeably as the file archive is
updated only when a WAL segment is switched.
In practice, file-based replication is used in addition to streaming replication.
If the replica cannot receive the next WAL record via the replication protocol, it
will try to read it from the archive.
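For example, a standby can combine both delivery methods; a sketch with a
placeholder connection string and archive path:

primary_conninfo = 'host=primary.example.com user=replicator'
restore_command = 'cp /mnt/wal_archive/%f %p'

With both parameters set, the startup process turns to the archive whenever the
required WAL records cannot be obtained by streaming.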
WAL Levels
wal_level parameter (minimal < replica)
minimal: crash recovery only
replica: crash recovery, restore from backup, replication
Since the standby only receives the information contained in the WAL, all data
necessary for synchronization must be recorded into the WAL.
The amount of data stored in each WAL record is controlled by the wal_level
parameter.
Prior to PostgreSQL 10, the default level was minimal, which guaranteed
only crash recovery. Replication cannot function at this level because some
changes are written directly to persistent storage for reliability, bypassing
the WAL.
In PostgreSQL 10 and later, the default level is replica. At this level all data
changes are recorded into the WAL, which enables restoring the system from
pg_basebackup hot backups as well as physical streaming replication. Since
backups and replication are features in high demand, the default level
was switched to replica.
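To check or change the level (note that changing wal_level takes effect only
after a server restart):

SHOW wal_level;
ALTER SYSTEM SET wal_level = 'replica';
-- restart the server for the change to take effect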
Standby Usage
Allowed
read-only queries (SELECT, COPY TO, cursors)
setting server parameters (SET, RESET)
transaction management (BEGIN, COMMIT, ROLLBACK...)
creating a backup (pg_basebackup)
Not allowed
any changes (INSERT, UPDATE, DELETE, TRUNCATE, nextval...)
locks taken in anticipation of changes (SELECT FOR UPDATE...)
DDL commands (CREATE, DROP...), including creating temporary tables
maintenance commands (VACUUM, ANALYZE, REINDEX...)
access control (GRANT, REVOKE...)
triggers and advisory locks do not work
By default, the standby operates in the hot standby mode. In this mode,
client connections are allowed but restricted to read-only operations. Setting
server parameters and transaction management commands will also work.
For example, you can start a (read-only) transaction with a specific isolation
level.
In addition, the standby can also be used for making backups (taking into
account the possible lag behind the primary).
In hot standby mode, the standby does not accept any data changes (including
sequence updates), locks taken in anticipation of changes, DDL commands,
maintenance commands such as VACUUM, ANALYZE, and REINDEX, or access
control commands. Basically, anything that changes the data in any way is
rejected.
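A quick way to see these restrictions on a hot standby (the table name here is
hypothetical):

SELECT pg_is_in_recovery();
-- returns true on a standby
INSERT INTO accounts VALUES (1);
-- ERROR:  cannot execute INSERT in a read-only transaction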
If required, the standby can be run in warm standby mode by setting the
parameter hot_standby = off. In this case, client connections will be
completely disabled.
Standby Usage
data storage reliability
[Diagram: synchronous replication between the main server (primary, wal
sender) and the backup server (standby, wal receiver and startup).]
The replication mechanism offers flexible system design options for a variety
of applications. Let’s consider several typical cases and possible solutions.
One of the key objectives is ensuring data storage reliability.
As a reminder, transaction commits can operate in synchronous or
asynchronous modes. In synchronous mode, the commit is not completed
until the data is safely written to persistent storage. In asynchronous mode,
there is a risk of losing some committed data, but commits do not wait for
disk writes, improving system performance.
A similar principle is applied to replication: in synchronous mode
(synchronous_commit = on), when a standby is present, the commit waits
not only for the WAL to be written to disk but also for confirmation that the
WAL records have been received by the synchronous standby. This further
enhances reliability (ensuring data is not lost even if the primary server fails)
but also increases latency, slowing down the system.
There are also intermediate configuration options that do not provide
absolute reliability guarantees but still reduce the risk of data loss.
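A sketch of enabling synchronous replication on the primary; 'standby1' is a
placeholder that must match the standby's application_name:

ALTER SYSTEM SET synchronous_standby_names = 'standby1';
ALTER SYSTEM SET synchronous_commit = on;
-- remote_write is a weaker intermediate level (waits only until the standby's
-- OS receives the WAL); remote_apply is stronger (waits until records are applied)
SELECT pg_reload_conf();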
Standby Usage
long-running analytical queries (reports)
[Diagram: a standby serving reports. The standby may fall behind: conflicting
WAL records are deferred. Conflict sources: removal of row versions by
vacuuming, and exclusive locks.]
As mentioned earlier, long-running queries hold back the vacuum horizon,
preventing the removal of obsolete row versions. If certain tables are being
actively modified during this time, they can grow significantly in size.
This is why standbys are often used for long-running analytical queries.
A subtle issue arises when WAL records from the primary server conflict with
queries running on the standby. There are two main sources of such
records:
1. The primary server removes row versions that are no longer needed there
but are still required by queries on the standby.
2. Exclusive locks on the primary that are incompatible with queries on the
standby.
So, a standby used for reports is typically configured to accept WAL records
from the primary but delay their application if they conflict with running
queries. This means that the standby's data may lag behind the primary, but for
analytical workloads this is acceptable.
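A sketch of such a configuration, run on the reporting standby:

ALTER SYSTEM SET max_standby_streaming_delay = -1;
-- -1 means conflicting WAL records are deferred indefinitely
SELECT pg_reload_conf();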
Multiple Standbys
[Diagram: two backup servers (standby A and standby B), each with its own wal
receiver and startup processes, stream WAL segments from the primary's wal
sender processes and send feedback back to the primary.]
Multiple standbys can be connected to the primary server to distribute OLTP
read workloads.
OLTP queries should not be long-running. This enables effective use of
replication protocol feedback between standbys and the primary server. In
this case, the primary server maintains awareness of the transaction horizon
required by queries on standbys, preventing vacuum from removing needed
row versions. Essentially, this feedback mechanism achieves the same
result as if all queries were executing locally on the primary server.
However, replication provides only the basic mechanism. External tools
(load balancers) are required for automatic workload distribution. It is
important to note that data consistency between the primary and standbys is
not guaranteed — even in synchronous replication. Applications reading
from a single server will, of course, maintain consistency, but consistency is
no longer guaranteed when reading from multiple servers simultaneously.
Replicas may return either stale data or changes not yet visible on the
primary. These topics are discussed in detail in the DBA3 Backup and
Replication course.
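A sketch of enabling the feedback and inspecting the attached standbys:

-- on each standby:
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();

-- on the primary, one row per connected standby:
SELECT application_name, state, sync_state FROM pg_stat_replication;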
Cascading Replication
no additional load on the primary and redistribution of network traffic
[Diagram: cascading replication. Standby A receives the WAL stream from the
primary and relays it through its own wal sender to standby B.]
Multiple standbys connected to a single primary server will generate
additional load on it. Network load should also be considered when
transmitting multiple copies of the WAL stream.
To reduce this load, standbys can be arranged in a cascade configuration,
where servers relay WAL records to each other in a chain. The further
downstream a standby is from the primary, the more lag may accumulate in its
replicated data.
Note that cascaded synchronous replication is not supported — the primary
can only synchronize with directly connected standbys. However, the
primary collects feedback from all standbys in the cascade.
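Setting up a cascade amounts to pointing a standby's connection string at
another standby instead of the primary; the host name below is a placeholder:

primary_conninfo = 'host=standby-a.example.com user=replicator'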
Delayed Replication
“time machine”: can recover to a specific point in time without a WAL archive
WAL application delay
[Diagram: delayed replication. The standby receives WAL segments from the
primary but applies the records with a delay.]
A useful feature is the ability to view data as of an arbitrary point in time and
to recover to that point. It is particularly useful for recovering from user errors,
when incorrect actions need to be rolled back.
The regular archive-based point-in-time recovery mechanism can work here,
but it requires a lot of preparation and takes a lot of time, and PostgreSQL itself
provides no way to take a data snapshot as of a given moment in the past.
The solution is to have a standby apply WAL records not immediately, but
with a certain delay.
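A minimal sketch, set on the standby (the delay value is arbitrary):

ALTER SYSTEM SET recovery_min_apply_delay = '8h';
SELECT pg_reload_conf();

WAL records are still received immediately; only their application is postponed,
so the standby keeps a consistent view of the recent past.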
In this course, we do not cover the required configurations for each of the
provided options. For detailed information, refer to the DBA3 Backup and
Replication course.
Switching to Standby
Scheduled switchover
shutdown of the main server for maintenance without interruption of service
manual mode
Emergency switchover
switch to a standby due to a primary server failure
manual mode,
but can be automated with external cluster software
An existing standby can be used to replace the primary server.
There are different reasons to switch over to a backup server. If the primary is
being shut down for maintenance, the switchover can be performed routinely
at a convenient time. In case of a primary failure, on the other hand, the
switchover has to be performed as quickly as possible to minimize service
downtime.
Even in case of a failure, switchover is performed manually unless
specialized cluster software is used to monitor server status and initiate the
switchover automatically.
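A standby is promoted to primary with either of the following (the data
directory path is a placeholder):

pg_ctl promote -D /var/lib/postgresql/standby

-- or from a SQL session on the standby:
SELECT pg_promote();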
Takeaways
The physical replication mechanism works by delivering
WAL records to the standby and applying them there
streaming WAL records or transferring files
Physical replication creates an exact copy of the entire cluster
unidirectional, requires binary compatibility
core mechanism for solving multiple use cases
Practice
1. Set up physical streaming replication between the two servers in
synchronous mode. Verify that replication works as intended.
Make sure that when the standby is stopped, commits on the
primary are not completed.
2. By default, conflicting WAL records on the standby are delayed
for up to 30 seconds. Disable the delay and verify that
long-running queries on the standby are canceled if the primary
deletes and vacuums required row versions.
Then enable feedback and confirm that it prevents cancellations
by delaying primary vacuums.
1. To do this, set the following parameters on the primary using ALTER
SYSTEM:
synchronous_commit = on
synchronous_standby_names = '"16/replica"'
2. The max_standby_streaming_delay parameter defines how long the
standby will wait for conflicting queries to complete before canceling them.
Set it to 0. Enable feedback by setting hot_standby_feedback = on.
Apply both settings using ALTER SYSTEM and reload the configuration.
To simulate a long-running query on a small dataset, use pg_sleep() in
queries.
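One possible way to apply the settings from the hints above:

-- task 1, on the primary:
ALTER SYSTEM SET synchronous_commit = on;
ALTER SYSTEM SET synchronous_standby_names = '"16/replica"';
SELECT pg_reload_conf();

-- task 2, on the standby:
ALTER SYSTEM SET max_standby_streaming_delay = 0;
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();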