Data Access
04. Parallel Data Access
16
Copyright
© Postgres Professional, 2019–2024
Authors: Egor Rogov, Pavel Luzanov, Ilya Bashtanov
Photo by: Oleg Bartunov (Phu monastery, Bhrikuti summit, Nepal)
Use of course materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
edu@postgrespro.ru
Disclaimer
In no event shall Postgres Professional company be liable for any damages
or loss, including loss of profits, that arise from direct or indirect, special or
incidental use of course materials. Postgres Professional company
specifically disclaims any warranties on course materials. Course materials
are provided “as is,” and Postgres Professional company has no obligations
to provide maintenance, support, updates, enhancements, or modifications.
Topics
Parallel Execution Plans
Process Pool Size
Parallel Sequential Scan
Parallel Index Access
Parallel Execution Plans
Leader process
Executes the sequential portion of the execution plan.
Launches worker processes and gathers data from them.
Worker processes
Work simultaneously on the parallel portion of the execution plan.
[Figure: the leader process executes the sequential part of the execution plan and the Gather node; two worker processes execute the parallel part of the plan. The leader joins the parallel part if it has no other tasks.]
PostgreSQL supports parallel execution of queries. The main process
executing the query spawns (via postmaster, naturally) multiple worker
processes that simultaneously execute the same 'parallel' portion of the
execution plan. The results are then gathered at the Gather node by the
leader process.
If the worker processes can't keep up with supplying data to the main
process, the main process also joins the execution of the same parallel plan.
Of course, launching processes and data transfer require certain resources,
so not every query runs in parallel.
Besides, there are operations that simply can't be executed in parallel. Even
with the parallel mode enabled, the leader process will still execute some of
the steps alone, sequentially.
Note that PostgreSQL does not have another theoretically possible
parallelization mode where multiple processes act as a pipeline for data
processing (in other words, individual plan nodes are executed by separate
processes). PostgreSQL developers considered this mode inefficient.
Process Pool Size
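[Figure: the pool of worker processes. max_worker_processes = 8 limits all worker processes (parallel query execution and other operations), max_parallel_workers = 8 limits those used for parallel query execution, and max_parallel_workers_per_gather = 2 limits those attached to a single Gather node.]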
Several parameters govern parallel execution.
First, let's look at the parameters that control the number of worker
processes.
The worker process mechanism is not used only for parallel query
execution. Worker processes are also used by logical replication and may be
created by extensions, and they can be used in application code
(see the "Background Processes" topic in the DEV2 course for more
details). The total number of concurrently running worker processes is
limited by the max_worker_processes parameter (default 8).
The number of concurrently running worker processes handling parallel
plans is limited by the max_parallel_workers parameter (default 8).
The number of concurrently running worker processes handling a single
leader process is limited by the max_parallel_workers_per_gather
parameter (default 2).
These values should be adjusted based on hardware capabilities, data
volume, and system load. For instance, even if the database contains large
tables and queries could benefit from parallelization, parallel execution
is pointless when the system has no free cores.
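As a minimal sketch, the three limits can be inspected and the per-Gather limit changed for a single session. The value 4 below is an arbitrary illustration, not a recommendation.

```sql
-- Current limits (the defaults named above are 8, 8 and 2).
SHOW max_worker_processes;
SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;

-- Allow up to 4 workers per Gather node, for the current session only.
-- The value 4 is an arbitrary example.
SET max_parallel_workers_per_gather = 4;

-- Return to the configured value.
RESET max_parallel_workers_per_gather;
```

Note that max_worker_processes can only be set at server start, so it cannot be changed with SET.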
Parallel Seq Scan
Parallel sequential scan
Parallel aggregation
Number of worker processes
Constraints
Parallel Seq Scan
[Figure: the plan Finalize Aggregate, Gather, Partial Aggregate, Parallel Seq Scan, with the Partial Aggregate and Parallel Seq Scan nodes executed by three processes. Pages are read in sequence, but by different processes.]
An example of a node running in parallel mode is the Parallel Seq Scan —
"parallel sequential scan".
The name may seem contradictory, but it captures the essence of the
operation. Table pages are read in the same order as they would be during a
regular sequential scan. However, read requests are handled by several
processes running in parallel. Processes synchronize with each other to
ensure their requests are processed in the correct order.
The benefit of this approach is that parallel processes handle their pages
simultaneously. In order for the benefit to outweigh the overhead associated
with transferring data between processes, the processing must be
sufficiently resource-intensive. A good example is data aggregation, because
it demands significant CPU resources and only a single final number needs
to be transferred. In such cases, parallel query execution can take
significantly less time than sequential execution.
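A query of this kind can be examined with EXPLAIN. The sketch below assumes the bookings table from the course's demo database. Whether the planner actually chooses a parallel plan depends on the table size and the current settings.

```sql
-- Assumes the bookings table from the course's demo database.
-- If a parallel plan is chosen, look for the nodes shown on the slide:
-- Finalize Aggregate, Gather, Partial Aggregate and Parallel Seq Scan.
EXPLAIN (costs off)
SELECT count(*) FROM bookings;
```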
Number of worker processes
Zero (no parallel plan generated)
If the table size is less than min_parallel_table_scan_size = 8MB
Fixed
If the 'parallel_workers' storage parameter is set for the table
Calculated using the formula
1 + ⌊log₃(table size / min_parallel_table_scan_size)⌋
At most max_parallel_workers_per_gather
How many worker processes will be used?
The planner will not consider parallel scans if the table's physical size is
below the min_parallel_table_scan_size parameter.
The number of planned worker processes is calculated using the formula
shown on this slide. In effect, each time the table size triples, one more
process is added. For example, for the default value of
min_parallel_table_scan_size = 8MB:
table size  processes    table size  processes
8MB         1            216MB       4
24MB        2            648MB       5
72MB        3            1.9GB       6
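The formula can be checked with a small query. This is only a sketch of the arithmetic, not the planner's code; the table sizes are made-up sample values, and the threshold is hard-coded to the default 8MB.

```sql
-- Planned workers = 1 + floor(log3(table size / min_parallel_table_scan_size)).
-- The sizes are hypothetical sample values in MB; 8.0 is the default threshold in MB.
SELECT size_mb,
       1 + floor(log(3, size_mb / 8.0))::int AS planned_workers
FROM (VALUES (10), (100), (1000)) AS t(size_mb);
-- Yields 1, 3 and 5 workers, before the max_parallel_workers_per_gather cap.
```

These results agree with the table: 10MB falls between 8MB and 24MB, 100MB between 72MB and 216MB, and 1000MB between 648MB and 1.9GB.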
The number of processes can be explicitly set via the table's storage
parameter parallel_workers.
However, the number of processes is capped at the value of the
max_parallel_workers_per_gather parameter. If fewer processes are available
during query execution than were planned, only the available ones are used
(down to fully sequential execution if the pool is exhausted).
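Setting the storage parameter might look as follows. The sketch assumes the bookings table from the course's demo database, and the value 4 is arbitrary.

```sql
-- Assumes the bookings table from the course's demo database;
-- the value 4 is an arbitrary example.
ALTER TABLE bookings SET (parallel_workers = 4);

-- Check the storage parameter.
SELECT reloptions FROM pg_class WHERE relname = 'bookings';

-- Remove the setting so that the formula applies again.
ALTER TABLE bookings RESET (parallel_workers);
```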
Not parallelized
Write queries
as well as queries with row-level locking
Cursors
including queries in a PL/pgSQL FOR loop
Queries that use PARALLEL UNSAFE functions
Queries within functions that are called from a parallelized query
Not every query can be parallelized.
Queries that modify or lock data—such as UPDATE, DELETE, and SELECT
FOR UPDATE—cannot be parallelized.
Queries whose execution can be paused are not parallelized. This includes
queries in cursors, such as queries in PL/pgSQL FOR loops.
Queries cannot be parallelized if they include functions marked as
PARALLEL UNSAFE (parallelism labels are discussed in the "Functions"
topic).
Queries within functions invoked by a parallelized query cannot be
parallelized (to prevent the number of processes from growing recursively).
Future PostgreSQL versions may remove some of these limitations.
Only executed sequentially
Reading the results of common table expressions (CTE)
Reading the results of non-expandable subqueries
Accessing temporary tables
Calls to PARALLEL RESTRICTED functions
Functions that utilize nested transactions
In general, the benefit of parallel planning depends mostly on how much of
the plan is parallel-compatible. However, certain operations do not impede
parallel execution but can only be executed sequentially in the main
process.
These include:
reading the results of common table expressions (subqueries in the
WITH clause);
reading the results of other non-expandable subqueries (which appear
in the plan as nodes such as SubPlan);
accessing temporary tables (as they are accessible only to the
backend process that owns them);
calling functions labeled as PARALLEL RESTRICTED.
If a query invokes a function that uses subtransactions (such as a PL/pgSQL
function with exception handling), it will result in an error. Such functions
should be labeled as PARALLEL RESTRICTED. For more information on
parallelism annotations, see the "Functions" topic.
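As an illustration, a function with exception handling could be labeled like this. The function name and body are hypothetical and serve only to show the syntax.

```sql
-- Hypothetical function: the EXCEPTION block makes PL/pgSQL use a
-- subtransaction, so the function is labeled PARALLEL RESTRICTED.
CREATE FUNCTION safe_ratio(a numeric, b numeric) RETURNS numeric
LANGUAGE plpgsql PARALLEL RESTRICTED
AS $$
BEGIN
    RETURN a / b;
EXCEPTION
    WHEN division_by_zero THEN
        RETURN NULL;
END;
$$;

-- proparallel is 's' (safe), 'r' (restricted) or 'u' (unsafe).
SELECT proname, proparallel FROM pg_proc WHERE proname = 'safe_ratio';
```

A function created without an explicit label is PARALLEL UNSAFE by default. The label of an existing function can be changed with ALTER FUNCTION.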
Parallel Index Scan
Parallel Index Scan
Parallel Index-Only Scan
Parallel Bitmap Scan
Number of worker processes
Parallel Index Scan
[Figure: a B-tree index built over the table pages. A single process descends from the root to the leaf page.]
Index access can also be performed in parallel. This occurs in two steps.
First, the main process traverses from the tree root to the leaf page.
Then, worker processes perform parallel reads of the index's leaf pages
while traversing the list.
The process that read the index page also reads the required table pages.
This could result in multiple processes reading the same table page (as
shown on the slide: the last table page contains rows that are referenced by
multiple index pages read by different processes). Of course, the page will
be stored in the buffer cache as a single instance.
Parallel Index Only Scan
[Figure: Index Only Scan. If a table page is marked in the visibility map, no check is required; if it is not, the table must be checked.]
An Index Only Scan can also be performed in parallel. This works the same
way as a regular index scan: the leader process descends from the root
to the leaf page, and then worker processes scan the index's leaf pages in
parallel, accessing the relevant table pages as needed to check
visibility.
Parallel Bitmap Heap Scan
A Bitmap Scan can also run in parallel.
The first stage—index scan and bitmap construction—is always executed
sequentially by the leader process.
The second stage—table scan—is executed in parallel by worker processes.
This works similarly to parallel sequential scanning.
Number of worker processes
Zero (no parallel plan generated)
If the amount of index data to be read is less than min_parallel_index_scan_size = 512kB
Fixed
If the 'parallel_workers' storage parameter is set for the table
Calculated using the formula
1 + ⌊log₃(index data to be read / min_parallel_index_scan_size)⌋
At most max_parallel_workers_per_gather
The number of worker processes is determined in the same way as for a
sequential scan. The data volume expected to be read from the index
(determined by the number of index pages) is compared to the value of the
min_parallel_index_scan_size parameter (default 512kB).
In the case of a sequential table scan, the data volume is determined by the
size of the entire table. However, with index access, the planner needs to
estimate how many index pages will be read. The details of how this works
are covered in the "Basic Statistics" topic.
If the amount of data to be read is too small, the optimizer won't consider a
parallel plan. For example, accessing a single value will never be
parallelized, as there is nothing to parallelize in this scenario.
If the amount of data is large enough, the number of worker processes is
calculated using the formula, unless it is explicitly set in the
parallel_workers storage parameter of the table (not the index).
The number of processes will not exceed the value of the
max_parallel_workers_per_gather parameter.
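The same arithmetic can be sketched for index access, this time reading the threshold from the current session. The data volumes are hypothetical; the real planner works from its own estimate of the index pages to be read, which this query does not reproduce.

```sql
-- Hypothetical volumes of index data to be read, in bytes (1MB, 10MB, 100MB).
-- The threshold comes from the current setting; the query fails with a
-- division by zero if min_parallel_index_scan_size is set to 0.
SELECT pg_size_pretty(bytes) AS index_data,
       1 + floor(log(3, bytes::numeric
                        / pg_size_bytes(current_setting('min_parallel_index_scan_size'))))::int
           AS planned_workers
FROM (VALUES (1048576::bigint), (10485760), (104857600)) AS t(bytes);
```

With the default threshold of 512kB, this gives 1, 3 and 5 workers, before the max_parallel_workers_per_gather cap is applied.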
Takeaways
Parallel operations utilize CPU resources across multiple worker
processes.
PostgreSQL draws worker processes from a shared pool.
All access methods support parallel execution.
Practice
Execute a query that calculates the total number of bookings using
different planner settings and compare the execution plans:
1. Using default settings
2. Disabling sequential scans
3. Also disabling index-only scan
4. Enabling all access methods while disabling parallelism
1. This refers to the following query:
EXPLAIN (costs off)
SELECT count(book_ref) FROM bookings;
2. Disable the enable_seqscan parameter.
3. Disable the enable_indexonlyscan parameter.
4. Reset the parameters to their default values and set
max_parallel_workers_per_gather to 0.
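One possible way to run the four steps in a single session is sketched below. It assumes the bookings table from the course's demo database. The resulting plans are not shown here, since they depend on the data and the server configuration.

```sql
-- 1. Default settings.
EXPLAIN (costs off) SELECT count(book_ref) FROM bookings;

-- 2. Disable sequential scans.
SET enable_seqscan = off;
EXPLAIN (costs off) SELECT count(book_ref) FROM bookings;

-- 3. Also disable index-only scans.
SET enable_indexonlyscan = off;
EXPLAIN (costs off) SELECT count(book_ref) FROM bookings;

-- 4. Re-enable all access methods, then disable parallelism.
RESET enable_seqscan;
RESET enable_indexonlyscan;
SET max_parallel_workers_per_gather = 0;
EXPLAIN (costs off) SELECT count(book_ref) FROM bookings;

-- Clean up.
RESET max_parallel_workers_per_gather;
```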