Query Optimization: Approaches to Configuration
Copyright
© Postgres Professional, 2019–2024
Authors: Egor Rogov, Pavel Luzanov, Ilya Bashtanov
Photo by: Oleg Bartunov (Phu monastery, Bhrikuti summit, Nepal)
Use of course materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
edu@postgrespro.ru
Disclaimer
In no event shall Postgres Professional company be liable for any damages
or loss, including loss of profits, that arise from direct or indirect, special or
incidental use of course materials. Postgres Professional company
specifically disclaims any warranties on course materials. Course materials
are provided “as is,” and Postgres Professional company has no obligations
to provide maintenance, support, updates, enhancements, or modifications.
Topics
What Should Be Configured?
Server Configuration
Application Configuration
Queries
What Should Be Configured?
Hardware or a virtual environment
Operating System
Database management system: data, planner
Queries
Application
The performance of an information system is ensured at different levels.
We do not consider two important levels: hardware (which can be
virtualized, thereby introducing an additional layer to the architecture) and
the operating system. Specifically, the database management system
(DBMS) interacts with the file system, and the efficiency and reliability of I/O
operations rely on the disk subsystem. First and foremost, we need to
ensure the necessary resources and configure the OS to maximize their
utilization.
The database server is configured to ensure that administrative tasks don't
consume more resources than necessary and that most queries run
efficiently. At this level, we can not only set configuration parameters
(a small portion of which will be discussed) but also manage data
placement. Our goal is to ensure effective resource allocation and proper
planner operation.
The next step is configuring individual queries. In the "Profiling" section, we
discussed how developers typically focus on optimizing queries they are
currently writing, while the administrator focuses on optimizing queries that
have the greatest impact on the server as a whole. In this section, we'll
explore some query optimization techniques.
However, the queries that make up the workload are initiated by the
application. Therefore, application configuration is just as important as
database management system configuration.
Server Configuration
Resource usage
Physical Data Placement
Optimizer Statistics and Configuration
Resources: CPU and Input/Output
Background Worker Processes
max_parallel_workers_per_gather = 2
max_parallel_workers = 8
max_worker_processes = 8
Input/Output
effective_io_concurrency = 1
maintenance_io_concurrency = 10
We'll begin with the settings that determine which resources the DBMS can
utilize.
PostgreSQL can utilize multiple CPU cores for parallel processing. This
involves launching additional background worker processes. The detailed
discussion of parallel query execution can be found in the "Parallel Access"
section, while the use of background processes for application development
is covered in the "Background Processes" section of the DEV2 course.
Among the input/output settings, let's highlight the effective_io_concurrency
parameter. It tells the system how many individual disks make up the disk
array. In practice, this parameter only affects the number of pages
prefetched into the cache during a bitmap scan.
A similar parameter used for some maintenance operations is referred to as
maintenance_io_concurrency.
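As an illustration, these parameters might be raised on a server with many CPU cores and SSD storage. The values below are assumptions for the sake of the sketch, not recommendations:

```sql
-- Illustrative values; tune to the actual hardware.
ALTER SYSTEM SET max_worker_processes = 16;            -- total worker pool (restart required)
ALTER SYSTEM SET max_parallel_workers = 16;            -- of these, how many may serve parallel queries
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;  -- workers per Gather node
ALTER SYSTEM SET effective_io_concurrency = 200;       -- SSD arrays handle many concurrent requests
SELECT pg_reload_conf();                               -- applies the reloadable settings
```

Note that max_worker_processes takes effect only after a server restart, while the others can be changed on the fly.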
Memory Resources
Memory
work_mem
maintenance_work_mem
shared_buffers
effective_cache_size = 4GB
Several settings affect memory allocation and usage.
The work_mem parameter determines the amount of memory available for
specific operations (discussed in related sections).
It also influences the selection of the plan. For example, with smaller values,
sorting is preferred as it performs better with limited memory compared to
hashing. Therefore, for nodes using hashing, the working memory size can
be increased by adjusting the hash_mem_multiplier parameter.
The maintenance_work_mem parameter impacts the index building speed
and the operation of maintenance processes.
The shared_buffers parameter determines the size of the buffer cache for
the instance. The configuration of this parameter is discussed in the DBA2
course, but it's clear that the default value is very small.
The effective_cache_size parameter tells PostgreSQL the total amount of
cacheable data, including both the buffer cache and the OS cache. The
higher its value, the more index access is preferred. This parameter does
not impact the actual memory allocation.
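A sketch of session-level settings reflecting these points (the values are illustrative assumptions):

```sql
SET work_mem = '64MB';              -- memory per sort/hash operation, per process
SET hash_mem_multiplier = 2.0;      -- hash nodes may use work_mem * 2
SET effective_cache_size = '12GB';  -- planner estimate only; no memory is allocated
```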
Physical Location
Tablespaces
Distributing data across multiple physical devices
ALTER TABLESPACE ... SET
Partitioning
Partitioning a table into independently manageable parts to simplify
administration and improve access speed
Sharding
Distributing partitions across multiple servers to scale read and write
workloads
Partitioning plus the postgres_fdw extension, or third-party solutions
The physical organization of data can greatly impact performance.
Tablespaces allow you to control the placement of objects on physical I/O
devices. For example, store frequently accessed data on SSDs and archival
data on slower HDDs.
Some server parameters that depend on storage device characteristics
(such as random_page_cost) can be set at the tablespace level.
Partitioning enables efficient handling of very large data volumes. The main
performance benefit lies in replacing a full table scan with a scan of an
individual partition. Note that partitions can also be stored in different
tablespaces.
Another approach is to place partitions on different servers (sharding),
enabling distributed queries. Standard PostgreSQL includes only the core
features required for sharding: partitioning and the postgres_fdw extension.
Sharding can be more effectively and fully implemented using external
solutions. This topic is covered in the final section of the DBA3 course.
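For example, a tablespace on fast storage can carry its own cost settings, and a hot partition can be placed in it. The paths and names below are hypothetical:

```sql
CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pgdata';
ALTER TABLESPACE fast_ssd SET (random_page_cost = 1.1);

-- A range-partitioned table whose current partition lives on the SSD tablespace.
CREATE TABLE events (id bigint, created_at timestamptz NOT NULL)
    PARTITION BY RANGE (created_at);
CREATE TABLE events_2024 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01')
    TABLESPACE fast_ssd;
```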
Optimizer: Statistics
Relevance
Autovacuum and Autoanalyze Settings
autovacuum_max_workers, autovacuum_analyze_scale_factor, …
Accuracy
default_statistics_target = 100
Expression Index, Extended Statistics
The planner relies on statistics for its estimates, so frequent collection is
essential. This is accomplished by adjusting the autovacuum and
autoanalyze settings. The settings are discussed in detail in the DBA2
course.
Additionally, the statistics should be accurate enough. Absolute accuracy
cannot be achieved (and isn't necessary), but errors should not result in
incorrect execution plans. A sign of outdated or inaccurate statistics is a
significant discrepancy between the expected and actual row counts in the
leaf-level nodes of the execution plan.
To enhance accuracy, you might need to adjust the default_statistics_target
setting, either globally or for individual table columns. An expression index
with its own statistics can sometimes be useful. In certain situations,
extended statistics can be useful.
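As a sketch, statistics accuracy could be improved like this (the column choices are hypothetical examples, not taken from the course exercises):

```sql
-- More detailed histogram and most-common-values list for one column.
ALTER TABLE flights ALTER COLUMN departure_airport SET STATISTICS 1000;

-- Extended statistics for columns that may be functionally dependent.
CREATE STATISTICS flights_airports (dependencies)
    ON departure_airport, arrival_airport FROM flights;

ANALYZE flights;  -- new settings take effect only after analysis
```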
Optimizer: Cost
Input/Output
seq_page_cost
random_page_cost
CPU Time
cpu_tuple_cost, cpu_index_tuple_cost, cpu_operator_cost
User function cost
↓ for SSD
There are numerous cost parameters for basic operations, from which, as
we've already seen, the query plan's cost is ultimately determined. It's
advisable to adjust these settings if queries, despite accurate cardinality
estimation, are not performing efficiently.
The seq_page_cost and random_page_cost parameters relate to I/O and
determine the relative cost of reading a page during sequential or random
access.
The seq_page_cost parameter is set to one and shouldn't be modified. A
high value for the random_page_cost parameter reflects the realities of HDD
disk operations. For SSD drives (as well as when all data is highly likely to
be cached), this parameter should be significantly reduced, for example, to
1.1.
CPU time parameters determine the weights used to account for the cost of
processing retrieved data. Usually, these parameters aren't changed
because it's hard to predict the impact of changes.
But, as discussed in the "Functions" section, there are cases where defining
the cost of a user-defined function in terms of cpu_operator_cost can be
beneficial.
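For instance, on SSD storage the random access cost can be lowered at the session level to see whether the plan changes (the airport code here is just an example value):

```sql
EXPLAIN SELECT * FROM flights WHERE departure_airport = 'DME';
-- The planner may prefer a sequential scan with the default settings.

SET random_page_cost = 1.1;  -- the default of 4.0 reflects HDD behavior
EXPLAIN SELECT * FROM flights WHERE departure_airport = 'DME';
-- With the lower value, index access becomes relatively cheaper.
```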
Application Configuration
Clients and Connections
Data schema
Clients and Connections
Multiple Sessions
connection pool
Cursors
when only part of the result set is needed
cursor_tuple_fraction
Multiple short queries
prepared statements
Moving logic to SQL
If an application opens too many connections or establishes them too
frequently for short durations, it may be necessary to implement connection
pooling between the application and the DBMS. However, the pool places
limitations on the application. This question is covered in detail in the DEV2
course.
Cursors are useful when only part of the result set is needed, and its size
depends on user actions. In this case, the cursor_tuple_fraction parameter
guides the planner to optimize fetching the initial rows of the
result set.
If the application executes too many small queries (each of which is efficient
individually), the overall performance will be poor. This is commonly
observed when using object-relational mapping (ORM) tools. In the
"Profiling" topic, we discussed a similar example involving a function that
was executed in a loop.
In such cases, the DBMS has virtually no optimization tools (except for
prepared statements when queries are repeated). A proven approach is to
eliminate procedural code in the application and move it to the database
server as a small set of large SQL commands. This enables the query
planner to utilize more efficient access and join methods, while reducing the
frequent data transfers between the client and server.
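For repeated short queries, prepared statements at least remove the parsing and planning overhead. A sketch against the demo schema (the ticket number is a made-up value):

```sql
PREPARE ticket_contacts(text) AS
    SELECT t.passenger_name, t.contact_data
    FROM tickets t
    WHERE t.ticket_no = $1;

EXECUTE ticket_contacts('0005432000987');
```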
Data schema
Normalization - removal of data redundancy
Simplifies queries and consistency checks
Denormalization - introducing redundancy
may enhance performance, but requires synchronization
indexes
precomputed fields (such as generated columns or triggers)
materialized views
Caching application results
At the logical level, a database should be normalized—informally, this
means eliminating redundancy in the data. If that's not the case, we're
dealing with a design flaw: data consistency checks will be challenging, and
various anomalies can occur during data changes, and so on.
While data duplication at the storage level can significantly improve
performance, this comes at the cost of maintaining synchronization between
redundant data and primary data.
The most common method of denormalization is indexes—though they're
typically not considered in this context. Indexes are automatically updated.
You can duplicate some data (or calculated results based on this data) in
table columns. You can use generated columns or keep the data in sync with
triggers. Regardless of the approach, the database is responsible for
denormalization.
Another example is materialized views. They must also be updated, for
example, on a schedule or through other methods. This topic was thoroughly
covered in the "Materialization" section.
Data can also be duplicated at the application level by caching query results.
This is a common approach, but it's often used due to the application's
improper interaction with the database (e.g., when using ORM). The
application is responsible for timely cache updates, ensuring access controls
for cache data, and other related tasks.
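Two sketches of database-side denormalization: a generated column kept in sync automatically, and a materialized view refreshed explicitly. The column, view names, and the fixed currency rate are hypothetical:

```sql
-- Precomputed field: the server maintains it on every change to amount.
ALTER TABLE ticket_flights
    ADD COLUMN amount_usd numeric
    GENERATED ALWAYS AS (amount / 90.0) STORED;

-- Precomputed aggregate: must be refreshed explicitly, e.g. on a schedule.
CREATE MATERIALIZED VIEW airport_stats AS
    SELECT departure_airport, count(*) AS flights_total
    FROM flights
    GROUP BY departure_airport;

REFRESH MATERIALIZED VIEW airport_stats;
```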
Data schema
Data Types
Selecting the right data types
Using composite types, such as arrays and JSON, instead of separate tables.
Data Integrity Constraints
Beyond ensuring data integrity, constraints give the planner extra
information: it may eliminate unnecessary joins, improve selectivity
estimates, and perform other optimizations.
A primary key enforces uniqueness through a unique index.
foreign key
Absence of null values
CHECK constraint (constraint_exclusion)
Choosing the right data types from the wide range of options PostgreSQL
offers is crucial. For instance, representing date intervals using range types
(daterange, tstzrange) instead of two separate columns enables the use of
GiST and SP-GiST indexes for operations like interval intersections.
In certain situations, employing composite types like arrays or JSON instead
of the traditional method of creating a separate table can yield benefits. This
reduces the need for joins and avoids storing extensive metadata in row
version headers. However, such a solution should be approached with
caution, as it has its own drawbacks.
Integrity constraints are important in their own right, but the planner can
sometimes leverage them for optimization.
Primary key and unique constraints are enforced via unique indexes. This
enables more accurate statistics and more efficient join operations.
The presence of a foreign key and NOT NULL constraints allows for
eliminating unnecessary joins (particularly important when using views)
and also improves selectivity estimation when joining on multiple
columns.
A CHECK constraint using the constraint_exclusion parameter can avoid
scanning tables (or partitions) when they are guaranteed to contain no
relevant data.
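A sketch of the range-type approach (the table and data are hypothetical):

```sql
CREATE TABLE maintenance_windows (
    equipment_id integer NOT NULL,
    period tstzrange NOT NULL
);
CREATE INDEX ON maintenance_windows USING gist (period);

-- An overlap search that can use the GiST index:
SELECT * FROM maintenance_windows
WHERE period && tstzrange('2024-06-01', '2024-06-02');
```

With two separate timestamp columns, the same overlap condition would require a pair of comparisons that ordinary B-tree indexes handle poorly.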
Queries
Optimization Strategies
Short queries
Long-running queries
Optimization Strategies
The goal of optimization is to achieve an effective plan.
Addressing Inefficiencies
somehow identify and fix the bottleneck
It's often hard to pinpoint the issue
often leads to a conflict with the planner
Accurate Cardinality Calculation
Ensure accurate cardinality calculation in each node and rely on the planner
If the plan is still inadequate, tune the global parameters
The goal of query optimization is to achieve an adequate execution plan.
There are various ways to achieve this goal.
You can examine the query plan, identify the cause of inefficient execution,
and take steps to address the issue. Unfortunately, the problem isn't always
obvious, and fixing it often ends up being a battle with the optimizer.
If taking this approach, you'd want the option to fully or partially disable the
planner and manually create the execution plan. This capability is referred to
as hints and is not explicitly available in PostgreSQL.
Another approach involves ensuring accurate cardinality estimation at each
node in the execution plan. While accurate statistics are certainly necessary,
they're often not enough.
If we take this approach, we're not battling the planner but helping it make
the right decision. Unfortunately, this often proves to be too complex a task.
If the planner still creates an inefficient plan despite accurate cardinality
estimates, it's time to adjust the global configuration parameters.
It's usually worthwhile to use both approaches, based on the situation and
applying common sense.
Short queries
Read a small amount of data and return few rows.
Typical of OLTP
It's important to get the answer quickly
Key Features of the Plan
high-selectivity conditions
indexes
nested loop joins
Queries can be somewhat arbitrarily divided into two groups, each with
distinct characteristics and requiring different optimization approaches:
"short" and "long".
The first group consists of queries typical of OLTP systems. Such queries
retrieve limited data and return one or more rows. They can access large
tables, but when they do, they employ high-selectivity conditions that allow
only a small portion of the data to be retrieved.
Short queries typically handle the user interface, making it crucial that they
run as fast as possible. Response time is prioritized.
This is achieved by reading the data needed for the queries either from very
small tables (one or two pages) or using an index. Typically, joins are
performed using a nested loop, which doesn't need any setup (unlike hash
joins) and doesn't run into problems when joining a small number of rows.
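A typical short query against the demo schema (the ticket number is a made-up value): the high-selectivity condition on the primary key usually yields index scans joined by a nested loop.

```sql
EXPLAIN
SELECT t.passenger_name, tf.flight_id, tf.amount
FROM tickets t
    JOIN ticket_flights tf ON tf.ticket_no = t.ticket_no
WHERE t.ticket_no = '0005432000987';
-- Typically: a Nested Loop over index scans on the tables' primary keys.
```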
Long-running queries
Read large tables in full
Common to OLAP
It's important to avoid reading the same data multiple times
Key Features of the Plan
low selectivity conditions
full table scan
hash join
aggregation
parallel execution
Long queries are common in OLAP systems. They involve reading a large
amount of data to produce the result. The number of rows returned by the
query is irrelevant: aggregation may reduce the output to a single row.
Such queries often involve reading entire large tables, as their conditions
typically have low selectivity. Therefore, sequential scanning becomes more
efficient than index access, and hash joins replace nested loop joins.
Especially because, for a long query, delivering the entire result within a
reasonable time is more important than getting the first rows as soon as
possible.
It's very important to ensure that the same data isn't read multiple times in
the query. This can occur for various reasons (such as correlated subqueries
or the use of functions, among others), which result in explicit or implicit
nested loops.
Of course, a long query's plan can involve index access (when high-
selectivity conditions are present) and nested loop joins (when joining a
small number of rows).
Since long queries handle large amounts of data and typically aggregate it
(as users rarely need millions of rows), they can benefit from parallel
execution.
Optimizer Hints
Not explicitly present, though there are ways to influence execution,
such as configuration parameters, CTE materialization, and other
techniques.
Third-Party Extensions
pg_hint_plan
Another way to influence the planner, traditional in other DBMS, is
optimizer hints, which are not present in PostgreSQL. This is a deliberate
decision made by the community.
Actually, some hints are still implicitly present in the form of configuration
parameters and other mechanisms.
Additionally, there are third-party extensions, such as pg_hint_plan (by
Kyotaro Horiguchi). Don't forget that using hints that severely restrict the planner's
flexibility can backfire if data distribution changes in the future.
Takeaways
System configuration is handled at multiple levels
There are a variety of ways to influence the query execution
plan
Different query types require different approaches
Nothing can replace a sharp mind and good judgment
Practice
1. Optimize the query that retrieves contact information for passengers
   who bought business class tickets on flights delayed by more than 5
   hours:

   SELECT t.*
   FROM tickets t
     JOIN ticket_flights tf ON tf.ticket_no = t.ticket_no
     JOIN flights f ON f.flight_id = tf.flight_id
   WHERE tf.fare_conditions = 'Business'
     AND f.actual_departure > f.scheduled_departure + interval '5 hour';

2. Optimize the query that computes the average ticket price for flights
   operated by various aircraft types. Begin with the version proposed in
   the comment:

   SELECT a.aircraft_code,
          ( SELECT round(avg(tf.amount))
            FROM flights f
              JOIN ticket_flights tf ON tf.flight_id = f.flight_id
            WHERE f.aircraft_code = a.aircraft_code )
   FROM aircrafts a;