Query Optimization. Profiling
16
Copyright
© Postgres Professional, 2019–2024
Authors: Egor Rogov, Pavel Luzanov, Ilya Bashtanov
Photo by: Oleg Bartunov (Phu monastery, Bhrikuti summit, Nepal)
Use of course materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
edu@postgrespro.ru
Disclaimer
In no event shall Postgres Professional company be liable for any damages
or loss, including loss of profits, that arise from direct or indirect, special or
incidental use of course materials. Postgres Professional company
specifically disclaims any warranties on course materials. Course materials
are provided “as is,” and Postgres Professional company has no obligations
to provide maintenance, support, updates, enhancements, or modifications.
2
Topics
Profiling as a tool for identifying bottlenecks
Selecting a Subtask for Profiling
Profiling Tools
3
Tool
Profiling
Selecting Subtasks
Duration
Execution count
What Should Be Optimized?
The larger the subtask's portion of the total execution time, the higher the
potential gain.
One must consider the costs of optimization.
It's beneficial to approach the task from a broader perspective.
In previous sections, we've explored how queries work, the components of
an execution plan, and the factors that influence the selection of a specific
plan. This is the most challenging and crucial part. Once you understand the
mechanisms, you can analyze any situation that arises using logic and
common sense to assess whether the query is executing efficiently and
identify ways to improve performance.
But how do you find the query that's worth optimizing?
Generally, addressing any optimization task (not limited to DBMS contexts)
begins with profiling, although this term is not always used explicitly. We
should break down the task causing issues into subtasks and measure how
much of the total time they take. It's also useful to know how often each
subtask is executed.
The larger the share of the subtask in the overall runtime, the more
significant the performance improvement from optimizing it. In practice, you
also have to account for the expected costs of optimization—getting the
potential benefit can be challenging.
There are cases where a subtask executes quickly but frequently (this is
common, for example, in queries generated by ORMs). It might be
impossible to speed up the individual queries, but it is worth asking whether
the subtask really needs to run so often. Such a question might lead to
architectural changes, which, while challenging, can ultimately yield
substantial benefits.
4
What to Profile
System activity in its entirety
Beneficial for the administrator to identify resource-intensive tasks
Monitoring and the pg_profile extension
A specific task drawing criticism
Helpful in addressing a specific issue
The more precise, the better: broader coverage blurs the picture.
An overall activity profile can provide valuable insights to a database
administrator who is not focused on addressing a specific issue but rather
identifying the most resource-intensive tasks, the optimization of which could
significantly reduce system load.
The monitoring system should display such a profile.
Another option is to use the pg_profile extension. It is built upon
PostgreSQL's statistics views, with the data stored in a repository of activity
snapshots. By analyzing and comparing snapshots, you can identify issues
and their underlying causes. The extension was created by Andrei Zubkov.
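A minimal usage sketch, assuming pg_profile is already installed and configured with its defaults (function names follow the extension's documentation; the sample numbers are illustrative):

```sql
-- Take a snapshot of the statistics views (usually scheduled, e.g. via cron)
SELECT take_sample();

-- List the accumulated snapshots
SELECT * FROM show_samples();

-- Build an HTML report comparing two snapshots, e.g. samples 1 and 2
SELECT get_report(1, 2);
```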
Explaining in detail how to build a monitoring system, which tools to use, and
which metrics to collect is outside the scope of this course, but we can
recommend Alexey Lesovskiy's book "PostgreSQL Monitoring".
To solve a specific problem, you need to build a profile that targets only the
actions necessary to reproduce this issue. For instance, if a user reports that
"the window opens for a full minute," there's no point in analyzing all
database activity during that period—those metrics will include actions
unrelated to the window opening.
5
What to Measure
Execution Time
It makes sense for the user
Highly unstable metric
Page I/O
Unaffected by external factors
Means little to users
What units should be used to measure resources? The most meaningful
metric for the end user is response time—the time between pressing a
button and receiving a result.
However, from a technical perspective, time isn't always the most practical
metric to focus on. It is heavily influenced by numerous external factors,
such as cache utilization and current server load. If the issue is analyzed on
a test server with different characteristics instead of the main server, it
introduces variations in hardware, configurations, and workload profiles.
In this regard, considering I/O—that is, the number of read and written
pages—might be more practical. This metric is more stable. It is not affected
by hardware specifications or server load, so it will produce consistent
results across different servers with identical data and configuration settings.
Therefore, it's crucial to diagnose issues using the full dataset. For example,
you can use Database Lab Engine, a tool developed by Postgres.ai for
quickly creating thin clones: https://github.com/postgres-ai/database-lab-engine
I/O typically provides a good indication of the amount of work needed to
execute a query, since most of the time is spent reading and processing
data pages, although this metric means little to the end user.
6
Application Profile
Subtasks
client-side
application server
database server (the problem is often, but not always, here)
network
How to Profile
It's technically challenging and requires various monitoring tools.
It's usually not difficult to confirm the assumption.
Response time is meaningful to the user. This means that, generally
speaking, the profile should encompass not just the DBMS, but also the
client side, the application server, and network data transfer.
Performance problems often stem from the DBMS, as a single inadequately
designed query plan can increase the time by orders of magnitude. But
that's not always true. The issue could be due to a slow client-server
connection, the client application taking too long to process the received
data, and other factors.
Unfortunately, obtaining such a comprehensive profile is quite challenging.
To achieve it, every component of the information system needs to be
equipped with monitoring and tracing subsystems that account for that
system's specific characteristics. However, verifying the assumption is
typically straightforward: measure the overall response time (even with a
stopwatch), compare it to the DBMS's total runtime, and check that there are
no significant network delays. Without this, it's entirely possible we'll end up
searching where the light is better, not where the keys were actually lost.
The whole point of profiling is to identify what needs optimization.
We will proceed under the assumption that the problem lies specifically with
the DBMS.
7
PL/pgSQL Profiling
PL Profiler Extension
Third-party extension
Designed exclusively for PL/pgSQL functions
Profiling a separate script or session
Performance report in HTML format, including a flame graph
The plpgsql_check extension
Third-party extension
Validates PL/pgSQL and embedded SQL
Enables detection of compilation errors
Identifying dependent objects within functions
Support for automatic function profiling
When a user performs an action, multiple queries are typically executed.
How do you identify the query that's worth optimizing? This requires a profile
that breaks down to the query level.
However, the server-side component of the application, which runs within
the DBMS, isn't limited to SQL queries—it can also include procedural code.
If the code is written in PL/pgSQL, you can use the external PL Profiler
extension (primary developer: Jan Wieck).
The extension allows profiling of separately executed scripts and of running
sessions. Its report also includes a call graph in the form of a flame graph.
We can also recommend the plpgsql_check extension (primary developer:
Pavel Stěhule): https://github.com/okbob/plpgsql_check. This extension
validates SQL identifiers used in PL/pgSQL code, identifies performance
issues, and includes a built-in code profiler.
As the course focuses on SQL query optimization, we won't be covering
these extensions, but we mention them to provide a complete picture.
8
Query Statistics
Server message log
Enabled by configuration parameters:
log_min_duration_statement = 0 — duration and text of all queries
log_line_prefix — identifying details
Enabling it for a specific session can be challenging
Large log volume; increasing the time threshold leads to loss of information
Nested queries are not logged (the auto_explain extension can be used)
Analysis using external tools, such as pgBadger
PostgreSQL includes two primary built-in tools for profiling executed SQL
queries: the server log and statistics.
In the log file, you can enable the logging of query information and their
execution time using configuration parameters. The parameter
log_min_duration_statement is typically used for this, though others exist.
How can you identify the queries in the log that correspond to user actions? It
would be useful to enable or disable the parameter for a specific session,
but there are no built-in tools for this. You can filter the overall message
stream to isolate the relevant entries; this is conveniently achieved by
configuring the log_line_prefix parameter to include additional identifying
information. Connection pooling adds even more complexity to the situation.
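The setup described above might look like this in postgresql.conf (the values are illustrative; %m, %p, %u, %d and %a are standard log_line_prefix escapes for timestamp, process ID, user, database and application name):

```ini
# Log the duration and text of every completed statement
log_min_duration_statement = 0

# Prefix each log line with identifying details
log_line_prefix = '%m [%p] user=%u db=%d app=%a '
```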
To analyze nested queries (when a query invokes a function that itself runs
queries), the auto_explain module is required.
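For instance, auto_explain can be enabled for the current session only (this requires appropriate privileges; the parameter names are from the module's documentation):

```sql
LOAD 'auto_explain';                          -- load the module into this session
SET auto_explain.log_min_duration = 0;        -- log the plan of every statement
SET auto_explain.log_nested_statements = on;  -- include queries inside functions
SET auto_explain.log_analyze = on;            -- actual row counts and timings
```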
The next challenge is analyzing the log itself. This calls for external tools,
with pgBadger being the de facto standard.
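A typical invocation might look like this (the log path is illustrative and depends on your installation):

```
# Parse the server log and produce an HTML report
pgbadger /var/log/postgresql/postgresql-16-main.log -o report.html
```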
Of course, you can include your own messages in the log if they are helpful.
9
Query Statistics
The pg_stat_statements extension
Detailed query information in the view (including nested queries)
storage size is limited
Queries are considered the same "except for constants", even if they have
different execution plans
Identification is limited to the user name and database (not the session)
Unified Query Identifier (compute_query_id = auto)
The second approach involves using statistics, specifically the
pg_stat_statements extension.
The extension gathers detailed information about executed queries
(including page I/O counts) and presents it in the pg_stat_statements view.
Since the number of distinct queries can be very high, the storage is
constrained by a configuration parameter, retaining only the most frequently
executed queries.
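For instance, the most expensive queries by total execution time can be listed like this (column names are as of PostgreSQL 13; earlier versions use total_time instead of total_exec_time):

```sql
SELECT queryid,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 1)  AS mean_ms,
       shared_blks_hit + shared_blks_read AS pages,
       left(query, 60)                    AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```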
In this case, queries are considered identical if they have the same parse
tree (up to constants). Keep in mind that these queries may have different
execution plans and varying run times.
Unfortunately, there are challenges in identifying queries: they can be
associated with a specific user and database, but not with a session.
When the compute_query_id configuration parameter is set to auto or on,
the PostgreSQL core generates a unique query identifier. It can be written to
the log via the log_line_prefix parameter, and also used to correlate data
from the core (the pg_stat_activity.query_id column), pg_stat_statements,
and other extensions.
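A sketch of such a correlation, assuming PostgreSQL 14 or later (where pg_stat_activity exposes the query_id column):

```sql
SET compute_query_id = on;

-- Match currently running backends with their accumulated statistics
SELECT a.pid, a.query_id, s.calls, s.mean_exec_time
FROM pg_stat_activity a
JOIN pg_stat_statements s ON s.queryid = a.query_id;
```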
11
Single Query Profile
EXPLAIN ANALYZE
subtasks correspond to plan nodes
Duration – actual time or page I/O – buffers
Execution count – loops
Features
Besides the most resource-intensive nodes, optimization candidates are those
with a large cardinality estimation error.
Any change may result in a complete overhaul of the execution plan
Sometimes you have to settle for a basic EXPLAIN
Either way, we identify the query to optimize from among those executed.
But how do you work with the query itself? The EXPLAIN ANALYZE
command provides a detailed execution profile.
The subtasks of this profile are the plan's nodes (the plan isn't a flat list but a
tree). A node's execution duration and repetition count are shown by the
"actual time" and "loops" values in the plan output. With the buffers option,
you can also get the I/O volume (and the time spent on I/O operations if the
track_io_timing parameter is enabled).
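A minimal example against a system catalog, so it runs on any database:

```sql
-- Per-node profile: actual time, rows and loops, plus buffer hits and reads
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM pg_class WHERE relkind = 'r';
```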
The execution plan also contains critical information about the optimizer's
expected cardinality for each step. Typically, if there's no significant error in
the cardinality estimate, the plan will be appropriate (if not, you should adjust
the global settings). Therefore, you should focus not only on the most
resource-intensive nodes but also on those with a substantial (an order of
magnitude or more) difference between the estimated "rows" and actual
"rows". Examine the most deeply nested problematic node, as the error will
propagate upward through the tree from there.
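A hypothetical plan fragment illustrating what to look for: the planner expected one row, but the scan actually returned thousands, and the error propagates to the join above it (all numbers are invented for illustration):

```
Nested Loop  (cost=0.43..24.50 rows=1 width=8)
             (actual time=0.050..310.200 rows=8000 loops=1)
  ->  Index Scan using t1_pkey on t1
             (cost=0.43..8.45 rows=1 width=8)
             (actual time=0.030..5.100 rows=8000 loops=1)
```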
There are cases where a query runs so long that EXPLAIN ANALYZE can't
be executed. In such cases, you'll have to make do with a basic EXPLAIN
and try to determine the cause of inefficiency without complete information.
Working with large execution plans in text format isn't always convenient.
For better clarity, consider using third-party plan visualization tools.
13
Takeaways
Profiling is used to identify queries needing optimization.
The available profiling tools depend on the specific task
server message log, pg_stat_statements
EXPLAIN ANALYZE
14
Practice
1. Run the first version of the report displayed during the
demonstration, and ensure that query text and execution time are
recorded in the log file.
2. Check what information was recorded in the log file.
3. Repeat the previous steps after enabling the auto_explain
extension with nested query output.