SQL
Functions
12
Copyright
© Postgres Professional, 2017–2021
Authors: Egor Rogov, Pavel Luzanov
Translated by Liudmila Mantrova
Usage of Course Materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed on an unrestricted basis. Commercial use is only possible with prior
written permission of Postgres Professional company. Modification of course
materials is forbidden.
Contact Us
Please send your feedback to: edu@postgrespro.ru
Disclaimer
In no event shall Postgres Professional company be liable for any damages
or loss, including loss of profits, that arise from direct or indirect, special or
incidental use of course materials. Postgres Professional company
specifically disclaims any warranties on course materials. Course materials
are provided “as is,” and Postgres Professional company has no obligations
to provide maintenance, support, updates, enhancements, or modifications.
2
Agenda
Functions and their specifics in the context of databases
Parameters and return values
Passing parameters when calling a function
Volatility categories and query planning
3
Functions in Databases
The main goal is simplifying development tasks
interface (parameters) and implementation (function body)
abstracting from other tasks when implementing a particular function
Traditional languages PostgreSQL
side effects global variables the whole database
(volatility categories)
modules have their own interface namespaces,
and implementation client and server
challenges function call overhead hiding the query
(inlining) from the planner
(inlining, subqueries,
views)
The main goal of introducing functions in programming is simplifying
development tasks by decomposing them into smaller subtasks. Such
simplification is possible because you can abstract from the “big” task when
thinking of a function. For this purpose, the function provides a precise
interface to the outside world (parameters and the return value). Its
implementation (the function body) can change; the caller does not see
these changes and does not depend on them. This ideal situation can be
messed up by the global state (global variables), and you have to keep in
mind that in the DB context the whole database constitutes such a state.
In traditional programming languages, functions are often grouped into
modules (packages, classes for OOP, etc.), which have their own interface
and implementation. This separation into modules can be more or less
arbitrary. In PostgreSQL, there is a fixed boundary between client and
server parts: the server code deals with the database, while the client code
manages transactions. There are no modules (or packages), only
namespaces are present.
The only disadvantage of extensive use of functions in traditional languages
is function call overhead. It is sometimes overcome by inlining function code
into the calling program. In databases, the consequences can be more
serious: if some part of the query is moved into a function, the planner stops
seeing the “big picture” and cannot build a good query plan. In some cases,
PostgreSQL can also perform inlining; alternatively, subqueries or views can
be used.
4
General Information
A database object
function declaration is stored in the system catalog
The structure of function declaration
name
parameters
return data type
body
Can be written in various languages, including SQL
the code is stored as a string literal
a function is interpreted when it is called
Is called in the context of an expression
Functions are regular database objects, just like tables or indexes. Function
declarations are stored in the system catalog; that’s why database functions
are called stored functions.
PostgreSQL provides a lot of standard functions. Some of them are listed in
the “Basic Data Types and Functions” handout.
You can also write your own functions in various programming languages.
The information provided in this lecture applies to functions in any
programming language, but we will use SQL in all examples.
Predictably, a function declaration consists of a name, optional parameters,
a return data type, and a body. What may seem unexpected is that the body
is written as a string literal, which contains the code written in the
programming language of your choice. It makes function declarations look
the same regardless of the used programming language. The body string is
stored in the system catalog and is interpreted each time the function is
called. At the moment, the only way to avoid interpretation is to write a
function in the C language, but it’s required not too often, so we are not
going to discuss it in this course.
A function is always called within the context of an expression. For example,
in the list of expressions of the SELECT statement, in the WHERE clause, in
CHECK constraints, etc.
6
Input and Output
Input values
are defined by parameters with IN or INOUT modes
Output value
is defined either by the RETURNS clause
or by parameters with INOUT or OUT modes
if both forms are specified, they must not contradict each other
Formal parameters that have IN or INOUT modes are considered to be
input parameters. Their actual values must be specified in the function call
(or the default values must be defined).
There are two ways to define the return value:
- use the RETURNS clause to specify the return data type
- define output parameters using INOUT or OUT modes
These two approaches are equivalent. For example, a function with the
RETURNS integer clause and a function with the OUT integer parameter both
return an integer number.
You can combine these two approaches. In this case, the function will also
return one integer number. But note that the types of the output parameters
and the RETURNS clause must not contradict each other.
Thus, you cannot write a function that returns one value, but passes
another value into the OUT parameter (which is allowed in many traditional
programming languages). In PostgreSQL, such functions return both
values.
8
Volatility Categories
Volatile
returns different values for the same input arguments
is used by default
Stable
the value cannot change within a single SQL operator
the function cannot change the database state
Immutable
the value cannot change, the function is deterministic
the function cannot change the database state
Each function is mapped to a particular volatility category, which defines the
properties of the return value for the same input arguments.
The volatile category means that the return value can change randomly.
Such functions will be executed each time they are called. If the function is
declared without a category specification, it is assumed to be volatile.
The stable category is used for functions that always return the same value
within a single SQL operator. In particular, such functions cannot change
the state of the database. It is possible to execute such a function only once
during the query and then use the computed value.
The immutable category is even more strict: the return value always
remains the same. Such a function can be executed at the planning stage,
before the query is actually executed.
It does not mean that it happens so all the time, but the planner has the
right to perform such optimizations. In some (simple) cases, the planner
makes its own assumptions about function volatility, regardless of the
explicitly provided category.
10
Summary
You can create your own function and use it
in the same manner as the built-in ones
Functions can be written in various languages, including SQL
Volatility categories affect optimization opportunities
An SQL function can sometimes be inlined
11
Practice
1. Create the author_name function to construct author names.
The function takes three parameters (last_name, first_name,
and middle_name) and returns the full name, with the middle
name abbreviated to its initial.
Use this function in the authors_v view.
2. Create the book_name function to construct book titles.
The function takes two parameters (book ID and the title) and
returns a concatenation of the book title and the list of authors,
in the seq_num order. The name of each author is produced by
the author_name function.
Use this function in the catalog_v view.
Check the changes in the application.
Just a friendly reminder: all the required functions are listed in the “Basic
Data Types and Functions” handout.
Task 1. FUNCTION author_name(
first_name text, middle_name text, last_name text
)
RETURNS text
For example: author_name('Alexander','Sergeyevich', 'Pushkin')
→ 'Alexander S. Pushkin'
Task 2. FUNCTION book_name(book_id integer, title text)
RETURNS text
For example: book_name(3,'Good Omens') →
→ 'Good Omens. Terry Pratchett, Neil Gaiman'
Stored functions can be edited directly in any tool. For example, psql
provides the \ef command that opens the function body in an editor and
saves the changes in the database.
You should avoid using this capability (or at least do not overuse it).
A properly set up development process requires that all the code is stored
in files under version control. If a function has to be changed, the file is
modified and executed (using psql or an IDE). Function modifications made
directly in the database can be easily lost. (In fact, setting up development
processes is much more complex, but we are not going to cover it in this
course.)
12
Practice
1. Write a function that returns random timestamps,
equally distributed within the specified time range.
The lower bound of the range is set by a timestamptz, while
the upper bound is either a timestamptz or an interval.
2. Sometimes phonewords are used as a mnemonic equivalent of
telephone numbers, like 1-800-TAXICAB. A phoneword is
constructed by replacing some digits with the corresponding
letters, as seen on the keypad (even on modern phones).
Write a function that returns number representation of a
phoneword, no dashes.
3. Write a function that finds square roots of an equation.
In all tasks, make sure to pay attention to volatility categories of the
functions.
Task 2. The correspondence between digits and letters is as follows:
2 – abc, 3 – def, 4 – ghi, 5 – jkl, 6 – mno, 7 – pqrs, 8 – tuv, 9 – wxyz
Note that a phoneword can be written in lowercase as well as in uppercase.
Task 3. For the equation y = ax
2
+ bx + c:
discriminant D = b
2
– 4ac.
if D > 0, then there are two roots x
1,2
= (–b ± √D) / 2a
if D = 0, then there is one root x = –b / 2a (null can be returned as x
2
)
if D < 0, then there are no roots (both roots are null).