SQL

Functions

Authors: Egor Rogov, Pavel Luzanov

Translated by Liudmila Mantrova

Usage of Course Materials

Non-commercial use of course materials (presentations, demonstrations) is

allowed on an unrestricted basis. Commercial use is only possible with prior

written permission of Postgres Professional company. Modification of course

materials is forbidden.

Please send your feedback to: edu@postgrespro.ru

Disclaimer

In no event shall Postgres Professional company be liable for any damages

or loss, including loss of profits, that arise from direct or indirect, special or

incidental use of course materials. Postgres Professional company

specifically disclaims any warranties on course materials. Course materials

are provided “as is,” and Postgres Professional company has no obligations

to provide maintenance, support, updates, enhancements, or modifications.

Agenda

Functions and their specifics in the context of databases

Parameters and return values

Passing parameters when calling a function

Volatility categories and query planning

Functions in Databases

The main goal is simplifying development tasks

interface (parameters) and implementation (function body)

abstracting from other tasks when implementing a particular function

Traditional languages PostgreSQL

side effects global variables the whole database

(volatility categories)

modules have their own interface namespaces,

and implementation client and server

challenges function call overhead hiding the query

(inlining) from the planner

(inlining, subqueries,

views)

The main goal of introducing functions in programming is simplifying

development tasks by decomposing them into smaller subtasks. Such

simplification is possible because you can abstract from the “big” task when

thinking of a function. For this purpose, the function provides a precise

interface to the outside world (parameters and the return value). Its

implementation (the function body) can change; the caller does not see

these changes and does not depend on them. This ideal situation can be

messed up by the global state (global variables), and you have to keep in

mind that in the DB context the whole database constitutes such a state.

In traditional programming languages, functions are often grouped into

modules (packages, classes for OOP, etc.), which have their own interface

and implementation. This separation into modules can be more or less

arbitrary. In PostgreSQL, there is a fixed boundary between client and

server parts: the server code deals with the database, while the client code

manages transactions. There are no modules (or packages), only

namespaces are present.

The only disadvantage of extensive use of functions in traditional languages

is function call overhead. It is sometimes overcome by inlining function code

into the calling program. In databases, the consequences can be more

serious: if some part of the query is moved into a function, the planner stops

seeing the “big picture” and cannot build a good query plan. In some cases,

PostgreSQL can also perform inlining; alternatively, subqueries or views can

be used.

General Information

A database object

function declaration is stored in the system catalog

The structure of function declaration

name

parameters

return data type

body

Can be written in various languages, including SQL

the code is stored as a string literal

a function is interpreted when it is called

Is called in the context of an expression

Functions are regular database objects, just like tables or indexes. Function

declarations are stored in the system catalog; that’s why database functions

are called stored functions.

PostgreSQL provides a lot of standard functions. Some of them are listed in

the “Basic Data Types and Functions” handout.

You can also write your own functions in various programming languages.

The information provided in this lecture applies to functions in any

programming language, but we will use SQL in all examples.

Predictably, a function declaration consists of a name, optional parameters,

a return data type, and a body. What may seem unexpected is that the body

is written as a string literal, which contains the code written in the

programming language of your choice. It makes function declarations look

the same regardless of the used programming language. The body string is

stored in the system catalog and is interpreted each time the function is

called. At the moment, the only way to avoid interpretation is to write a

function in the C language, but it’s required not too often, so we are not

going to discuss it in this course.

A function is always called within the context of an expression. For example,

in the list of expressions of the SELECT statement, in the WHERE clause, in

CHECK constraints, etc.

https://postgrespro.com/docs/postgresql/12/sql-createfunction

https://postgrespro.com/docs/postgresql/12/sql-syntax-calling-funcs

Input and Output

Input values

are defined by parameters with IN or INOUT modes

Output value

is defined either by the RETURNS clause

or by parameters with INOUT or OUT modes

if both forms are specified, they must not contradict each other

Formal parameters that have IN or INOUT modes are considered to be

input parameters. Their actual values must be specified in the function call

(or the default values must be defined).

There are two ways to define the return value:

- use the RETURNS clause to specify the return data type

- define output parameters using INOUT or OUT modes

These two approaches are equivalent. For example, a function with the

RETURNS integer clause and a function with the OUT integer parameter both

return an integer number.

You can combine these two approaches. In this case, the function will also

return one integer number. But note that the types of the output parameters

and the RETURNS clause must not contradict each other.

Thus, you cannot write a function that returns one value, but passes

another value into the OUT parameter (which is allowed in many traditional

programming languages). In PostgreSQL, such functions return both

values.

Volatility Categories

Volatile

returns different values for the same input arguments

is used by default

Stable

the value cannot change within a single SQL operator

the function cannot change the database state

Immutable

the value cannot change, the function is deterministic

the function cannot change the database state

Each function is mapped to a particular volatility category, which defines the

properties of the return value for the same input arguments.

The volatile category means that the return value can change randomly.

Such functions will be executed each time they are called. If the function is

declared without a category specification, it is assumed to be volatile.

The stable category is used for functions that always return the same value

within a single SQL operator. In particular, such functions cannot change

the state of the database. It is possible to execute such a function only once

during the query and then use the computed value.

The immutable category is even more strict: the return value always

remains the same. Such a function can be executed at the planning stage,

before the query is actually executed.

It does not mean that it happens so all the time, but the planner has the

right to perform such optimizations. In some (simple) cases, the planner

makes its own assumptions about function volatility, regardless of the

explicitly provided category.

https://postgrespro.com/docs/postgresql/12/xfunc-volatility

Summary

You can create your own function and use it

in the same manner as the built-in ones

Functions can be written in various languages, including SQL

Volatility categories affect optimization opportunities

An SQL function can sometimes be inlined

Practice

1. Create the author_name function to construct author names.

The function takes three parameters (last_name, first_name,

and middle_name) and returns the full name, with the middle

name abbreviated to its initial.

Use this function in the authors_v view.

2. Create the book_name function to construct book titles.

The function takes two parameters (book ID and the title) and

returns a concatenation of the book title and the list of authors,

in the seq_num order. The name of each author is produced by

the author_name function.

Use this function in the catalog_v view.

Check the changes in the application.

Just a friendly reminder: all the required functions are listed in the “Basic

Data Types and Functions” handout.

Task 1. FUNCTION author_name(

first_name text, middle_name text, last_name text

)

RETURNS text

For example: author_name('Alexander','Sergeyevich', 'Pushkin')

→ 'Alexander S. Pushkin'

Task 2. FUNCTION book_name(book_id integer, title text)

RETURNS text

For example: book_name(3,'Good Omens') →

→ 'Good Omens. Terry Pratchett, Neil Gaiman'

Stored functions can be edited directly in any tool. For example, psql

provides the \ef command that opens the function body in an editor and

saves the changes in the database.

You should avoid using this capability (or at least do not overuse it).

A properly set up development process requires that all the code is stored

in files under version control. If a function has to be changed, the file is

modified and executed (using psql or an IDE). Function modifications made

directly in the database can be easily lost. (In fact, setting up development

processes is much more complex, but we are not going to cover it in this

course.)

Practice

1. Write a function that returns random timestamps,

equally distributed within the specified time range.

The lower bound of the range is set by a timestamptz, while

the upper bound is either a timestamptz or an interval.

2. Sometimes phonewords are used as a mnemonic equivalent of

telephone numbers, like 1-800-TAXICAB. A phoneword is

constructed by replacing some digits with the corresponding

letters, as seen on the keypad (even on modern phones).

Write a function that returns number representation of a

phoneword, no dashes.

3. Write a function that finds square roots of an equation.

In all tasks, make sure to pay attention to volatility categories of the

functions.

Task 2. The correspondence between digits and letters is as follows:

2 – abc, 3 – def, 4 – ghi, 5 – jkl, 6 – mno, 7 – pqrs, 8 – tuv, 9 – wxyz

Note that a phoneword can be written in lowercase as well as in uppercase.

Task 3. For the equation y = ax

+ bx + c:

discriminant D = b

– 4ac.

if D > 0, then there are two roots x

1,2

= (–b ± √D) / 2a

if D = 0, then there is one root x = –b / 2a (null can be returned as x

)

if D < 0, then there are no roots (both roots are null).