Basic tools
Server installation and management
13
Copyright
© Postgres Professional, 2015–2022
Authors: Egor Rogov, Pavel Luzanov, Ilya Bashtanov
Translated by Alexander Meleshko
Use of course materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
Disclaimer
Postgres Professional assumes no responsibility for any damages and
losses, including loss of income, caused by direct or indirect, intentional or
accidental use of course materials. Postgres Professional does not provide
any guarantees for the course materials. Course materials are provided "as
is" and Postgres Professional is not obligated to provide maintenance,
support, updates, extensions and changes.
2
Topics
Basic concepts
Installation from source code
Server management
Package installation
Server management in Ubuntu
Cloud solutions
3
Database cluster and server
PostgreSQL
database
database
cluster
Before talking about installation, let’s review some basic concepts.
PostgreSQL is a program that belongs to the class of database
management systems.
When this program is running, we call it a PostgreSQL server or a
serverinstance. So far, the server seems to be a “black box” for us, but
gradually we will get acquainted with how it works.
The data managed by PostgreSQL is stored in databases. One instance of
PostgreSQL simultaneously manages several databases. This set of
databases is called a database cluster. We will talk more about databases
in the “Data organization. Databases and schemas” module of the course.
So, a database cluster is data in files. A server or a server instance is
aCprogram that manages a database cluster.
4
Source code
Installation from source code
stable server version
you can use non-standard parameters or build the server for a non-standard
architecture
Installation from git
current server version
used primarily by kernel developers; requires a broader set of tools
You should at least have an idea of how to install the program from source
code, so that you could build PostgreSQL with non-standard parameters or
on a non-standard architecture, if necessary.
You can find the version 13.6 source code in gzip and bzip2 at
postgres user home catalog.
You can also get the source code straight from the project’s git repository:
git://git.postgresql.org/git/postgresql.git (there are other mirrors, including
github).
This way, you can build not just the stable version, but also a version at any
specific commit, including the most recent one.
When installing from the git repository, a wider set of tools is required. For
example, the lexer and parser are made using Flex and Bison, and git
stores the source codes for these tools. They generate C files when built,
which are then compiled. The source code archives include the pre-built C
files already.
5
Required software
Hard requirements
tar, gzip/bzip2, GNU make, C compiler (C89)
Used, but can be disabled
GNU Readline library, zlib library
Extras
Perl, Python, and Tcl programming languages
for using PL/Perl, PL/Python, PL/Tcl
Kerberos, OpenSSL, OpenLDAP, PAM — authentication and encryption
tools
ICU library for cross-platform UNICODE support
Separate tools when building from the git repository
A number of programs and utilities are required when building from source.
The readline library allows you to edit the command line, use the command
history and auto-completion. In a server installation, it may not be necessary
if you are never going to run psql via the console.
The zlib library is used to compress pg_dump archives.
7
Creating a cluster
Tool
initdb
Notes
a new database cluster cannot belong to the OS superuser
PGDATA is a convenient variable that represents the cluster directory
configuration files are created in the PGDATA directory
enabling checksum calculation for data pages is a good idea
After installing the server, you need to create a database cluster. The initdb
tool does that for you.
For security reasons, the directory in which the cluster is initialized cannot
belong to the OS superuser. Usually, the cluster ownership is assigned to
the postgres user.
The cluster owner can define the PGDATA environment variable pointing to
the cluster directory. This variable is used by some server tools when they
need to find out the location of the cluster. Such tools include initdb, as well
as pg_ctl, the main server management tool, which we will discuss in a bit.
During cluster initialization, initdb creates configuration files in the PGDATA
directory. We will talk more about configuration files in another topic.
initdb has many keys that affect its operation. One important key is -k or
--data-checksums. It enables or disables calculating checksums for data
pages. The checksum check is performed when accessing any data page in
the cluster. This slightly reduces performance, but allows you to quickly
detect data corruption.
9
Server management
Tool
pg_ctl
Main operations
starting, stopping and checking the status of the server
updating configuration parameters
switching to a replica
The main server management operations include starting and stopping the
server, getting the current status of the server, updating the configuration,
and some others.
pg_ctl, a stock PostgreSQL tool, is designed to perform these actions.
pg_ctl should be run as the owner of the database cluster.
For more details about server management for database administrators,
see:
11
Installation from a package
Package installation is usually preferable
Linux ( , Debian, Red Hat, CentOS and others)
comes pre-installed with the OS
repository (yum, apt) or RPM/DEB packages
FreeBSD, OpenBSD
packages from the Ports and Packages Collection
macOS
Windows
The preferred option is to use ready-made packages, since in this case a
clear, supported and easily updated installation is obtained.
Packages exist for most popular systems. Each of them may have its own
features that you should get to know before installing.
Package repositories of some systems already include PostgreSQL, but
usually not the latest version. The latest version is always available in the
PostgreSQL Global Development Group (PGDG) repository:
We will be working with the Ubuntu PostgreSQL package.
12
Creating a cluster
Tool
Notes
initialization is performed when installing the package, creates
a cluster named “main”
checksum calculation is not enabled by default
defines the location of server configuration and log files
sets up automatic server startup at OS startup
pg_createcluster
initdb
pg_createcluster is a wrapper for the initdb tool designed to initialize the
cluster in Ubuntu.
The help for the pg_createcluster command can be obtained using man:
$ man pg_createcluster
pg_createcluster executes automatically when the package is installed and
creates a database cluster named “main”.
Note that cluster initialization is done with checksum calculation for data
pages disabled.
Executables, configuration files and the server log are stored in accordance
with the Ubuntu practices.
In addition, the installation sets up the PostgreSQL server to start and stop
when Ubuntu stats and stops.
To delete a cluster, the pg_dropcluster tool is used.
Both pg_createcluster and pg_dropcluster are specific to Ubuntu.
In other systems, you need to explicitly initialize the cluster using initdb and
use the appropriate operating system tools.
14
Server management
Tool
Main operations
starting, stopping and checking the status of the server
reloading configuration parameters
switching to a replica
pg_ctlcluster
pg_ctl
In the Ubuntu PostgreSQL package, pg_ctl is not run directly, but through a
special wrapper pg_ctlcluster. The help for pg_ctlcluster can be obtained
using man:
$ man pg_ctlcluster
In other systems, pg_ctl may be used directly.
16
Cloud solutions
PostgreSQL in a virtual environment
virtualization overhead
the administrator has full access to the instance
PostgreSQL as a Service
offered by many cloud providers
the provider takes over the administration routine
limited management, backup and monitoring tools
Virtualization solutions can be used to run databases, including
PostgreSQL. The trade off for the convenience of this approach is
considerable overhead costs. For high-load systems, any additional
intermediate layers between the DBMS and the hardware are undesirable.
In addition, many leading cloud providers offer PostgreSQL as a service
(Database as a Service, managed database).
In this case, the provider takes over most of the administration tasks. This
limits your control over the instance. Moreover, management capabilities for
tasks such as performance monitoring or backup and recovery are limited
by the tools offered by the service provider.
We will not consider the features of individual cloud solutions within the
course. You can study the documentation offered by different service
providers to learn more.
17
Takeaways
Two installation options — package or source code
Cloud solutions available
The server is run by a dedicated OS user
Initialize a database cluster before using the server
OS-specific commands for server management
18
Practice
Enabling checksum calculation for a cluster.
1. Stop the server.
2. Check whether checksums are being calculated for the cluster.
3. Enable checksum calculation.
4. Start the server.
2, 3. Use the pg_checksums tool. To run it, enter the full path:
/usr/lib/postgresql/13/bin/pg_checksums
19
Practice+
1. Install PostgreSQL from source code as shown in the demo.
2. Create a database cluster, start the server.
3. Make sure the server is running.
4. Stop the server.