Data Organization
Logical Structure
15
Copyright
© Postgres Professional, 2023
Authors: Egor Rogov, Pavel Luzanov, Ilya Bashtanov
Translated by Liudmila Mantrova
Cover photo by Oleg Bartunov (Phu monastery and Bhrikuti peak, Nepal)
Use of course materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
Disclaimer
Postgres Professional assumes no responsibility for any damages and
losses, including loss of income, caused by direct or indirect, intentional or
accidental use of course materials. Postgres Professional company
specifically disclaims any warranties on course materials. Course materials
are provided “as is,” and Postgres Professional company has no obligations
to provide maintenance, support, updates, enhancements, or modifications.
2
Agenda
Databases and templates
Schemas and search path
Special schemas
System catalog
3
a new DBtemplate1template0
postgres
Database Cluster
Cluster initialization creates three databases
A new database is always cloned from an existing one
таблица
таблица
object
таблица
таблица
object
таблица
таблица
object
таблица
таблица
object
no changes
default
connection
common
changes
CREATE DATABASE
A PostgreSQL instance manages several databases, which constitute a
database cluster. During cluster initialization (which is performed either
automatically during installation or manually using the initdb command),
three identical databases are created. All other databases created by user
are cloned from an already existing database.
By default, the template1 DB is used as the source for creating new
databases. You can extend it with additional objects and modules that will
be copied into each new database.
The template0 database must never be modified. This template is
required at least in two situations. First, it is used to restore the database
from a backup copy created by the pg_dump utility (as described in the
“Backup Overview” lecture). Second, it is required when creating a new
database with a non-default collation (we discuss it in more detail in the
DBA2 course).
The postgres database is used to establish a connection on behalf of the
postgres user by default. Keeping this database is not mandatory, but some
utilities rely on its existence, so it’s not recommended to delete this
database even if you do not need it.
5
Schemas
Namespaces for objects
classify objects into logical groups
prevent name conflicts between applications
Schemas and users are different entities
Special schemas
public — all objects are created here by default
pg_catalogsystem catalog
information_schema — an alternative view of the system catalog
pg_temp — a storage for temporary tables
etc.
Schemas are virtually namespaces for database objects. They enable
classifying objects into logical groups for easier management as well as
prevent name conflicts when serving several users or applications.
In PostgreSQL, schemas and users are different entities (even though the
default settings are well-suited for using the schema with the same name as
the current user).
There are several special schemas that are usually present in each
database.
By default, database objects are kept in the public schema unless another
location is specified.
The pg_catalog schema stores objects of the system catalog. The system
catalog comprises metadata of cluster objects, which is stored in tables
within the cluster itself. An alternative view of the system catalog (defined in
the SQL standard) is provided by information_schema.
The pg_temp schema stores temporary tables. (In fact, temporary tables
are created in schemas called pg_temp_1, pg_temp_2, and so on: each
user has its own schema, but they are all referred to as pg_temp.)
There are also some other schemas, but their purpose is merely technical.
6
template1postgres
Databases and Schemas
pg_catalog
таблица
таблица
object
public
таблица
таблица
object
pg_catalog public
таблица
таблица
object
таблица
таблица
object
таблица
таблица
object
таблица
таблица
object
таблица
таблица
object
таблица
таблица
object
таблица
таблица
object
system
catalog
cluster-wide
objects
a schema
таблица
таблица
object
таблица
таблица
object
Schemas belong to databases, and all DB objects belong to this or that
schema.
However, several system catalog tables store the objects that are common
to the whole cluster. They contain the list of databases, the list of users, and
some other information. These tables do not belong to any database, but
they are equally visible in all databases.
Thus, a client connected to a database can see the descriptions of the
objects that belong to this database as well as cluster-wide objects. Object
descriptions for other databases are only available if you connect to these
databases.
8
Search Path
Determining the objects schema
a qualified name (schema.name) explicitly defines the schema
an unqualified name is looked up in schemas listed in search path
Search path
is defined by the search_path parameter,
the actual path is displayed by the current_schemas function
excludes non-existent and inaccessible schemas
pg_temp and pg_catalog schemas are implicitly included first
unless they are already specified in the search_path parameter
the first explicitly specified schema is used for object creation
Different schemas can contain objects with the same name, so when
specifying an object, its necessary to identify its schema.
If the object has a qualified name, it’s easy: the explicitly specified schema
is used. Otherwise, PostgreSQL tries to find the object name in one of the
schemas listed in the search path, which is defined in the search_path
configuration parameter.
The actual search path can differ from the search_path parameter value.
It excludes non-existent schemas listed in search_path as well as those
schemas that the current user cannot access (we will cover access control
in one of the next lectures of this course). Besides, the following schemas
are implicitly added to the beginning of the search path:
the pg_catalog schema to ensure that the system catalog is always
accessible
the pg_temp schema if the user has created temporary objects
You can view the actual search path, including the implicitly added schemas,
by calling the current_schemas(true) function. Schemas are looked up
as they follow in the search path, from left to right. If the object with the
specified name is not found in the schema, the search continues in the next
one.
If you create an object with an unqualified name, it will get into the first
schema that is explicitly specified in the search path.
We can say that search_path is somewhat analogous to the PATH variable
in operating systems.
10
System Catalog
Description of all cluster objects
the list of tables in each database (the pg_catalog schema)
and several cluster-wide objects
several views provided for convenience
Access
SQL queries, special psql commands
Conventions
table names start with pg_
column names contain a three-letter prefix
the oid column of type oid is used as the primary key
object names are lowercase
The system catalog stores metadata of cluster objects. In each database,
you can find a separate set of tables that describe the objects of this
particular database, as well as several tables common to the whole cluster.
There are also a number of views provided for convenience.
You can access the system catalog using regular SQL queries. The psql
client also offers a whole range of commands that provide convenient ways
to view it. Catalog tables should not be modified directly; they get updated
automatically as you run DDL commands.
The names of all system catalog tables start with pg_, for example,
pg_database. Column names usually start with a prefix that corresponds
to the table name, for example, datname. Object names are lowercase,
for example, 'postgres'.
In most system catalog tables the oid column serves as a primary key.
This column is of a special oid type, which means object identifier (a 32-bit
integer).
12
Takeaways
At the logical level
a cluster contains databases
databases contain schemas
schemas contain actual objects (tables, indexes, etc.)
New databases are created by cloning existing ones
Object schemas are determined by the search path
The full description of the database cluster contents is stored
in the system catalog
13
Practice
1. In an empty database, create a schema that has the same name
as the current user. Create the app schema.
Create several tables in both schemas.
2. In psql, display the description of the created schemas and the
list of tables that belong to them.
3. Set the search path to enable access to the tables in both schemas
by an unqualified name; the “user” schema must have the
priority.
Check the result.