Data organization
Databases and schemas
13
Copyright
© Postgres Professional, 2015–2022
Authors: Egor Rogov, Pavel Luzanov, Ilya Bashtanov
Translated by Alexander Meleshko
Use of course materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
Disclaimer
Postgres Professional assumes no responsibility for any damages and
losses, including loss of income, caused by direct or indirect, intentional or
accidental use of course materials. Postgres Professional company
specifically disclaims any warranties on course materials. Course materials
are provided “as is,” and Postgres Professional company has no obligations
to provide maintenance, support, updates, enhancements, or modifications.
2
Topics
Databases and templates
Schemas and search path
Special schemas, temporary objects
Managing databases, schemas, and objects
3
Database cluster
Cluster initialization creates three databases
A new database is always cloned from an existing one
postgres
table
table
object
template0
table
table
object
template1
table
table
object
new database
table
table
object
does not
change
connection
by default
common
changes
A PostgreSQL instance manages a database cluster that comprises multiple
databases. When a cluster is initialized, three identical databases are
created in a specific way. All other databases created by users are cloned
from one of the existing ones.
The template1 database is used by default when creating new databases.
Any objects and extensions added to the template will be copied into any
database created from it.
The template0 database must never be modified. It is required in at least
two scenarios. Firstly, it is necessary when restoring a database from a
backup created via pg_dump (since the copy will include not only the
database objects, but also any objects from template1). Secondly, it is
used when creating a new database with an encoding different from the one
specified during cluster initialization.
The postgres database is used to connect to by default by the postgres
user. It is not a hard requirement to have it, but some utilities expect it to be
there, so removing it isn’t a good idea, even if you never use the database
directly.
5
Schemas
Database object namespace
each object belongs to a schema
Purposes
dividing objects into logical groups
preventing name conflicts between applications
Schemas and users are different entities
Schemas are namespaces for database objects. They separate objects into
groups for easier management and serve to prevent name conflicts when
multiple users or applications access the same database.
Every object that exists in the database belongs to a schema.
In PostgreSQL, schema and user are different entities (although the default
settings allow users to seamlessly operate schemas of the same name).
6
Databases and schemas
appdb
pg_catalog public
postgres
pg_catalog public
table
table
object
table
table
object
table
table
object
table
table
object
app
table
table
object
schema
created by
a user
$ psql -d appdb
SELECT * FROM app.t;
system
catalog
“shared
schema
A cluster comprises multiple databases. Each database contains various
schemas across which database objects are distributed.
There are several standard schemas that exist in any database. More
schemas can be added by users.
Clients may connect to only one database at a time, but within the database,
the client is free to access objects from any schemas.
8
Search path
Determining an object’s schema
a schema is explicitly defined by a qualified name (schema.name)
a name without a qualifier is looked up among all schemas specified in the
search path
Search path
defined by the search_path parameter
non-existent schemas and schemas with restricted access are excluded;
implicit schemas are included
the real value is shown by the current_schemas function
the first explicitly specified schema in the path is where objects are created
When specifying an object, the schema it belongs to must be determined,
since different schemas may contain objects of the same name.
When the object name is qualified (with the schema name), the object is
looked up in the given schema, as shown on the previous slide. If the name
is used without a qualifier, PostgreSQL tries to look the name up in one of
the schemas listed in the search path, which is determined by the
search_path configuration parameter.
The actual search path may differ from the search_path parameter value:
any non-existent schemas are excluded, as well as schemas to which the
user does not have access (more on that in the Access control module).
In addition, some special schemas are implicitly added to the beginning of
the search path.
The actual search path, including implicit schemas, returns a call to the
current_schemas(true) function. The schemas are searched in the
order specified in the search path, from left to right. If the desired object
name is not found in the first schema, the next one is searched, and so on.
When an object is created with an unqualified name, it is placed into the first
explicitly defined schema in the path.
The concept of search_path is similar to the PATH variable in popular
operating systems.
9
Special schemas
The public schema
included in the search path by default
all objects will belong to this schema, unless configured otherwise
The schema named after the user
included in the search path by default but is not created automatically
if created, the user’s objects will belong to this schema
The pg_catalog schema
a schema for system catalog objects
if not explicitly included in the path, implicitly included as the first one
There are several special schemas usually present in any database.
The public schema is used by default for storing objects, unless intentionally
configured otherwise.
The pg_catalog schema stores system catalog objects. The system catalog
is a collection of tables containing metadata about objects belonging to the
cluster. information_schema is another schema with an alternative
representation of the system catalog (as defined in the SQL standard).
The pg_catalog schema contains system catalog objects (in particular, the
pg_* tables).
If pg_catalog is not specified in the search path, the schema will be implicitly
placed in the front of the path so that system objects remain visible (but
after pg_temp).
11
Special schemas
Temporary tables
exist for the duration of the session or transaction
not logged (no recovery after a crash)
do not utilize the shared buffer cache
pg_temp_N schema
created automatically for temporary tables
pg_temp: a link to a specific temporary schema for the session
if not explicitly included in the path, implicitly included as the very first one
at the end of the session, all objects of the temporary schema are dropped;
the schema itself remains and is reused for other sessions
PostgreSQL can work with temporary tables. Temporary tables store data
that should only be available to the current session and only for as long as
the session is active (or even for the duration of a single transaction within
the session).
Temporary tables are unlogged. This means that if a crash occurs, the table
contents will be lost. Additionally, temporary table pages are never stored in
the shared buffer cache, residing in the internal memory of their backend
process instead. Thanks to that, temporary tables can be accessed a bit
more quickly than regular ones.
Temporary tables are organized using schemas. When a session is started,
a temporary schema named pg_temp_N (pg_temp_1, pg_temp_2, etc.) is
created for it. It can be accessed by the name pg_temp (without a number,
because the name will always refer to the temporary schema specific for the
session).
If pg_temp is not specified in the path, it is searched before all others.
Otherwise, you can specify its position in the path explicitly, like with
pg_catalog.
After the session ends, all objects belonging to the temporary schema are
dropped, and the schema itself remains to be reused later on.
There are other special schemas, more technical in nature.
13
Takeaways
On the logical level,
a cluster contains databases,
a database contains schemas,
a schema contains specific objects (tables, indexes etc.)
Databases are created by cloning existing ones
A schema can be specified explicitly or determined using the
search path
Some schemas have specific purposes
14
Practice
1. Create a new database and connect to it.
2. Check the size of the created database.
3. Create two schemas. Name one app, and the other after your
user name.
Create several tables in both schemas and populate them with
some data.
4. Check how much the database size has increased.
5. Modify the search path value so that when connecting to the
database, tables from both schemas are accessible by an
unqualified name. The “username” schema should have priority.
15
Practice+
1. Create a database. For all sessions connecting to this database,
set the temp_buffers parameter to four times the default value.
1. Use the command ALTER DATABASE ... SET
More about the temp_buffers parameter: