The Bookstore App
Data Schema and API
17
Copyright
© Postgres Professional, 2017–2025
Authors Egor Rogov, Pavel Luzanov, Ilya Bashtanov, Igor Gnatyuk
Translated by Liudmila Mantrova and Alexander Meleshko
Photo by Oleg Bartunov (Tukuche peak, Nepal)
Use of course materials
Non-commercial use of course materials (presentations, demonstrations) is
allowed without restrictions. Commercial use is possible only with the written
permission of Postgres Professional. It is prohibited to make changes to the
course materials.
Feedback
Please send your feedback, comments and suggestions to:
edu@postgrespro.com
Disclaimer
In no event shall Postgres Professional company be liable for any damages
or loss, including loss of profits, that arise from direct or indirect, special or
incidental use of course materials. Postgres Professional company
specifically disclaims any warranties on course materials. Course materials
are provided “as is,” and Postgres Professional company has no obligations
to provide maintenance, support, updates, enhancements, or modifications.
2
Topics
Bookstore app overview
Data schema design, normalization
Data schema, the final version
Setting up the API
3
The application
The application consists of several parts, which are provided as separate
tabs.
“Store” is a customer UI for buying books online.
Other tabs represent the employee UI, which is available only to the
bookstore staff (the admin panel).
“Catalog” is the storekeeper’s UI which is used for ordering books to the
store and viewing arrivals and sales.
“Books” and “Authors” are the UI for librarians, where they can register
arrivals.
For training purposes, all this functionality is exposed in a single web page.
If any feature is unavailable because the required object (such as a table or
a function) is missing, the application will report an error. It also displays the
text of all queries sent to the server.
We will start with an empty database and will gradually implement all the
required components by the end of the course.
The source code of the application frontend will not be discussed in this
course, but is available for download:
https://pubgit.postgrespro.ru/pub/dev1app.git
5
Books
Book
title
authors
quantity
operations
entity
attributes
An ER model for high-level design
entities – concepts of the application domain
relationships – connections between entities
attributes – properties of entities and relationships
After taking a look at the application’s UI and functionality, we need to figure
out its data schema. We will not go into details about database design: it is a
separate branch of knowledge, which is beyond the scope of this course, but
we cannot ignore this topic entirely.
High-level database design often uses the ER model (“Entity–Relationship”).
It deals with entities (concepts of the subject area), their relationships, and
attributes (the properties of entities and relationships). The model allows us
to remain at the logical level, without getting down to data representation at
the physical level (such as its table form).
The first approach to database design is creating a diagram as shown on
this slide: a book is represented as a single big entity, and everything else
becomes its attributes.
6
Data schema
id | title | author | qty | operation
−−−+−−−−−−−−−−−−−−−−−−−+−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−+−−−−−+−−−−−−−−−−−
1 | The Tempest | William Shakespeare | 10 | +11
1 | THE TEMPEST | William Shakespeare | 10 | -1
2 | Romeo and Juliet | William Shakespeare | 4 | +4
3 | Good Omens | Terry Pratchett | 7 | +7
3 | Good Omens | Neil Gaiman | 7 | 0
Some data is duplicated
hard to maintain consistency
hard to perform updates
hard to write queries
10 = 11 – 1
7.0
or 0.7
or 7.7
?
Clearly, this approach cannot be correct. It may be not quite obvious in the
diagram itself, but let’s try to project this diagram onto database tables.
There are several ways to do it. One of them is shown on the slide: the table
corresponds to the entity, and table columns represent the attributes of this
entity.
This diagram is a good illustration that some data ends up duplicated (as
highlighted on the slide). Data duplication may lead to problems with
consistency, the very thing a database system is supposed to ensure.
For example, each of the two rows related to book 3 must list the total
quantity (7 items). How should purchases be recorded, then? Some rows
will need to be added to record the purchase operations. And then the
quantity in all the duplicated rows will need to be reduced from 7 to 6. But
what if an error leads to data discrepancy between these rows? How can we
define a constraint to make sure the values stay in sync?
Many queries will also become overcomplicated. How can we find the total
number of books? Or get a list of distinct authors?
Thus, such a schema will not work well for relational databases.
7
Books and operations
Normalization reduces data redundancy
Large entities are split into smaller ones
Book
title
authors
Operation
quantity change
date
one-to-many
relationship
To work with data in a relational database system properly, we need to
eliminate redundancy. This process is called normalization.
You might be familiar with various normal form concepts (first, second, third,
Boyce–Codd, etc.) We are not going to discuss them here; in essence, it is
enough to understand that all this math pursues one and the same goal:
eliminating redundancy.
The way to reduce redundancy is to split a larger entity into smaller ones.
How exactly to split it should be prompted by common sense (which cannot
be replaced by the knowledge of normal forms alone anyway).
In our case, everything is quite straightforward. Let’s start by separating
books and operations. These two entities are connected by a one-to-many
relationship: there can be several operations on each book, but each
operation relates to only one book.
8
Data schema
books
book_id | title | author
−−−−−−−−−+−−−−−−−−−−−−−−−−−−−+−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
1 | The Tempest | William Shakespeare
2 | Romeo and Juliet | William Shakespeare
3 | Good Omens | Terry Pratchett
3 | Good Omens | Neil Gaiman
operations
operation_id | book_id | qty_change | date_created
−−−−−−−−−−−−−−+−−−−−−−−−+−−−−−−−−−−−−+−−−−−−−−−−−−−−
1 | 1 | +10 | 2020-07-13
2 | 1 | -1 | 2020-08-25
3 | 3 | +7 | 2020-07-13
4 | 2 | +4 | 2020-07-13
At the physical level, this can be represented by two tables: books and
operations.
An operation changes the quantity of books (sell books if negative, order
new books if positive). Note that the book entity has no quantity attribute
anymore. Instead, you get the quantity by adding up all changes made by
operations related to this book. Having an additional quantity attribute would
only create data redundancy again.
This solution might raise some eyebrows at first. Is it really a good idea to
have to calculate the amount every time instead of having a separate field to
query? The answer is that we can simply create a view that shows the
current amount of books in store. This will not lead to redundancy, because
the view is just another query.
But what about performance? If summing up all changes brings too much
overhead, we can resort to denormalization: add the quantity field to the
books table and ensure its consistency with the operations table. Whether to
do this or not is beyond the scope of this course (it is discussed in the QPT
"Query Performance Tuning" course). Common sense suggests that it’s not
required for our bare-bones app, but we will get back to denormalization
when we get to the “Triggers” lecture.
Thus, moving all operations into a separate entity resolves most of the
duplication issues, but not all of them.
9
Books, authors, operations
Book
title
Author
first name
last name
middle name
Operation
quantity change
date
many-to-many
relationship
That’s why we have to go deeper: separate books from authors and tie them
by a many-to-many relationship: a book can be written by several authors,
and each author can have more than one book. At the table level, such
relationship can be implemented using an additional intermediate table.
The author’s attributes will be their first, last and middle name. It makes
sense because we may need to work with each of these attributes
separately, for example, to display the author’s last name and initials.
11
Designing the API
Tables and triggers
reading data directly from tables (views)
writing data directly to tables (views)
using triggers for changing related tables
the application must be aware of the data model
this approach provides maximum flexibility
hard to maintain consistency
Functions
reading data via table functions
writing data by calling functions
the application is separated from the data model and is limited by API
lots of wrapper functions required
potential performance issues
There are several ways to set up an API.
The first option is to allow the application to access and modify database
tables directly. In this case, the application must have the exact knowledge
of the data model. This requirement can be relaxed to some extent by using
views.
Another limitation of this approach is that the application has to follow certain
rules; otherwise, it is very hard to maintain data consistency if you have to
address all possible inappropriate operations at the database level. But this
is the approach that provides the most flexibility.
Another option is to forbid direct table access from the application and allow
only function calls. Reading data can be performed by table functions (which
return a set of rows). Writing can be performed by calling other functions and
passing the required data to them. In this case, all the necessary
consistency checks can be implemented within functions: the database will
be protected, but the application will be able to use only a limited set of
features that we provide. It requires writing many wrapper functions and can
lead to performance degradation.
You can also combine these two approaches. For example, you can allow
the application to read data from tables directly, but perform modifications
only by functions.
12
Customer API
buy_book
books
book_id
title
authorship
book_id
author_id
seq_num
authors
author_id
last_name
first_name
middle_name
operations
operation_id
book_id
qty_change
date_created
book purchase
get_catalog
list of
books
In this application, we will try different ways of setting up the interface
(although it’s usually better to stick to one approach when developing real
applications).
The store will use API functions:
get_catalog for looking up books (see “SQL. Composite Types”)
buy_book for making a purchase (see “PL/pgSQL. Query Execution”)
13
operations_v
catalog_v
authors_v
Employee API
add_book
books
book_id
title
authorship
book_id
author_id
seq_num
authors
author_id
last_name
first_name
middle_name
operations
operation_id
book_id
qty_change
date_created
add_author
UPDATE
adding a book adding an author
ordering a book
update_catalog_trigger
The admin panel is going to retrieve data by accessing the following views
(which we create as part of the practice for this lecture):
catalog_v for the list of books,
authors_v for the list of authors,
operations_v for the list of operations.
Authors will be added using the add_author function (we will create it once
we get to the “PL/pgSQL. Query Execution” lecture). For adding books, we
will implement the add_book function (“PL/pgSQL. Arrays”).
To enable book purchase, we will make the catalog_v view updatable
(“PL/pgSQL. Triggers”).
15
Takeaways
Database design is a separate complex topic
theory is important, but it should not take precedence over common sense
Normalized data simplifies the development and facilitates
consistency support
The API can use tables, views, functions, and triggers
16
Practice
1. Make sure that you are connected to the bookstore database and
the bookstore schema is the current one.
2. Create books, authors, authorship, and operations tables with all
the necessary constraints, exactly as shown in the demo.
3. Insert data about several books into the tables.
Check the result by running some queries.
4. In the bookstore schema, create authors_v, catalog_v, and
operations_v views, so that they look exactly like shown in the
demo.
Check that the application now shows the data in “Books”,
“Authors”, and “Catalog” tabs.
1. Use current_database() and current_schema() functions.
2. Use the demonstrated output of psql’s \d commands as a reference.
3. You can use the data shown in the demo, or come up with your own data.
4. Try writing queries to the base tables that return the same results as the
queries to views shown in the demo. Then save these queries as views.
After completing the assignments, make sure to compare your queries with
those in the provided keys. Make corrections if necessary.