Data schema
Normalization - removal of data redundancy
  simplifies queries and consistency checks
Denormalization - introduction of redundancy
  may enhance performance, but requires synchronization:
  indexes
  precomputed fields (generated columns or triggers)
  materialized views
  caching query results in the application
At the logical level, a database should be normalized; informally, this
means eliminating redundancy in the data. If that is not the case, we are
dealing with a design flaw: consistency checks become difficult, various
anomalies can occur when data changes, and so on.
At the storage level, however, duplicating data can significantly improve
performance. This comes at the cost of keeping the redundant copies
synchronized with the primary data.
The most common form of denormalization is indexes, although they are
rarely thought of in this way. Their advantage is that the database
updates them automatically.
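To make this concrete, here is a minimal sketch using SQLite via Python's standard sqlite3 module (SQLite stands in for a full-featured DBMS; the table and index names are made up for illustration). The index is redundant, sorted data that the database maintains on every write, and the query planner switches from a full scan to an index search once it exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(1000)])

# Without an index, a lookup by email has to scan the whole table.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
    ("user500@example.com",)).fetchall()
print(plan_before[0][3])  # the plan detail describes a scan of users

# The index duplicates the email column in sorted order; from now on
# the database keeps it in sync on every INSERT/UPDATE/DELETE.
conn.execute("CREATE INDEX users_email_idx ON users (email)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
    ("user500@example.com",)).fetchall()
print(plan_after[0][3])  # the plan detail now mentions users_email_idx
```

The application never touches the index directly; it only pays for it through slightly slower writes.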
You can also duplicate some data (or results computed from it) in table
columns, either as generated columns or by keeping the data in sync with
triggers. With either approach, the database itself is responsible for
the denormalized data.
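A sketch of both approaches, again with SQLite through Python's sqlite3 module (generated columns require SQLite 3.31+; the orders table and column names are hypothetical). The total is a precomputed field maintained by a generated column, and a trigger keeps a redundant row counter in sync:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Precomputed field: the database derives total_cents on every write.
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        qty INTEGER NOT NULL,
        unit_price_cents INTEGER NOT NULL,
        total_cents INTEGER GENERATED ALWAYS
            AS (qty * unit_price_cents) STORED
    )
""")

# Trigger-maintained redundancy: a running count of orders.
conn.execute("CREATE TABLE order_stats (order_count INTEGER NOT NULL)")
conn.execute("INSERT INTO order_stats VALUES (0)")
conn.execute("""
    CREATE TRIGGER orders_count AFTER INSERT ON orders
    BEGIN
        UPDATE order_stats SET order_count = order_count + 1;
    END
""")

conn.execute("INSERT INTO orders (qty, unit_price_cents) VALUES (3, 250)")
total = conn.execute("SELECT total_cents FROM orders").fetchone()[0]
count = conn.execute("SELECT order_count FROM order_stats").fetchone()[0]
print(total, count)  # 750 1
```

In both cases the application only writes the primary data; the redundant values can never silently drift out of sync.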
Materialized views are another example. They too must be refreshed, for
example on a schedule or by some other mechanism. This topic was covered
in detail in the "Materialization" section.
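SQLite has no materialized views, so the sketch below (Python, sqlite3, with a made-up sales table) emulates one with an ordinary table that is rebuilt by an explicit refresh, the way a scheduled job might; in PostgreSQL the same effect comes from REFRESH MATERIALIZED VIEW. Note that between refreshes the precomputed aggregate is stale:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 100), ("north", 50), ("south", 70)])

# "Materialized view": a stored, precomputed aggregate.
conn.execute("""
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount) AS total FROM sales GROUP BY region
""")

def refresh():
    # Rebuild the redundant copy from the primary data,
    # e.g. from a scheduled job.
    conn.execute("DELETE FROM sales_by_region")
    conn.execute("""
        INSERT INTO sales_by_region
        SELECT region, SUM(amount) FROM sales GROUP BY region
    """)

conn.execute("INSERT INTO sales VALUES ('north', 25)")
stale = conn.execute(
    "SELECT total FROM sales_by_region WHERE region = 'north'"
).fetchone()[0]  # 150: the new row is not reflected yet
refresh()
fresh = conn.execute(
    "SELECT total FROM sales_by_region WHERE region = 'north'"
).fetchone()[0]  # 175: up to date after the refresh
```

The trade-off is explicit here: reads against the precomputed table are cheap, but the data is only as fresh as the last refresh.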
Data can also be duplicated at the application level by caching query
results. This is a common approach, although it is often a workaround
for the application interacting with the database inefficiently (for
example, through an ORM). The application then becomes responsible for
updating the cache in time, enforcing access control on cached data, and
other related tasks.
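A minimal sketch of such a cache in Python (the products table and the get/set helpers are invented for illustration). Unlike indexes or generated columns, here nothing updates the redundant copy automatically: the application must invalidate it itself on every write, and forgetting to do so serves stale data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price INTEGER)")
conn.execute("INSERT INTO products VALUES (1, 100)")

_cache = {}  # application-level cache of query results

def get_price(product_id):
    # Serve from the cache when possible, otherwise query and remember.
    if product_id not in _cache:
        row = conn.execute(
            "SELECT price FROM products WHERE id = ?", (product_id,)
        ).fetchone()
        _cache[product_id] = row[0]
    return _cache[product_id]

def set_price(product_id, price):
    conn.execute("UPDATE products SET price = ? WHERE id = ?",
                 (price, product_id))
    # The application, not the database, must invalidate the cache;
    # dropping this line would leave readers seeing the old price.
    _cache.pop(product_id, None)

p1 = get_price(1)   # 100, read from the database and cached
set_price(1, 120)
p2 = get_price(1)   # 120, re-read after invalidation
```

This is the same synchronization cost as every other form of denormalization, only now it is paid in application code rather than inside the database.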