Estimating bloat of tables and indexes

Estimation of bloat is required to make a decision whether it is time for rebuilding. There are different ways for it:

α=> CREATE EXTENSION pgstattuple;
CREATE EXTENSION

Let's create a table and fill it with some data:

α=> CREATE TABLE bloat(id serial, s text);
CREATE TABLE
α=> INSERT INTO bloat(s)
  SELECT g.id::text FROM generate_series(1,100000) AS g(id);
INSERT 0 100000
α=> CREATE INDEX ON bloat(s);
CREATE INDEX

We can use the extension to check the table's conditions:

α=> SELECT * FROM pgstattuple('bloat') \gx
-[ RECORD 1 ]------+--------
table_len          | 4014080
tuple_count        | 100000
tuple_len          | 3388895
tuple_percent      | 84.43
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 4356
free_percent       | 0.11


And the index:

α=> SELECT * FROM pgstatindex('bloat_s_idx') \gx
-[ RECORD 1 ]------+--------
version            | 3
tree_level         | 1
index_size         | 2252800
root_block_no      | 3
internal_pages     | 1
leaf_pages         | 273
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | 89.98
leaf_fragmentation | 0


Now let's update half the rows:

α=> UPDATE bloat SET s = s || '!' WHERE id % 2 = 0;
UPDATE 50000

Looking at the table again:

α=> SELECT * FROM pgstattuple('bloat') \gx
-[ RECORD 1 ]------+--------
table_len          | 6021120
tuple_count        | 100000
tuple_len          | 3438895
tuple_percent      | 57.11
dead_tuple_count   | 50000
dead_tuple_len     | 1694450
dead_tuple_percent | 28.14
free_space         | 4732
free_percent       | 0.08

Density decreases.


By default the pgstattuple extension reads the whole table, although we can request an estimation, which is faster:

α=> SELECT * FROM pgstattuple_approx('bloat') \gx
-[ RECORD 1 ]--------+-------------------
table_len            | 6021120
scanned_percent      | 100
approx_tuple_count   | 100000
approx_tuple_len     | 3438895
approx_tuple_percent | 57.113875823767
dead_tuple_count     | 50000
dead_tuple_len       | 1694450
dead_tuple_percent   | 28.1417742878401
approx_free_space    | 4732
approx_free_percent  | 0.0785900297619048


And the index:

α=> SELECT * FROM pgstatindex('bloat_s_idx') \gx
-[ RECORD 1 ]------+--------
version            | 3
tree_level         | 2
index_size         | 4505600
root_block_no      | 412
internal_pages     | 3
leaf_pages         | 546
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | 67.6
leaf_fragmentation | 49.82

Density decreases also.


The end of demonstration.