Aller au contenu

Bloat

Le bloat est l’espace perdu dans une table, notamment si le VACUUM ne nettoie pas les « lignes mortes » suffisamment souvent. Le terme français fragmentation est aussi utilisé pour définir le bloat. Elles ne seront alors pas réutilisées. Un VACUUM libère l’espace utilisé par ces lignes mortes qui est alors réutilisable par les insertions et mises à jour suivantes.

Un bloat important mène à un gaspillage d’espace disque, à des parcours complets de table plus longs. Un peu de bloat dans une table ou un index n’a rien d’inquiétant, il fait simplement partie de la vie de la table. Un bloat très important peut nécessiter la reconstruction de la table par un VACUUM FULL.

Méthode rigoureuse

L’extension pgstattuple permet un calcul plus rigoureux même si beaucoup plus lourd. Il comporte aussi une fonction d’approximation.

Voir cette page : Trouver la fragmentation d’un index avec pgstattuple

Estimer le bloat d’une table

Une estimation complète implique d’utiliser l’extension pgstattuple, qui est trop lent.

La requête suivante estime le bloat des tables. Elle est dérivée de https://github.com/ioguix/pgsql-bloat-estimation/blob/master/table/table_bloat.sql.

Warning

Les requêtes qui suivent supposent des statistiques à jour ! Vérifiez qu’autovacuum est passé assez récemment sur les tables concernées, ou lancez un VACUUM manuellement.

Warning

Les seuils à la fin de ces requêtes sont destinés à n’afficher que le bloat important sur des tables d’une certaine taille. Il faudra affiner selon votre installation.

Requête dérivée de https://github.com/ioguix/pgsql-bloat-estimation/blob/master/table/table_bloat.sql
-- Pour PG >= 9.2
-- Requête dérivée de https://github.com/ioguix/pgsql-bloat-estimation/blob/master/table/table_bloat.sql
-- Suppose que l'autovacuum (analyze & vacuum) sont passés assez récemment pour que les statistiques
-- utilisées soient fiables
-- Le filtrage des tables les plus intéressantes a lieu en fin de requête
SELECT current_database, schemaname, tblname AS  tablename, 
       pg_size_pretty(real_size::NUMERIC) AS real_size_, -- taille des fichiers de la table
       fillfactor, -- par défaut et généralement 100%
       pg_size_pretty(extra_size::NUMERIC) AS extra_size_, -- 
       pg_size_pretty(bloat_size::NUMERIC) AS bloat_size_, -- espace mort
       round(bloat_ratio::NUMERIC,1) AS bloat_ratio_
FROM (
/* WARNING: executed with a non-superuser role, the query inspect only tables and materialized view (9.3+) you are granted to read.
* This query is compatible with PostgreSQL 9.0 and more
*/
SELECT current_database(), schemaname, tblname, bs*tblpages AS real_size,
  (tblpages-est_tblpages)*bs AS extra_size,
  CASE WHEN tblpages - est_tblpages > 0
    THEN 100 * (tblpages - est_tblpages)/tblpages::FLOAT
    ELSE 0
  END AS extra_ratio, fillfactor,
  CASE WHEN tblpages - est_tblpages_ff > 0
    THEN (tblpages-est_tblpages_ff)*bs
    ELSE 0
  END AS bloat_size,
  CASE WHEN tblpages - est_tblpages_ff > 0
    THEN 100 * (tblpages - est_tblpages_ff)/tblpages::FLOAT
    ELSE 0
  END AS bloat_ratio, is_na
 -- , tpl_hdr_size, tpl_data_size, (pst).free_percent + (pst).dead_tuple_percent AS real_frag -- (DEBUG INFO)
FROM (
  SELECT CEIL( reltuples / ( (bs-page_hdr)/tpl_size ) ) + CEIL( toasttuples / 4 ) AS est_tblpages,
    CEIL( reltuples / ( (bs-page_hdr)*fillfactor/(tpl_size*100) ) ) + CEIL( toasttuples / 4 ) AS est_tblpages_ff,
    tblpages, fillfactor, bs, tblid, schemaname, tblname, heappages, toastpages, is_na
    -- , tpl_hdr_size, tpl_data_size, pgstattuple(tblid) AS pst -- (DEBUG INFO)
  FROM (
    SELECT
      ( 4 + tpl_hdr_size + tpl_data_size + (2*ma)
        - CASE WHEN tpl_hdr_size%ma = 0 THEN ma ELSE tpl_hdr_size%ma END
        - CASE WHEN CEIL(tpl_data_size)::int%ma = 0 THEN ma ELSE CEIL(tpl_data_size)::int%ma END
      ) AS tpl_size, bs - page_hdr AS size_per_block, (heappages + toastpages) AS tblpages, heappages,
      toastpages, reltuples, toasttuples, bs, page_hdr, tblid, schemaname, tblname, fillfactor, is_na
      -- , tpl_hdr_size, tpl_data_size
    FROM (
      SELECT
        tbl.oid AS tblid, ns.nspname AS schemaname, tbl.relname AS tblname, tbl.reltuples,
        tbl.relpages AS heappages, COALESCE(toast.relpages, 0) AS toastpages,
        COALESCE(toast.reltuples, 0) AS toasttuples,
        COALESCE(SUBSTRING(
          array_to_string(tbl.reloptions, ' ')
          FROM 'fillfactor=([0-9]+)')::SMALLINT, 100) AS fillfactor,
        current_setting('block_size')::NUMERIC AS bs,
        CASE WHEN version()~'mingw32' OR version()~'64-bit|x86_64|ppc64|ia64|amd64' THEN 8 ELSE 4 END AS ma,
        24 AS page_hdr,
        23 + CASE WHEN MAX(COALESCE(s.null_frac,0)) > 0 THEN ( 7 + COUNT(s.attname) ) / 8 ELSE 0::INT END
           + CASE WHEN bool_or(att.attname = 'oid' AND att.attnum < 0) THEN 4 ELSE 0 END AS tpl_hdr_size,
        SUM( (1-COALESCE(s.null_frac, 0)) * COALESCE(s.avg_width, 0) ) AS tpl_data_size,
        bool_or(att.atttypid = 'pg_catalog.name'::regtype)
          OR SUM(CASE WHEN att.attnum > 0 THEN 1 ELSE 0 END) <> COUNT(s.attname) AS is_na
      FROM pg_attribute AS att
        JOIN pg_class AS tbl ON att.attrelid = tbl.oid
        JOIN pg_namespace AS ns ON ns.oid = tbl.relnamespace
        LEFT JOIN pg_stats AS s ON s.schemaname=ns.nspname
          AND s.tablename = tbl.relname AND s.inherited=FALSE AND s.attname=att.attname
        LEFT JOIN pg_class AS toast ON tbl.reltoastrelid = toast.oid
      WHERE NOT att.attisdropped
        AND tbl.relkind IN ('r','m')
      GROUP BY 1,2,3,4,5,6,7,8,9,10
      ORDER BY 2,3
    ) AS s
  ) AS s2
) AS s3
-- WHERE NOT is_na
--   AND tblpages*((pst).free_percent + (pst).dead_tuple_percent)::float4/100 >= 1
--ORDER BY schemaname, tblname;
) r
WHERE  NOT is_na
-- schéma : supprimer ceux système (pg_catalog estparfois intéressant)
AND schemaname NOT IN ('information_schema', 'pg_catalog')
-- filtrage de taille - à ajuster
AND bloat_size > 5e6  -- taille perdue minimale par table
AND bloat_ratio >= 50 -- bloat minimal (en %)
--AND tblname = 'nomdetable'
ORDER BY bloat_size DESC
;

Estimer le bloat d’un index Btree

Le bloat d’un index Btree est moins impactant pour les performances que pour une table. Estimer le bloat permet cependant de cibler quelques candidats à la réindexation :

Warning

Encore plus que pour les tables, cette requête ne ramène que des estimations, notamment dans le cas des colonnes textes (qui peuvent être toastées). La déduplication dans PG13 peut aussi amener de faux positifs.

La requête suivante estime le bloat des index. Elle est dérivée de https://github.com/ioguix/pgsql-bloat-estimation/blob/master/btree/btree_bloat.sql.

Requête dérivée de https://github.com/ioguix/pgsql-bloat-estimation/blob/master/btree/btree_bloat.sql
-- Pour PG >= 9.2
-- Requête dérivée de https://github.com/ioguix/pgsql-bloat-estimation/blob/master/btree/btree_bloat.sql
-- Suppose que l'autovacuum (analyze & vacuum) sont passés assez récemment pour que les statistiques
-- utilisées soient fiables
-- Le filtrage des index les plus intéressants a lieu en fin de requête
SELECT current_database, schemaname, tblname AS  tablename,  idxname AS indexname,
       pg_size_pretty(real_size::NUMERIC) AS real_size_, -- taille des fichiers de la table
       fillfactor, -- par défaut et généralement 100%
       pg_size_pretty(extra_size::NUMERIC) AS extra_size_, -- 
       pg_size_pretty(bloat_size::NUMERIC) AS bloat_size_, -- espace mort
       round(bloat_ratio::NUMERIC,1) AS bloat_ratio_
FROM (
-- WARNING: executed with a non-superuser role, the query inspect only index on tables you are granted to read.
-- WARNING: rows with is_na = 't' are known to have bad statistics ("name" type is not supported).
-- This query is compatible with PostgreSQL 8.2 and after
SELECT current_database(), nspname AS schemaname, tblname, idxname, bs*(relpages)::BIGINT AS real_size,
  bs*(relpages-est_pages)::BIGINT AS extra_size,
  100 * (relpages-est_pages)::FLOAT / relpages AS extra_ratio,
  fillfactor,
  CASE WHEN relpages > est_pages_ff
    THEN bs*(relpages-est_pages_ff)
    ELSE 0
  END AS bloat_size,
  100 * (relpages-est_pages_ff)::FLOAT / relpages AS bloat_ratio,
  is_na
  -- , 100-(pst).avg_leaf_density AS pst_avg_bloat, est_pages, index_tuple_hdr_bm, maxalign, pagehdr, nulldatawidth, nulldatahdrwidth, reltuples, relpages -- (DEBUG INFO)
FROM (
  SELECT COALESCE(1 +
         CEIL(reltuples/FLOOR((bs-pageopqdata-pagehdr)/(4+nulldatahdrwidth)::FLOAT)), 0 -- ItemIdData size + computed avg size of a tuple (nulldatahdrwidth)
      ) AS est_pages,
      COALESCE(1 +
         CEIL(reltuples/FLOOR((bs-pageopqdata-pagehdr)*fillfactor/(100*(4+nulldatahdrwidth)::FLOAT))), 0
      ) AS est_pages_ff,
      bs, nspname, tblname, idxname, relpages, fillfactor, is_na
      -- , pgstatindex(idxoid) AS pst, index_tuple_hdr_bm, maxalign, pagehdr, nulldatawidth, nulldatahdrwidth, reltuples -- (DEBUG INFO)
  FROM (
      SELECT maxalign, bs, nspname, tblname, idxname, reltuples, relpages, idxoid, fillfactor,
            ( index_tuple_hdr_bm +
                maxalign - CASE -- Add padding to the index tuple header to align on MAXALIGN
                  WHEN index_tuple_hdr_bm%maxalign = 0 THEN maxalign
                  ELSE index_tuple_hdr_bm%maxalign
                END
              + nulldatawidth + maxalign - CASE -- Add padding to the data to align on MAXALIGN
                  WHEN nulldatawidth = 0 THEN 0
                  WHEN nulldatawidth::integer%maxalign = 0 THEN maxalign
                  ELSE nulldatawidth::integer%maxalign
                END
            )::NUMERIC AS nulldatahdrwidth, pagehdr, pageopqdata, is_na
            -- , index_tuple_hdr_bm, nulldatawidth -- (DEBUG INFO)
      FROM (
          SELECT n.nspname, i.tblname, i.idxname, i.reltuples, i.relpages,
              i.idxoid, i.fillfactor, current_setting('block_size')::NUMERIC AS bs,
              CASE -- MAXALIGN: 4 on 32bits, 8 on 64bits (and mingw32 ?)
                WHEN version() ~ 'mingw32' OR version() ~ '64-bit|x86_64|ppc64|ia64|amd64' THEN 8
                ELSE 4
              END AS maxalign,
              /* per page header, fixed size: 20 for 7.X, 24 for others */
              24 AS pagehdr,
              /* per page btree opaque data */
              16 AS pageopqdata,
              /* per tuple header: add IndexAttributeBitMapData if some cols are null-able */
              CASE WHEN MAX(COALESCE(s.null_frac,0)) = 0
                  THEN 2 -- IndexTupleData size
                  ELSE 2 + (( 32 + 8 - 1 ) / 8) -- IndexTupleData size + IndexAttributeBitMapData size ( max num filed per index + 8 - 1 /8)
              END AS index_tuple_hdr_bm,
              /* data len: we remove null values save space using it fractionnal part from stats */
              SUM( (1-COALESCE(s.null_frac, 0)) * COALESCE(s.avg_width, 1024)) AS nulldatawidth,
              MAX( CASE WHEN i.atttypid = 'pg_catalog.name'::regtype THEN 1 ELSE 0 END ) > 0 AS is_na
          FROM (
              SELECT ct.relname AS tblname, ct.relnamespace, ic.idxname, ic.attpos, ic.indkey, ic.indkey[ic.attpos], ic.reltuples, ic.relpages, ic.tbloid, ic.idxoid, ic.fillfactor,
                  COALESCE(a1.attnum, a2.attnum) AS attnum, COALESCE(a1.attname, a2.attname) AS attname, COALESCE(a1.atttypid, a2.atttypid) AS atttypid,
                  CASE WHEN a1.attnum IS NULL
                  THEN ic.idxname
                  ELSE ct.relname
                  END AS attrelname
              FROM (
                  SELECT idxname, reltuples, relpages, tbloid, idxoid, fillfactor, indkey,
                      pg_catalog.generate_series(1,indnatts) AS attpos
                  FROM (
                      SELECT ci.relname AS idxname, ci.reltuples, ci.relpages, i.indrelid AS tbloid,
                          i.indexrelid AS idxoid,
                          COALESCE(SUBSTRING(
                              array_to_string(ci.reloptions, ' ')
                              FROM 'fillfactor=([0-9]+)')::SMALLINT, 90) AS fillfactor,
                          i.indnatts,
                          pg_catalog.string_to_array(pg_catalog.textin(
                              pg_catalog.int2vectorout(i.indkey)),' ')::INT[] AS indkey
                      FROM pg_catalog.pg_index i
                      JOIN pg_catalog.pg_class ci ON ci.oid = i.indexrelid
                      WHERE ci.relam=(SELECT oid FROM pg_am WHERE amname = 'btree')
                      AND ci.relpages > 0
                  ) AS idx_data
              ) AS ic
              JOIN pg_catalog.pg_class ct ON ct.oid = ic.tbloid
              LEFT JOIN pg_catalog.pg_attribute a1 ON
                  ic.indkey[ic.attpos] <> 0
                  AND a1.attrelid = ic.tbloid
                  AND a1.attnum = ic.indkey[ic.attpos]
              LEFT JOIN pg_catalog.pg_attribute a2 ON
                  ic.indkey[ic.attpos] = 0
                  AND a2.attrelid = ic.idxoid
                  AND a2.attnum = ic.attpos
            ) i
            JOIN pg_catalog.pg_namespace n ON n.oid = i.relnamespace
            JOIN pg_catalog.pg_stats s ON s.schemaname = n.nspname
                                      AND s.tablename = i.attrelname
                                      AND s.attname = i.attname
            GROUP BY 1,2,3,4,5,6,7,8,9,10,11
      ) AS rows_data_stats
  ) AS rows_hdr_pdg_stats
) AS relation_stats
--ORDER BY nspname, tblname, idxname;
) r
WHERE  NOT is_na
-- schéma : supprimer ceux système (pg_catalog estparfois intéressant)
AND schemaname NOT IN ('information_schema', 'pg_catalog')
-- filtrage de taille - à ajuster
AND bloat_size > 5e6  -- taille perdue minimale
AND bloat_ratio >= 50 -- bloat minimal (en %)
--AND tblname = 'nomdetable'
ORDER BY bloat_size DESC
;