We host at Selectel, with a private network set up between the servers. We started hitting the 1 Gbit/s limit, so we had to order a custom server with a 10 Gbit/s network card. The application runs on the new server just as before and is not slow, but there are problems with the metrics that are sent to prometheus and datadog.
The metrics sent to datadog do not arrive at all and all the graphs are empty. I compared a few queries on the new server and the old one, and a problem showed up.
Old server:
postgres=# explain analyze SELECT psd.datname, numbackends, xact_commit, xact_rollback, blks_read, blks_hit, tup_returned, tup_fetched, tup_inserted, tup_updated, tup_deleted, 2 ^ 31 - age(datfrozenxid) as wraparound, deadlocks, temp_bytes, temp_files, pg_database_size(psd.datname) as pg_database_size FROM pg_stat_database psd JOIN pg_database pd ON psd.datname = pd.datname WHERE psd.datname not ilike 'template%%%%' AND psd.datname not ilike 'rdsadmin' AND psd.datname not ilike 'azure_maintenance' AND psd.datname not ilike 'postgres';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=0.24..0.82 rows=6 width=180) (actual time=14.722..16.546 rows=3 loops=1)
Hash Cond: (d.datname = pd.datname)
-> Seq Scan on pg_database d (cost=0.00..0.22 rows=6 width=68) (actual time=0.028..0.053 rows=3 loops=1)
Filter: ((datname !~~* 'template%%%%'::text) AND (datname !~~* 'rdsadmin'::text) AND (datname !~~* 'azure_maintenance'::text) AND (datname !~~* 'postgres'::text))
Rows Removed by Filter: 3
-> Hash (cost=0.16..0.16 rows=6 width=68) (actual time=0.009..0.009 rows=6 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on pg_database pd (cost=0.00..0.16 rows=6 width=68) (actual time=0.003..0.004 rows=6 loops=1)
Planning time: 0.370 ms
Execution time: 16.598 ms
(10 rows)
New server:
postgres=# explain analyze SELECT psd.datname, numbackends, xact_commit, xact_rollback, blks_read, blks_hit, tup_returned, tup_fetched, tup_inserted, tup_updated, tup_deleted, 2 ^ 31 - age(datfrozenxid) as wraparound, deadlocks, temp_bytes, temp_files, pg_database_size(psd.datname) as pg_database_size FROM pg_stat_database psd JOIN pg_database pd ON psd.datname = pd.datname WHERE psd.datname not ilike 'template%%%%' AND psd.datname not ilike 'rdsadmin' AND psd.datname not ilike 'azure_maintenance' AND psd.datname not ilike 'postgres';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=0.24..0.82 rows=6 width=180) (actual time=10151.216..10153.786 rows=3 loops=1)
Hash Cond: (d.datname = pd.datname)
-> Seq Scan on pg_database d (cost=0.00..0.22 rows=6 width=68) (actual time=0.036..0.075 rows=3 loops=1)
Filter: ((datname !~~* 'template%%%%'::text) AND (datname !~~* 'rdsadmin'::text) AND (datname !~~* 'azure_maintenance'::text) AND (datname !~~* 'postgres'::text))
Rows Removed by Filter: 3
-> Hash (cost=0.16..0.16 rows=6 width=68) (actual time=0.017..0.018 rows=6 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on pg_database pd (cost=0.00..0.16 rows=6 width=68) (actual time=0.004..0.006 rows=6 loops=1)
Planning time: 0.489 ms
Execution time: 10153.864 ms
Queries on the new server take far too long: about 10 s versus about 16 ms on the old one, for the same query with an identical plan. Where should I dig, what should I check?
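For what it's worth, the timings in the two plans can be broken down a bit further. Below is a minimal sketch that only re-reads the numbers already pasted above; it does not connect to any database, and the helper names (`node_timings`, `summarize`) are made up for illustration. It relies on the usual meaning of `actual time=A..B` in EXPLAIN ANALYZE output: A is the time until the node returns its first row, B is the time until it has returned all rows.

```python
import re

# Matches the "actual time=startup..total rows=N loops=M" part of a plan line.
ACTUAL = re.compile(r"actual time=(\d+\.\d+)\.\.(\d+\.\d+) rows=(\d+) loops=(\d+)")

# Node lines copied from the plans above (filter and bucket lines omitted).
OLD_PLAN = """\
Hash Join (cost=0.24..0.82 rows=6 width=180) (actual time=14.722..16.546 rows=3 loops=1)
-> Seq Scan on pg_database d (cost=0.00..0.22 rows=6 width=68) (actual time=0.028..0.053 rows=3 loops=1)
-> Hash (cost=0.16..0.16 rows=6 width=68) (actual time=0.009..0.009 rows=6 loops=1)
-> Seq Scan on pg_database pd (cost=0.00..0.16 rows=6 width=68) (actual time=0.003..0.004 rows=6 loops=1)
"""

NEW_PLAN = """\
Hash Join (cost=0.24..0.82 rows=6 width=180) (actual time=10151.216..10153.786 rows=3 loops=1)
-> Seq Scan on pg_database d (cost=0.00..0.22 rows=6 width=68) (actual time=0.036..0.075 rows=3 loops=1)
-> Hash (cost=0.16..0.16 rows=6 width=68) (actual time=0.017..0.018 rows=6 loops=1)
-> Seq Scan on pg_database pd (cost=0.00..0.16 rows=6 width=68) (actual time=0.004..0.006 rows=6 loops=1)
"""


def node_timings(plan_text):
    """Return (node label, startup ms, total ms) for every plan node line."""
    timings = []
    for line in plan_text.splitlines():
        match = ACTUAL.search(line)
        if match:
            label = re.sub(r"^\s*->\s*", "", line.split("(cost=")[0]).strip()
            timings.append((label, float(match.group(1)), float(match.group(2))))
    return timings


def summarize(plan_text):
    """Split the top node's time into 'before first row' and 'after first row'."""
    timings = node_timings(plan_text)
    _, top_startup, top_total = timings[0]
    # Subtracting every lower node (nested ones included) over-subtracts,
    # so the result is only a lower bound on the top node's own time.
    below = sum(total for _, _, total in timings[1:])
    return {
        "startup_ms": top_startup,
        "after_first_row_ms": top_total - top_startup,
        "top_node_self_ms_at_least": top_total - below,
    }


for name, plan in (("old", OLD_PLAN), ("new", NEW_PLAN)):
    print(name, {key: round(value, 3) for key, value in summarize(plan).items()})
```

On both servers the scans and the hash take well under 0.1 ms, so practically all the time is spent in the top-level Hash Join node, which is also where the select-list expressions are evaluated. On the new server almost the entire 10 s passes before the first row is returned, and the remaining rows take only a couple of milliseconds, which looks more like a one-off wait than a per-row cost.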