Discussion: [ADMIN] problems with access into system catalogs
Thomas Markus
2012-07-26 07:44:21 UTC
Hi,

I have 2 systems with CentOS 5.5 (2.6.18) x86_64 and postgresql-9.0.6 64bit.

Both systems contain the same content, but one of them is making trouble:
some system tables (e.g. pg_catalog.pg_class or pg_attribute) contain many
dead rows, and even simple queries take a long time on that system. The
other one is fast.

One problem is this query:
psql MYDB -c "explain analyze SELECT column_name FROM
information_schema.columns WHERE table_schema='public' AND
table_name='person'"

fast system:
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=96.39..196.63 rows=1 width=64) (actual time=10.953..12.785 rows=24 loops=1)
   ->  Hash Left Join  (cost=96.39..196.34 rows=1 width=68) (actual time=10.937..12.541 rows=24 loops=1)
         Hash Cond: (t.typbasetype = bt.oid)
         Join Filter: (t.typtype = 'd'::"char")
         ->  Nested Loop  (cost=0.00..99.93 rows=1 width=73) (actual time=0.821..2.359 rows=24 loops=1)
               ->  Nested Loop  (cost=0.00..99.62 rows=1 width=68) (actual time=0.811..2.097 rows=24 loops=1)
                     Join Filter: (pg_has_role(c.relowner, 'USAGE'::text) OR has_column_privilege(c.oid, a.attnum, 'SELECT, INSERT, UPDATE, REFERENCES'::text))
                     ->  Nested Loop  (cost=0.00..85.94 rows=1 width=8) (actual time=0.749..1.904 rows=1 loops=1)
                           Join Filter: (c.relnamespace = nc.oid)
                           ->  Seq Scan on pg_namespace nc  (cost=0.00..4.00 rows=1 width=4) (actual time=0.082..0.502 rows=1 loops=1)
                                 Filter: ((NOT pg_is_other_temp_schema(oid)) AND (((nspname)::information_schema.sql_identifier)::text = 'public'::text))
                           ->  Seq Scan on pg_class c  (cost=0.00..81.91 rows=3 width=12) (actual time=0.658..1.389 rows=1 loops=1)
                                 Filter: ((c.relkind = ANY ('{r,v}'::"char"[])) AND (((c.relname)::information_schema.sql_identifier)::text = 'person'::text))
                     ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  (cost=0.00..13.61 rows=4 width=74) (actual time=0.048..0.103 rows=24 loops=1)
                           Index Cond: ((a.attrelid = c.oid) AND (a.attnum > 0))
                           Filter: (NOT a.attisdropped)
               ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.00..0.29 rows=1 width=13) (actual time=0.004..0.005 rows=1 loops=24)
                     Index Cond: (t.oid = a.atttypid)
         ->  Hash  (cost=77.14..77.14 rows=1540 width=4) (actual time=10.082..10.082 rows=1540 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 55kB
               ->  Hash Join  (cost=4.57..77.14 rows=1540 width=4) (actual time=0.317..7.597 rows=1540 loops=1)
                     Hash Cond: (bt.typnamespace = nbt.oid)
                     ->  Seq Scan on pg_type bt  (cost=0.00..51.40 rows=1540 width=8) (actual time=0.003..2.894 rows=1540 loops=1)
                     ->  Hash  (cost=3.14..3.14 rows=114 width=4) (actual time=0.294..0.294 rows=95 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 4kB
                           ->  Seq Scan on pg_namespace nbt  (cost=0.00..3.14 rows=114 width=4) (actual time=0.004..0.147 rows=95 loops=1)
   ->  Index Scan using pg_namespace_oid_index on pg_namespace nt  (cost=0.00..0.28 rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=24)
         Index Cond: (nt.oid = t.typnamespace)
 Total runtime: 13.130 ms
(29 rows)



But the problematic system:
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=153770.68..475719.06 rows=33 width=64) (actual time=716.941..1993.033 rows=24 loops=1)
   Hash Cond: (t.typbasetype = bt.oid)
   Join Filter: (t.typtype = 'd'::"char")
   ->  Hash Join  (cost=58705.27..374606.83 rows=33 width=69) (actual time=373.927..1649.818 rows=24 loops=1)
         Hash Cond: (t.typnamespace = nt.oid)
         ->  Nested Loop  (cost=58700.55..374601.66 rows=33 width=73) (actual time=373.519..1649.339 rows=24 loops=1)
               ->  Hash Join  (cost=58700.55..374338.49 rows=33 width=68) (actual time=373.496..1649.061 rows=24 loops=1)
                     Hash Cond: (a.attrelid = c.oid)
                     Join Filter: (pg_has_role(c.relowner, 'USAGE'::text) OR has_column_privilege(c.oid, a.attnum, 'SELECT, INSERT, UPDATE, REFERENCES'::text))
                     ->  Seq Scan on pg_attribute a  (cost=0.00..294729.54 rows=5575330 width=74) (actual time=0.011..1279.604 rows=8452 loops=1)
                           Filter: ((NOT attisdropped) AND (attnum > 0))
                     ->  Hash  (cost=58700.34..58700.34 rows=17 width=8) (actual time=358.699..358.699 rows=1 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           ->  Nested Loop  (cost=0.00..58700.34 rows=17 width=8) (actual time=0.786..358.693 rows=1 loops=1)
                                 Join Filter: (c.relnamespace = nc.oid)
                                 ->  Seq Scan on pg_namespace nc  (cost=0.00..4.12 rows=1 width=4) (actual time=0.184..0.523 rows=1 loops=1)
                                       Filter: ((NOT pg_is_other_temp_schema(oid)) AND (((nspname)::information_schema.sql_identifier)::text = 'public'::text))
                                 ->  Seq Scan on pg_class c  (cost=0.00..58671.00 rows=2018 width=12) (actual time=0.590..358.151 rows=1 loops=1)
                                       Filter: ((c.relkind = ANY ('{r,v}'::"char"[])) AND (((c.relname)::information_schema.sql_identifier)::text = 'person'::text))
               ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.00..7.96 rows=1 width=13) (actual time=0.005..0.006 rows=1 loops=24)
                     Index Cond: (t.oid = a.atttypid)
         ->  Hash  (cost=3.21..3.21 rows=121 width=4) (actual time=0.364..0.364 rows=121 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 5kB
               ->  Seq Scan on pg_namespace nt  (cost=0.00..3.21 rows=121 width=4) (actual time=0.009..0.189 rows=121 loops=1)
   ->  Hash  (cost=69682.20..69682.20 rows=1547136 width=4) (actual time=341.761..341.761 rows=1569 loops=1)
         Buckets: 131072  Batches: 2  Memory Usage: 29kB
         ->  Hash Join  (cost=4.72..69682.20 rows=1547136 width=4) (actual time=0.370..339.147 rows=1569 loops=1)
               Hash Cond: (bt.typnamespace = nbt.oid)
               ->  Seq Scan on pg_type bt  (cost=0.00..48404.36 rows=1547136 width=8) (actual time=0.005..334.474 rows=1569 loops=1)
               ->  Hash  (cost=3.21..3.21 rows=121 width=4) (actual time=0.342..0.342 rows=121 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 5kB
                     ->  Seq Scan on pg_namespace nbt  (cost=0.00..3.21 rows=121 width=4) (actual time=0.004..0.162 rows=121 loops=1)
 Total runtime: 1993.769 ms
(33 rows)



I tried CLUSTER, REINDEX, and VACUUM (also FULL) on pg_class, without success:
psql MYDB -c "vacuum verbose pg_catalog.pg_class"

fast system:
INFO:  "pg_class": found 1 removable, 2106 nonremovable row versions in 45 out of 45 pages
DETAIL:  0 dead row versions cannot be removed yet.

other:
INFO:  "pg_class": found 0 removable, 1547667 nonremovable row versions in 31587 out of 31587 pages
DETAIL:  1545530 dead row versions cannot be removed yet.

a "select count(*) from pg_catalog.pg_class" returns only 2137

How can I remove the dead rows and speed up access to the system catalogs?


Thomas
Craig Ringer
2012-07-26 08:25:22 UTC
First, thank you for an excellent, complete question with versions,
EXPLAIN ANALYZE output, and exact messages.

My reply is interleaved below.
Post by Thomas Markus
I have 2 systems with CentOS 5.5 (2.6.18) x86_64 and postgresql-9.0.6 64bit.
Both systems contain the same content, but one of them is making trouble:
some system tables (e.g. pg_catalog.pg_class or pg_attribute) contain many
dead rows, and even simple queries take a long time on that system. The
other one is fast.
[snip]
Post by Thomas Markus
INFO:  "pg_class": found 0 removable, 1547667 nonremovable row versions in 31587 out of 31587 pages
DETAIL:  1545530 dead row versions cannot be removed yet.
A "select count(*) from pg_catalog.pg_class" returns only 2137.
OK, so you have lots of bloat in the system catalogs. A few things to
check for:

- Do you have any uncommitted two-phase transactions? Run:
SELECT * from pg_prepared_xacts ;

- Do you have any long-lived 'IDLE IN TRANSACTION' connections? Try:
SELECT * FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
  AND xact_start < current_timestamp - '1 minute'::interval;

Either of those can prevent vacuum from cleaning things up.
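
For example (just a sketch, nothing specific to your setup), you can list both
by age to see what is holding back the cleanup horizon:

  -- oldest prepared transactions first
  SELECT gid, prepared, owner, database,
         current_timestamp - prepared AS age
  FROM pg_prepared_xacts
  ORDER BY prepared;

  -- oldest open transactions first (9.0 column names)
  SELECT procpid, datname, usename, xact_start,
         current_timestamp - xact_start AS age
  FROM pg_stat_activity
  WHERE xact_start IS NOT NULL
  ORDER BY xact_start
  LIMIT 10;

Anything that is hours or days old in either list deserves a close look.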

Do you very frequently create and drop tables, indexes, etc? Say, using
a database unit testing framework?

--
Craig Ringer
Thomas Markus
2012-07-26 08:39:55 UTC
Hi,

see below
Post by Craig Ringer
First, thank you for an excellent, complete question with versions,
EXPLAIN ANALYZE output, and exact messages.
;)
Post by Craig Ringer
SELECT * from pg_prepared_xacts ;
Hm, yes. I stopped all applications this morning, but this query shows:
 transaction |                                             gid                                              |           prepared            | owner | database
-------------+----------------------------------------------------------------------------------------------+-------------------------------+-------+----------
    49737548 | 131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTdm | 2012-01-05 07:49:30.78583+01  | xxx   | db1
    49737549 | 131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTg0 | 2012-01-05 07:49:30.789382+01 | xxx   | db2

The system time is valid (Thu Jul 26 10:38:12 CEST 2012), so these prepared
transactions from January are really old.
Should I restart the instance?
Post by Craig Ringer
SELECT * FROM pg_stat_activity WHERE current_query = '<IDLE> in transaction' AND xact_start < current_timestamp - '1 minute'::interval;
None, and all clients were disconnected this morning.
Post by Craig Ringer
Either of those can prevent vacuum from cleaning things up.
Do you very frequently create and drop tables, indexes, etc? Say,
using a database unit testing framework?
No, it's a live system with normal OLAP access.



Thomas
Craig Ringer
2012-07-26 09:30:33 UTC
Post by Thomas Markus
Hi,
see below
Post by Craig Ringer
SELECT * from pg_prepared_xacts ;
 transaction |                                             gid                                              |           prepared            | owner | database
-------------+----------------------------------------------------------------------------------------------+-------------------------------+-------+----------
    49737548 | 131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTdm | 2012-01-05 07:49:30.78583+01  | xxx   | db1
    49737549 | 131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTg0 | 2012-01-05 07:49:30.789382+01 | xxx   | db2
The system time is valid (Thu Jul 26 10:38:12 CEST 2012), so these prepared
transactions from January are really old.
Should I restart the instance?
Nope, and it wouldn't help anyway. Prepared but uncommitted two-phase
transactions are a permanent, persistent part of the database. They
only go away when a COMMIT PREPARED or ROLLBACK PREPARED is issued. See:

http://www.postgresql.org/docs/9.1/static/sql-prepare-transaction.html

I cannot advise you on what to do without knowing what created those
transactions and why.
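
If they do turn out to be safe to discard, the fix is roughly this shape
(just a sketch; '<gid>' stands for the exact gid string shown by
pg_prepared_xacts, and each command must be run while connected to the
database the transaction was prepared in):

  -- discards the prepared transaction's work:
  ROLLBACK PREPARED '<gid>';
  -- or, if its work should be kept instead:
  COMMIT PREPARED '<gid>';

Once those old prepared transactions are resolved, VACUUM will be able to
remove the dead catalog rows again.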
Post by Thomas Markus
Post by Craig Ringer
Do you very frequently create and drop tables, indexes, etc? Say,
using a database unit testing framework?
No, it's a live system with normal OLAP access.
Weird, then I don't know how the catalogs would get so fat.

I don't think temporary tables create writes to the catalog heap, but I
can't think what else it'd be.

--
Craig Ringer
Thomas Markus
2012-07-26 11:12:15 UTC
Hi Craig,

Thanks, that was the tip.

A ROLLBACK PREPARED on these does the job.
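
Presumably something along these lines, shown only as a sketch using the gids
from the output above (each run while connected to the matching database):

  -- in db1
  ROLLBACK PREPARED '131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTdm';
  -- in db2
  ROLLBACK PREPARED '131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTg0';

followed by a VACUUM FULL on the bloated catalogs to give the space back.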

best regards
Thomas
Tom Lane
2012-07-26 14:45:51 UTC
Post by Thomas Markus
A ROLLBACK PREPARED on these does the job.
If you aren't intentionally using prepared transactions, it's a good
idea to disable them by setting max_prepared_transactions to zero
(DB restart required). That prevents you from accidentally shooting
yourself in the foot like this.
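
A minimal sketch of that change (edit postgresql.conf, restart the server,
then verify):

  # postgresql.conf
  max_prepared_transactions = 0

  -- after the restart, from psql:
  SHOW max_prepared_transactions;

With it set to zero, PREPARE TRANSACTION simply fails with an error instead
of leaving orphaned transactions around.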

regards, tom lane