Subhasis Bhattacharya
2012-11-23 11:32:37 UTC
Hi All,
Recently a postgres database embedded within our product faced a series of
hiccups at a customer site, as follows:
For about 24 hours the postgres log file had logged errors like:
2012-09-24 00:00:12 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:13 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:15 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:16 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:20 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:21 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:22 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:22 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:23 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:24 GMT LOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
....
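(Aside: the rename failures above suggest the statistics collector could not publish fresh stats at all during that window. A rough way to confirm that backends were seeing stale stats, assuming access to psql, would have been to run something like the following a couple of times a few minutes apart and see whether the counters move at all while the database is clearly busy:

-- If these counters stay frozen under load, backends are reading a stale
-- pgstat.stat file and autovacuum is working from the same stale picture.
SELECT now() AS checked_at,
       xact_commit,
       xact_rollback,
       blks_read,
       blks_hit
FROM pg_stat_database
WHERE datname = current_database();
)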
It's an extremely volatile database, with rows deleted very often.
After about 24 hours we saw this:
2012-10-01 00:19:21 GMT WARNING: relation "pg_toast.pg_toast_16509"
contains more than "max_fsm_pages" pages with useful free space
2012-10-01 00:19:21 GMT HINT: Consider using VACUUM FULL on this relation
or increasing the configuration parameter "max_fsm_pages".
2012-10-01 00:57:12 GMT WARNING: relation "pbs.job_attr" contains more than
"max_fsm_pages" pages with useful free space
2012-10-01 00:57:12 GMT HINT: Consider using VACUUM FULL on this relation
or increasing the configuration parameter "max_fsm_pages".
2012-10-01 01:09:30 GMT WARNING: relation "pg_toast.pg_toast_16509"
contains more than "max_fsm_pages" pages with useful free space
2012-10-01 01:09:30 GMT HINT: Consider using VACUUM FULL on this relation
or increasing the configuration parameter "max_fsm_pages".
2012-10-01 01:18:45 GMT ERROR: canceling autovacuum task
So it looks like autovacuum encountered a situation it could not handle and
was asking for a manual VACUUM FULL. Queries from then on ran very slowly,
since autovacuum kept starting up and cancelling itself.
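For what it's worth, the remediation the HINT points at would look roughly like the sketch below. The relation names are taken from the log above, and the max_fsm_pages figure is only an illustrative placeholder, not a sizing recommendation:

-- Find the parent table that owns the complained-about toast table;
-- VACUUM FULL on the parent also processes its toast table.
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.reltoastrelid = 'pg_toast.pg_toast_16509'::regclass;

-- Reclaim space on the heavily churned table (takes an exclusive lock
-- for the duration).
VACUUM FULL VERBOSE pbs.job_attr;

-- And/or raise the free space map ceiling in postgresql.conf
-- (requires a server restart; 400000 is just an example value):
--   max_fsm_pages = 400000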
My questions to the postgres gurus and developers:
- I read that the autovacuum daemon depends on the statistics collector to
do its job properly. Could it be that the collector's failure to update the
pgstat file for 24 hours or more led autovacuum to NOT do its job, creating
a situation that autovacuum could no longer handle? (See the sketch below
these questions.)
- In a normal situation, should we assume that autovacuum will be able to
do its job without ever needing a manual VACUUM FULL?
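To make the first question concrete, something like the sketch below is what I would expect to show whether autovacuum was keeping up, assuming the stats views of 8.3-era and later releases (column names may differ on older versions):

-- Tables with the most dead rows and when they were last vacuumed.
-- If last_autovacuum stays NULL or old while n_dead_tup keeps growing,
-- autovacuum is not doing its job (e.g. because the statistics it relies
-- on were never updated during the I/O-error window).
SELECT schemaname,
       relname,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;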
Thanks and Regards,
Subhasis Bhattacharya