[ADMIN] Hot standby having high requested checkpoints?

Discussion:

(too old to reply)

Brian Fehrle

2012-02-28 22:53:20 UTC

Hi all,

I have two PostgreSQL 9.0.5 clusters, one is a master and the other is a
hot standby via streaming replication. I'm monitoring some stats on each
and I'm noticing something very odd. On the master, I get between 2 and
4 requested checkpoints per hour, but on the hot standby I'm seeing
between 200 and 300 requested checkpoints per hour.

Both boxes are the same specs, and have the same following config
parameters:
name | setting
------------------------------+---------
bgwriter_delay | 200
bgwriter_lru_maxpages | 350
bgwriter_lru_multiplier | 2
checkpoint_completion_target | 0.5
checkpoint_segments | 256
checkpoint_timeout | 1800

Is there any reason why I would be seeing this type of behavior on the
hot standby? Is it the standard type of behavior on a hot standby that
I'm only now noticing?

Thanks in advance,
- Brian F

Fujii Masao

2012-02-29 04:22:33 UTC

Permalink

On Wed, Feb 29, 2012 at 7:53 AM, Brian Fehrle

Post by Brian Fehrle
Hi all,
I have two PostgreSQL 9.0.5 clusters, one is a master and the other is a hot
standby via streaming replication. I'm monitoring some stats on each and I'm
noticing something very odd. On the master, I get between 2 and 4 requested
checkpoints per hour, but on the hot standby I'm seeing between 200 and 300
requested checkpoints per hour.

How did you get that information? Enable log_checkpoints?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Brian Fehrle

2012-02-29 15:14:45 UTC

Permalink

I am taking periodic snapshots from the system view 'pg_stat_bgwriter'
from both the master and the hot standby.

- Brian F

Post by Fujii Masao
On Wed, Feb 29, 2012 at 7:53 AM, Brian Fehrle

How did you get that information? Enable log_checkpoints?
Regards,

--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Fujii Masao

2012-03-01 02:17:59 UTC

Permalink

On Thu, Mar 1, 2012 at 12:14 AM, Brian Fehrle

I am taking periodic snapshots from the system view 'pg_stat_bgwriter' from
both the master and the hot standby.

You mean that you observed that pg_stat_bgwriter.checkpoints_timed in
the standby was larger than that in the master? I ran pgbench on streaming
replication and checked pg_stat_bgwriter, but I could not reproduce the
problem you encountered. pg_stat_bgwriter.checkpoints_timed in the
standby was equal to or one more than that in the master. Could you tell me
the self-contained test case?

Regards,

Simon Riggs

2012-03-01 08:35:01 UTC

Permalink

On Tue, Feb 28, 2012 at 10:53 PM, Brian Fehrle

Post by Brian Fehrle
I have two PostgreSQL 9.0.5 clusters, one is a master and the other is a hot
standby via streaming replication. I'm monitoring some stats on each and I'm
noticing something very odd. On the master, I get between 2 and 4 requested
checkpoints per hour, but on the hot standby I'm seeing between 200 and 300
requested checkpoints per hour.

We don't do a restartpoint on the standby unless we see a checkpoint
record, so that result should be impossible.

So I'm guessing you're reading the stats wrong?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Brian Fehrle

2012-03-01 23:38:16 UTC

Permalink

I just now ran the following query a few times after each other on the
hot standby:

select now(), * from pg_stat_bgwriter;

Here are the results:

now | checkpoints_timed | checkpoints_req |
buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend
| buffers_alloc
-------------------------------+-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
2012-03-01 23:24:49.099194+00 | 11546 | 145300
| 1000409459 | 38483026 | 170724 |
5058186 | 438703950
2012-03-01 23:24:52.139176+00 | 11546 | 145300
| 1000409459 | 38483027 | 170724 |
5058191 | 438703975
2012-03-01 23:24:59.129171+00 | 11546 | 145302
| 1000409459 | 38483081 | 170724 |
5058214 | 438704438
2012-03-01 23:25:05.957532+00 | 11546 | 145304
| 1000409459 | 38483118 | 170724 |
5058230 | 438704689
2012-03-01 23:25:09.519175+00 | 11546 | 145305
| 1000409459 | 38483160 | 170724 |
5058239 | 438704968
2012-03-01 23:25:37.019194+00 | 11546 | 145308
| 1000409459 | 38483259 | 170724 |
5058255 | 438705566
2012-03-01 23:25:40.659164+00 | 11546 | 145308
| 1000409459 | 38483268 | 170724 |
5058257 | 438705639
2012-03-01 23:25:47.239281+00 | 11546 | 145309
| 1000409459 | 38483283 | 170724 |
5058266 | 438705815
2012-03-01 23:26:23.858716+00 | 11546 | 145312
| 1000409459 | 38483393 | 170724 |
5058307 | 438706561
2012-03-01 23:26:46.467493+00 | 11546 | 145317
| 1000409670 | 38483524 | 170724 |
5058354 | 438707619

in two minutes, I saw 17 checkpoints_req, and the number of
buffers_checkpoint didn't budge till the last few and even then not much.

checkpoint_segments = 256

- Brian F

Post by Simon Riggs
On Tue, Feb 28, 2012 at 10:53 PM, Brian Fehrle

We don't do a restartpoint on the standby unless we see a checkpoint
record, so that result should be impossible.
So I'm guessing you're reading the stats wrong?

--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Brian Fehrle

2012-03-08 22:16:01 UTC

Permalink

Anyone have any thoughts on why this may be happening?

thanks,
- Brian F

Post by Brian Fehrle
I just now ran the following query a few times after each other on the
select now(), * from pg_stat_bgwriter;
now | checkpoints_timed | checkpoints_req |
buffers_checkpoint | buffers_clean | maxwritten_clean |
buffers_backend | buffers_alloc
-------------------------------+-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
2012-03-01 23:24:49.099194+00 | 11546 | 145300
| 1000409459 | 38483026 | 170724 |
5058186 | 438703950
2012-03-01 23:24:52.139176+00 | 11546 | 145300
| 1000409459 | 38483027 | 170724 |
5058191 | 438703975
2012-03-01 23:24:59.129171+00 | 11546 | 145302
| 1000409459 | 38483081 | 170724 |
5058214 | 438704438
2012-03-01 23:25:05.957532+00 | 11546 | 145304
| 1000409459 | 38483118 | 170724 |
5058230 | 438704689
2012-03-01 23:25:09.519175+00 | 11546 | 145305
| 1000409459 | 38483160 | 170724 |
5058239 | 438704968
2012-03-01 23:25:37.019194+00 | 11546 | 145308
| 1000409459 | 38483259 | 170724 |
5058255 | 438705566
2012-03-01 23:25:40.659164+00 | 11546 | 145308
| 1000409459 | 38483268 | 170724 |
5058257 | 438705639
2012-03-01 23:25:47.239281+00 | 11546 | 145309
| 1000409459 | 38483283 | 170724 |
5058266 | 438705815
2012-03-01 23:26:23.858716+00 | 11546 | 145312
| 1000409459 | 38483393 | 170724 |
5058307 | 438706561
2012-03-01 23:26:46.467493+00 | 11546 | 145317
| 1000409670 | 38483524 | 170724 |
5058354 | 438707619
in two minutes, I saw 17 checkpoints_req, and the number of
buffers_checkpoint didn't budge till the last few and even then not much.
checkpoint_segments = 256
- Brian F

Post by Simon Riggs
On Tue, Feb 28, 2012 at 10:53 PM, Brian Fehrle

We don't do a restartpoint on the standby unless we see a checkpoint
record, so that result should be impossible.
So I'm guessing you're reading the stats wrong?

--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin