[ADMIN] pg_basebackup blocking all queries with horrible performance

Post by Lonni J Friedman
Greetings,
I have a 4 server postgresql-9.1.3 cluster (one master doing streaming
replication to 3 hot standby servers). All of them are running
Fedora-16-x86_64.
http://wiki.postgresql.org/wiki/Lock_Monitoring

err, i included that URL but neglected to explain why. On a different
list someone suggested that I verify that there were no locks that
were blocking things, and I did so, and found no locks.

So I'm still at a loss why pg_basebackup is killing perf, and would
appreciate pointers on how to debug it or at least reduce its impact
on performance if that is possible.

thanks

--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Magnus Hagander

2012-06-07 19:40:45 UTC

err, i included that URL but neglected to explain why. On a different
list someone suggested that I verify that there were no locks that
were blocking things, and I did so, and found no locks.
So I'm still at a loss why pg_basebackup is killing perf, and would
appreciate pointers on how to debug it or at least reduce its impact
on performance if that is possible.

My guess would be that you are overloading your I/O system. You should
look at values from iostat and vmstat from when the system works fine
and when you run pg_basebackup, that should give you a hint in the
right direction.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Lonni J Friedman

2012-06-07 20:08:27 UTC

err, i included that URL but neglected to explain why. On a different
list someone suggested that I verify that there were no locks that
were blocking things, and I did so, and found no locks.
So I'm still at a loss why pg_basebackup is killing perf, and would
appreciate pointers on how to debug it or at least reduce its impact
on performance if that is possible.

ok, thanks. i'll take a look at that. If this turns out to be the
issue, is there some way to get pg_basebackup to run more slowly, so
that it has less impact? Or could I do this with ionice on the
pg_basebackup process?


Jerry Sievers

2012-06-08 00:07:24 UTC

err, i included that URL but neglected to explain why. On a different
list someone suggested that I verify that there were no locks that
were blocking things, and I did so, and found no locks.
So I'm still at a loss why pg_basebackup is killing perf, and would
appreciate pointers on how to debug it or at least reduce its impact
on performance if that is possible.

You might try stopping pg_basebackup in place with SIGSTOP and check
whether the problem goes away. Send SIGCONT and the sluggishness
should return.

If verified, then any sort of throttling mechanism should work.

--
Jerry Sievers
Postgres DBA/Development Consulting
e: ***@comcast.net
p: 732.216.7255


Lonni J Friedman

2012-06-08 01:01:46 UTC

Post by Jerry Sievers

err, i included that URL but neglected to explain why. On a different
list someone suggested that I verify that there were no locks that
were blocking things, and I did so, and found no locks.
So I'm still at a loss why pg_basebackup is killing perf, and would
appreciate pointers on how to debug it or at least reduce its impact
on performance if that is possible.

You might try stopping pg_basebackup in place with SIGSTOP and check
if problem goes away. SIGCONT and you should start having
sluggishness again.
If verified, then any sort of throttling mechanism should work.

I'm certain that the problem is triggered only when pg_basebackup is
running. It's very predictable, and goes away as soon as pg_basebackup
finishes running. What do you mean by a throttling mechanism?


Craig Ringer

2012-06-08 06:04:41 UTC

Post by Jerry Sievers
You might try stopping pg_basebackup in place with SIGSTOP and check
if problem goes away. SIGCONT and you should start having
sluggishness again.
If verified, then any sort of throttling mechanism should work.

Sure, it only happens when pg_basebackup is running. But if you *pause*
pg_basebackup, so it's still running but not currently doing work, does
the problem go away? Does it come back when you unpause pg_basebackup?
That's what Jerry was telling you to try.

If the problem goes away when you pause pg_basebackup and comes back
when you unpause it, it's probably a system load problem.

If it doesn't go away, it's more likely to be a locking issue or
something _other_ than simple load.

SIGSTOP ("kill -STOP") pauses a process, and SIGCONT ("kill -CONT")
resumes it, so on Linux you can use these to try and find out. When you
SIGSTOP pg_basebackup then the postgres backend associated with it
should block shortly afterwards as its buffers fill up and it can't send
more data, so the load should come off the server.
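
The pause/resume test described above can be sketched in Python as follows. This is only an illustration, not a tool from this thread: the function name pause_then_resume and its kill/sleep parameters are assumptions made so the sequence can be dry-run without touching a real process.

```python
import os
import signal
import time


def pause_then_resume(pid, pause_seconds, kill=os.kill, sleep=time.sleep):
    """Send SIGSTOP to pid, wait pause_seconds, then send SIGCONT.

    kill and sleep are injectable (hypothetical parameters) so the
    sequence can be exercised without signalling a real process.
    """
    kill(pid, signal.SIGSTOP)      # same effect as: kill -STOP <pid>
    try:
        sleep(pause_seconds)       # watch query latency during this window
    finally:
        kill(pid, signal.SIGCONT)  # same effect as: kill -CONT <pid>
```

Calling pause_then_resume(pid_of_pg_basebackup, 30) while watching query times would show whether the slowdown follows the backup's activity. The try/finally is there so the process is resumed even if the wait is interrupted.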

A "throttling mechanism" refers to anything that limits the rate or
speed of a thing. In this case, what you want to do if your problem is
system overload is to limit the speed at which pg_basebackup does its
work so other things can still get work done. In other words you want to
throttle it. Typical throttling mechanisms include the "ionice" and
"renice" commands to change I/O and CPU priority, respectively.

Note that you may need to change the priority of the *backend* that
pg_basebackup is using, not necessarily the pg_basebackup command
itself. I haven't done enough with Pg's replication to know how that
works, so someone else will have to fill that bit in.
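
As a rough sketch of the throttling idea, the snippet below builds the renice and ionice command lines for a given process ID. The helper names are hypothetical. It assumes the PID of the server-side walsender backend has been looked up separately, for example from the procpid column of pg_stat_replication on 9.1 (renamed to pid in 9.2), and that the I/O scheduler in use honours ionice classes, which is not true of every scheduler.

```python
import subprocess

# Query to find the server-side process (9.1 column name; an assumption
# about how one might look it up, not something prescribed in this thread).
FIND_WALSENDER_SQL = "SELECT procpid, application_name, state FROM pg_stat_replication;"


def throttle_commands(pid, nice_level=19, io_class=3):
    """Return the renice and ionice command lines for one process ID.

    nice_level 19 is the lowest CPU priority; io_class 3 is the
    'idle' I/O scheduling class.
    """
    return [
        ["renice", "-n", str(nice_level), "-p", str(pid)],
        ["ionice", "-c", str(io_class), "-p", str(pid)],
    ]


def apply_throttle(pid):
    """Run both commands; raises CalledProcessError if either fails."""
    for cmd in throttle_commands(pid):
        subprocess.run(cmd, check=True)
```

Lowering the priority of another user's process (such as a backend owned by postgres) normally requires running as that user or as root.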

--
Craig Ringer


Lonni J Friedman

2012-06-08 19:30:48 UTC

Thanks for your reply. I've confirmed that issuing a SIGSTOP does
eliminate the thrashing, and issuing a SIGCONT resumes the thrash.

I've looked at iostat output both before & during pg_basebackup runs,
and I'm not seeing any indication that the problem is due to disk IO
bottlenecks. The numbers don't vary very much at all between the good
& bad times. This is typical when pg_basebackup is running:
########
Device:  rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
md0        0.00    0.00  67.76  68.62   4.42   1.46     88.34      0.00   0.00     0.00     0.00   0.00   0.00
########

and this is when the system is ok:
########
Device:  rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
md0        0.00    0.00  68.04  68.56   4.44   1.46     88.39      0.00   0.00     0.00     0.00   0.00   0.00
########
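
To put a number on "don't vary very much", the two md0 samples above can be compared column by column. The sketch below uses only the figures posted in this message; the helper name relative_changes is made up for illustration.

```python
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rMB/s", "wMB/s",
           "avgrq-sz", "avgqu-sz", "await", "r_await", "w_await",
           "svctm", "%util"]

# Values copied from the two iostat samples for md0 in this message.
DURING_BACKUP = [0.00, 0.00, 67.76, 68.62, 4.42, 1.46,
                 88.34, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]
SYSTEM_OK = [0.00, 0.00, 68.04, 68.56, 4.44, 1.46,
             88.39, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]


def relative_changes(baseline, sample):
    """Map each column name to |sample - baseline| / baseline."""
    changes = {}
    for name, base, value in zip(COLUMNS, baseline, sample):
        if base == 0:
            # Avoid dividing by zero: unchanged zero counts as no change.
            changes[name] = 0.0 if value == 0 else float("inf")
        else:
            changes[name] = abs(value - base) / base
    return changes


changes = relative_changes(SYSTEM_OK, DURING_BACKUP)
largest = max(changes, key=changes.get)
print(f"largest relative change: {largest} ({changes[largest]:.2%})")
```

Every column differs by well under 1%. One possible reason for samples this close, offered as a guess rather than a diagnosis: the first report iostat prints covers the time since boot, so an interval report (for example "iostat -x 5") may be needed to see what happens during the backup itself.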

I looked at vmstat output, but nothing is jumping out at me as being
dramatically different when pg_basebackup is running. swap in and
swap out are zero 100% of the time for the good & bad perf cases. I
can post example output if someone is interested, or if there's
something specific that I should be looking at as a potential problem,
let me know.

thanks


Fujii Masao

2012-06-09 02:29:36 UTC

Did you set synchronous_standby_names to '*'? If so, the problem you
encountered can happen.

When synchronous_standby_names is '*', you cannot control which
standby takes the role of synchronous standby. A standby that you
expect to run as asynchronous might become the synchronous one. So
my guess is that at first one of your three standbys was running as
the synchronous standby, and all queries were executed normally. But
when you started pg_basebackup, it unexpectedly took over the role
of synchronous standby from that standby. Since
pg_basebackup doesn't send information about replication
progress back to the master, all queries (more precisely, transaction
commits) got stuck and kept waiting for a reply from the synchronous
standby.

You can avoid this problem by setting synchronous_standby_names
to the names of your standbys instead of '*'.
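
One way to check for the situation described above is to look at pg_stat_replication on the master and flag any connection that is marked as the sync standby but is not actually streaming WAL. The sketch below works on rows that have already been fetched as dictionaries. The function name and the sample rows are hypothetical; only the column names (application_name, state, sync_state) come from the view itself.

```python
def suspicious_sync_standbys(rows):
    """Return names of connections marked 'sync' that are not streaming."""
    return [
        row["application_name"]
        for row in rows
        if row["sync_state"] == "sync" and row["state"] != "streaming"
    ]


# Hypothetical rows, shaped like the result of:
#   SELECT application_name, state, sync_state FROM pg_stat_replication;
sample_rows = [
    {"application_name": "standby1", "state": "streaming", "sync_state": "potential"},
    {"application_name": "standby2", "state": "streaming", "sync_state": "async"},
    {"application_name": "pg_basebackup", "state": "backup", "sync_state": "sync"},
]

print(suspicious_sync_standbys(sample_rows))
```

An empty result while commits are hanging would point away from this explanation.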

This seems to be a bug. I think we should prevent pg_basebackup from
becoming the synchronous standby. Thoughts?

Regards,

--
Fujii Masao

Tom Lane

2012-06-09 12:51:10 UTC

Post by Fujii Masao
This seems a bug. I think we should prevent pg_basebackup from
becoming synchronous standby. Thought?

Absolutely. If we have replication clients that are not actually
capable of being standbys, there *must* be a way for the master
to know that.

regards, tom lane


Magnus Hagander

2012-06-10 10:43:32 UTC

Post by Fujii Masao
This seems a bug. I think we should prevent pg_basebackup from
becoming synchronous standby. Thought?

Absolutely. If we have replication clients that are not actually
capable of being standbys, there *must* be a way for the master
to know that.

I thought we fixed this already by sending InvalidXLogRecPtr as the flush
location? And that this only applied in 9.2?

Are you saying we picked pg_basebackup *in backup mode* (not log
streaming) as synchronous standby? If so then yes, that is
*definitely* a bug that should be fixed. We should never select a
connection that's not even streaming log as standby!

Fujii Masao

2012-06-10 12:25:38 UTC

Post by Fujii Masao
This seems a bug. I think we should prevent pg_basebackup from
becoming synchronous standby. Thought?

Absolutely. If we have replication clients that are not actually
capable of being standbys, there *must* be a way for the master
to know that.

Yes.

Post by Magnus Hagander
If so then yes, that is
*definitely* a bug that should be fixed. We should never select a
connection that's not even streaming log as standby!

Agreed. The attached patch prevents pg_basebackup from becoming the sync
standby. The patch also fixes another problem: currently only a walsender
that has reached the STREAMING state can become the sync walsender. On the
other hand, the current sync walsender assumes that a walsender with higher
priority will be the sync one regardless of whether its state is STREAMING,
and switches itself to potential sync walsender.
So when a standby with higher priority connects to the master, we
might have no sync standby until it reaches the STREAMING state.
To fix this problem, the patch switches a walsender's state from sync to
potential only *after* the walsender with higher priority has reached the
STREAMING state.
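
The selection rule described above can be modelled in a few lines. This is a simplified illustration, not the PostgreSQL source or the attached patch: the WalSender class and pick_sync_standby function are invented names, and the model assumes that priority 0 means "not a sync candidate" and that a lower positive number means higher priority.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WalSender:
    name: str
    priority: int  # assumed: 0 = not a candidate, lower positive = higher priority
    state: str     # "startup", "backup", "catchup" or "streaming"


def pick_sync_standby(senders: List[WalSender]) -> Optional[WalSender]:
    """Pick the highest-priority candidate that is already streaming.

    A higher-priority walsender that has not reached "streaming" yet is
    ignored, so the current sync standby keeps its role until then.
    """
    candidates = [s for s in senders if s.priority > 0 and s.state == "streaming"]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.priority)
```

Under this rule a connection in the "backup" state is never selected, and there is no window without a sync standby while a higher-priority standby is still catching up.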

We also should not select (1) the background stream process forked from
pg_basebackup or (2) pg_receivexlog as the sync standby, because they
don't send back replication progress. To address this, I'm thinking of
introducing a new "NOSYNC" option in the "START_REPLICATION" command
as follows, and changing (1) and (2) so that they specify NOSYNC.

START_REPLICATION XXX/XXX [NOSYNC]

If the standby specifies the NOSYNC option, it is never assigned as the sync
standby even if its name is in synchronous_standby_names. Thoughts?

BTW, in another thread we are discussing changing pg_receivexlog so that it
sends back replication progress. If that change is applied, we probably
don't need to change pg_receivexlog to use the NOSYNC option.

Regards,

--
Fujii Masao

Fujii Masao

2012-06-10 13:34:06 UTC

A standby that always sends InvalidXLogRecPtr back should not
become the sync standby. So instead of adding a NOSYNC option, we can
avoid a problematic sync standby by checking whether InvalidXLogRecPtr
is sent.

Regards,

--
Fujii Masao

Fujii Masao

2012-06-10 14:08:58 UTC

The standby which always sends InvalidXLogRecPtr back should not
become sync one. So instead of NOSYNC option, by checking whether
InvalidXLogRecPtr is sent, we can avoid problematic sync standby.

We should not do this, because Magnus is proposing a patch
(http://archives.postgresql.org/pgsql-hackers/2012-06/msg00348.php)
that breaks the above assumption entirely. So we should introduce
something like the NOSYNC option.

Regards,

--
Fujii Masao

Magnus Hagander

2012-06-10 14:10:46 UTC

The standby which always sends InvalidXLogRecPtr back should not
become sync one. So instead of NOSYNC option, by checking whether
InvalidXLogRecPtr is sent, we can avoid problematic sync standby.

Wouldn't the better choice there in that case be to give a switch to
pg_receivexlog if you *want* it to be able to become a sync replica,
and by default disallow it? And then keep the backend just treating
InvalidXLogRecPtr as don't-become-sync-replica.

Fujii Masao

2012-06-10 14:29:47 UTC

I don't object to letting pg_receivexlog act as a sync standby at all. So at
least for me, that switch is not necessary. What I'm worried about is the
background stream process forked from pg_basebackup. I think that
it should not run as a sync standby, but having it send back its replication
progress seems helpful because a user can then see the progress in
pg_stat_replication. So I think that something like a NOSYNC option is
required.

Regards,

--
Fujii Masao

Magnus Hagander

2012-06-11 13:19:12 UTC

Post by Fujii Masao
This seems a bug. I think we should prevent pg_basebackup from
becoming synchronous standby. Thought?

Absolutely. If we have replication clients that are not actually
capable of being standbys, there *must* be a way for the master
to know that.

Yes.

Post by Magnus Hagander
If so then yes, that is
*definitely* a bug that should be fixed. We should never select a
connection that's not even streaming log as standby!

This fix needs to be applied independently of the other discussions,
since it affects 9.1 and needs to be backpatched.

So - applied, and backpatched.

The issues wrt the pg_basebackup background process and pg_receivexlog
are only for 9.2...

Lonni J Friedman

2012-06-11 17:37:41 UTC

Did you set synchronous_standby_names to '*'? If so, the problem you
encountered can happen.
When synchronous_standby_names is '*', you cannot control which
standbys take a role of synchronous standby. The standby which you
expect to run as asynchronous one might be synchronous one. So
my guess is that at first one of your three standbys was running as
synchronous standby, and all queries were executed normally. But
when you started pg_basebackup, pg_basebackup unexpectedly
got the role of synchronous standby from another standby. Since
pg_basebackup doesn't send the information about replication
progress back to the master, all queries (more precisely, transaction
commit) got stuck, and kept waiting for the reply from synchronous
standby.
You can avoid this problem by setting synchronous_standby_names
to the names of your standbys instead of '*'.

I don't have synchronous_standby_names set at all. I'm only doing
asynchronous replication.


Fujii Masao

2012-06-12 17:49:11 UTC

Sure, it only happens when pg_basebackup is running. But if you *pause*
pg_basebackup, so it's still running but not currently doing work, does the
problem go away? Does it come back when you unpause pg_basebackup? That's
what Jerry was telling you to try.
If the problem goes away when you pause pg_basebackup and comes back when
you unpause it, it's probably a system load problem.
If it doesn't go away, it's more likely to be a locking issue or something
_other_ than simple load.
SIGSTOP ("kill -STOP") pauses a process, and SIGCONT ("kill -CONT") resumes
it, so on Linux you can use these to try and find out. When you SIGSTOP
pg_basebackup then the postgres backend associated with it should block
shortly afterwards as its buffers fill up and it can't send more data, so
the load should come off the server.
A "throttling mechanism" refers to anything that limits the rate or speed of
a thing. In this case, what you want to do if your problem is system
overload is to limit the speed at which pg_basebackup does its work so other
things can still get work done. In other words you want to throttle it.
Typical throttling mechanisms include the "ionice" and "renice" commands to
change I/O and CPU priority, respectively.
Note that you may need to change the priority of the *backend* that
pg_basebackup is using, not necessarily the pg_basebackup command its self.
I haven't done enough with Pg's replication to know how that works, so
someone else will have to fill that bit in.

Thanks for your reply. I've confirmed that issuing a SIGSTOP does
eliminate the thrashing, and issuing a SIGCONT resumes the thrash.
I've looked at iostat output both before & during pg_basebackup runs,
and I'm not seeing any indication that the problem is due to disk IO
bottlenecks. The numbers don't vary very much at all between the good
########
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
md0
0.00 0.00 67.76 68.62 4.42 1.46
88.34 0.00 0.00 0.00 0.00 0.00 0.00
########
########
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
md0
0.00 0.00 68.04 68.56 4.44 1.46
88.39 0.00 0.00 0.00 0.00 0.00 0.00
########
I looked at vmstat output, but nothing is jumping out at me as being
dramatically different when pg_basebackup is running. swap in and
swap out are zero 100% of the time for the good & bad perf cases. I
can post example output if someone is interested, or if there's
something specific that I should be looking at as a potential problem,
let me know.

Did you set synchronous_standby_names to '*'? If so, the problem you
encountered can happen.
When synchronous_standby_names is '*', you cannot control which
standby takes the role of synchronous standby. A standby that you
expect to run asynchronously might become the synchronous one. So
my guess is that at first one of your three standbys was running as
the synchronous standby, and all queries were executed normally. But
when you started pg_basebackup, it unexpectedly took over the role
of synchronous standby from that standby. Since pg_basebackup
doesn't send information about replication progress back to the
master, all queries (more precisely, transaction commits) got stuck
and kept waiting for a reply from the synchronous standby.
You can avoid this problem by setting synchronous_standby_names
to the names of your standbys instead of '*'.
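
As a rough illustration of the three cases in play here, the hypothetical helper below classifies a synchronous_standby_names value. It is only a sketch: the function name and the wording of the messages are made up, and the value would have to be obtained separately, for example with the psql command shown in the comment.

```shell
# Classify a synchronous_standby_names value, e.g. the output of
#   psql -At -U postgres -c "SHOW synchronous_standby_names"
sync_standby_mode() {
    case "$1" in
        '')  echo "async: no synchronous standby is configured" ;;
        '*') echo "wildcard: any connected standby can become synchronous" ;;
        *)   echo "named: only the listed standbys can become synchronous" ;;
    esac
}
```

Only the wildcard case matches the scenario described above; an empty value means commits never wait for a standby at all.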

I don't have synchronous_standby_names set at all. I'm only doing
asynchronous replication.

Hmm... for now I have no idea what happened in your environment.
Could you show me a self-contained test case?

Regards,

--
Fujii Masao

Lonni J Friedman

2012-06-12 18:37:42 UTC

Hmm... for now I have no idea what happened in your environment.
Could you show me a self-contained test case?

I'm running the following, which gets piped over ssh to a remote
server (at gigabit ethernet speed):
pg_basebackup -v -D - -x -Ft -U postgres

One thing that I've discovered is that if I throttle back the speed of
what is getting piped to the remote server, that directly correlates
to the load on the server.
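
For anyone wanting to reproduce the throttling, here is one possible way to do it. This is an assumption, since the post does not say which tool was used: it puts pv with a rate limit between pg_basebackup and ssh. The helper name, the ssh destination and the remote file name are all placeholders, and the function only prints the pipeline rather than running it.

```shell
# Print (not run) a rate-limited version of the backup pipeline.
#   $1 = rate limit for `pv -L`, e.g. 20M for roughly 20 MiB per second
#   $2 = ssh destination (placeholder, e.g. backup@standby4)
#   $3 = remote file to write (placeholder, e.g. /backups/base.tar)
throttled_backup_cmd() {
    rate="$1"; dest="$2"; file="$3"
    if [ -z "$rate" ] || [ -z "$dest" ] || [ -z "$file" ]; then
        echo "usage: throttled_backup_cmd RATE SSH_DEST REMOTE_FILE" >&2
        return 1
    fi
    # pv -q suppresses the progress display, -L caps the transfer rate
    echo "pg_basebackup -v -D - -x -Ft -U postgres | pv -q -L $rate | ssh $dest 'cat > $file'"
}
```

Trying a few different rates and watching the server load at each one would show how closely the two track each other.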


Magnus Hagander

2012-06-12 18:39:23 UTC

I'm running the following, which gets piped over ssh to a remote
server (at gigabit ethernet speed):
pg_basebackup -v -D - -x -Ft -U postgres
One thing that I've discovered is that if I throttle back the speed of
what is getting piped to the remote server, that directly correlates
to the load on the server.

That seems to indicate that you're overloading the I/O system... Or
the CPU, but more likely I/O.

Simon Riggs

2012-06-20 12:02:57 UTC

Post by Lonni J Friedman
I'm running the following, which gets piped over ssh to a remote
server (at gigabit ethernet speed):
pg_basebackup -v -D - -x -Ft -U postgres
One thing that I've discovered is that if I throttle back the speed of
what is getting piped to the remote server, that directly correlates
to the load on the server.

That seems to indicate that you're overloading the I/O system... Or
the CPU, but more likely I/O.

CPU utilisation of ssl connections is bad. If network bandwidth is
good, perhaps running WALSender at full speed with encryption can tank
the server.

An effect related to caching of WAL files? Perhaps we need to mark
them as POSIX_FADV_DONTNEED at some point.

Hard to say without detailed analysis.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Lonni J Friedman

2012-06-20 13:27:17 UTC

Post by Simon Riggs

That seems to indicate that you're overloading the I/O system... Or
the CPU, but more likely I/O.

CPU utilisation of ssl connections is bad. If network bandwidth is
good, perhaps running WALSender at full speed with encryption can tank
the server.

I'm not using SSL.


Welty, Richard

2012-06-20 13:56:38 UTC

Post by Lonni J Friedman
I'm not using SSL.

ummm, ssh encrypts the whole stream too. (It isn't SSL as such, but
the CPU cost of encryption applies all the same.)

richard

Scott Marlowe

2012-06-09 06:53:35 UTC

Post by Lonni J Friedman
I've looked at iostat output both before & during pg_basebackup runs,
and I'm not seeing any indication that the problem is due to disk IO
bottlenecks. The numbers don't vary very much at all between the good
########
Device:  rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
md0        0.00    0.00  67.76  68.62   4.42   1.46     88.34      0.00   0.00     0.00     0.00   0.00   0.00
########
########
Device:  rrqm/s  wrqm/s    r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
md0        0.00    0.00  68.04  68.56   4.44   1.46     88.39      0.00   0.00     0.00     0.00   0.00   0.00
########

Two points. 1: md0 doesn't show things like %util; only the physical
drives will have that output, and that is what you want to see, in
case it's hopping up to 100%. 2: you need to run iostat with an
interval (a number) and look at the reports AFTER the first one,
because the first is the average since the machine was booted.
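
Both points can be combined in a small filter, sketched here under a few assumptions: the helper name busy_devices and the default threshold of 90 are made up, and the input is assumed to be extended iostat output where each report starts with a "Device" header line that includes a %util column.

```shell
# Read `iostat -x` output on stdin and print "device %util" for every
# device line at or above the threshold (default 90). The first report
# is skipped because it is the average since boot.
busy_devices() {
    awk -v limit="${1:-90}" '
        /^Device/ {
            report++                      # each report starts with a header
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        NF == 0 || col == 0 { next }      # blank lines, or no header seen yet
        report >= 2 && NF >= col && $col + 0 >= limit { print $1, $col }
    '
}
```

A typical run would be "iostat -x -m 5 6 | busy_devices 90", taken once while the system is healthy and once while pg_basebackup is running, looking at the physical drives underneath md0.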


Magnus Hagander

2012-06-12 14:27:43 UTC