Discussion:
[ADMIN] Establishing remote connections is slow
Mindaugas Žakšauskas
2012-01-17 12:44:26 UTC
Hi,

I have a very weird problem with establishing remote connections
to a PostgreSQL server, and hopefully someone can give me some hints on
how I can debug it.

The essence is that establishing a remote connection takes anywhere from
10 to 30 seconds. Once connected, the queries are fast - it's just
establishing a new connection that takes ages. The problem does not
affect local connections: running the psql command
from the local machine takes no time to connect, and the same applies if a
client connects to PostgreSQL via an ssh tunnel.

Immediately after restarting the PostgreSQL daemon, the problem
temporarily goes away, but it resurfaces later.

Things we have tried:
- doing all sorts of DNS queries against the connecting client IP --
seems to be fine, DNS resolution takes no time;

- enabling debug for HA
(http://docs.oracle.com/cd/E19680-01/html/821-1534/fumuy.html#scrolltoc)
-- debugging showed no problems. We were probing for the problem
described in
http://blogs.oracle.com/js/entry/the_nscd_does_not_cache

- asking PostgreSQL to listen not only on the multipath IP (used for
failover), but also on an ethernet interface. This is the most
interesting result: even when a remote connection via the multipath IP
is slow to establish, remote connections via the ethernet interface are
still snappy.

At this point it is reasonable to think that the problem lies
somewhere in the networking (multipath IP), and that may well be
true. But we tried running a simple netcat server and client, and
connections were instant via both interfaces (multipath and eth).

Can anyone suggest how to debug this further? Many thanks in advance.

Environment:
- Solaris 5.10 / Intel
- Sun cluster
- HA for PostgreSQL
(http://docs.oracle.com/cd/E19680-01/html/821-1534/cacjgdbc.html#scrolltoc)
- PostgreSQL server version: 9.0.4

Regards,
Mindaugas
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Tom Lane
2012-01-17 15:09:42 UTC
Post by Mindaugas Žakšauskas
I have a very weird problem with establishing remote connections
to a PostgreSQL server, and hopefully someone can give me some hints on
how I can debug it.
The essence is that establishing a remote connection takes anywhere from
10 to 30 seconds. Once connected, the queries are fast - it's just
establishing a new connection that takes ages.
Perhaps the problem is related to authentication - what auth mode
are you using, and can you experiment with some other ones?

What I'd do to start debugging this is to get out a packet sniffer
(wireshark or some such) and just observe the timings of packets sent
and received by Postgres. This would at least give you a hint which
step is the bottleneck.
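
Where a packet sniffer is not available, a rough substitute is to time the steps from the client side. The sketch below is in Python (not something used in this thread) and relies on one protocol fact: a PostgreSQL server answers an 8-byte SSLRequest packet with a single byte, 'S' or 'N'. The function name time_connection_phases and the keys it returns are made up for illustration. It separates the TCP handshake from the wait for the server's first reply, which is only sent once the server has accepted the connection and started handling it.

```python
import socket
import struct
import time

# Request code defined by the PostgreSQL frontend/backend protocol for SSLRequest.
SSL_REQUEST_CODE = 80877103


def time_connection_phases(host, port=5432, timeout=60.0):
    """Time the TCP handshake and the wait for the server's first reply.

    Hypothetical helper: returns a dict with the two durations in seconds
    and the single reply byte (b"S" or b"N" from a PostgreSQL server).
    """
    t0 = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    t1 = time.monotonic()
    try:
        # SSLRequest: Int32 message length (8) followed by Int32 request code.
        sock.sendall(struct.pack("!II", 8, SSL_REQUEST_CODE))
        reply = sock.recv(1)
        t2 = time.monotonic()
    finally:
        sock.close()
    return {"tcp_connect": t1 - t0, "first_reply": t2 - t1, "reply": reply}
```

Calling time_connection_phases("127.0.0.1") and then the same function with the slow address would show whether the delay sits in the handshake or in the server's first reply.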
Post by Mindaugas Žakšauskas
The problem does not
affect local connections: running the psql command
from the local machine takes no time to connect, and the same applies if a
client connects to PostgreSQL via an ssh tunnel.
What about "psql -h localhost", ie physically local connection but
via TCP not unix socket?

regards, tom lane
Mindaugas Žakšauskas
2012-01-17 15:31:23 UTC
Hi Tom,

Thanks for your reply.
Post by Tom Lane
Perhaps the problem is related to authentication - what auth mode
are you using, and can you experiment with some other ones?
Excerpt from my pg_hba.conf

------------
local all all trust
host all all IP1/mask1 md5
host all all IP2/mask2 md5
------------

The IP/mask combinations correspond to the IP/subnet the client is
connecting from.

Can you elaborate a bit on "experimenting"? I am not quite
sure what changes could possibly make any difference. Also, given
that connecting remotely via the standard ethernet IP address (rather
than multipath) works fine, and that the multipath address also works
fine shortly after a PostgreSQL restart, I can't see how this could be
relevant.
Post by Tom Lane
What I'd do to start debugging this is to get out a packet sniffer
(wireshark or some such) and just observe the timings of packets sent
and received by Postgres.  This would at least give you a hint which
step is the bottleneck.
I have done some truss (strace alternative for Solaris) debugging and
it looks like the client just waits for the server side to respond. I can
probably dig out exactly where and when it is waiting, but since I know
very little about PostgreSQL internals, that won't help me much.

Wireshark is probably not an option as this all happens on a live
server which is connected directly to a switch. I might check whether
tcpdump is available, but the chances are slim.
Post by Tom Lane
What about "psql -h localhost", ie physically local connection but
via TCP not unix socket?
***@dbserver> psql -h 127.0.0.1 -p5432 -U user -W db

This works fast. But

***@dbserver> psql -h <IP> -p5432 -U user -W db

(where <IP> is the multipath interface)

This is slow! So it is definitely something network-related or
something in how PostgreSQL deals with the multipath interface.
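
To make this comparison repeatable, the two psql invocations can be wrapped in a small timing script. This is only a sketch in Python: the helper names time_command and psql_argv are invented here, the user and database names are placeholders, and it assumes the password comes from ~/.pgpass or PGPASSWORD, because -w tells psql never to prompt.

```python
import subprocess
import time


def time_command(argv, timeout=120):
    """Run a command, discard its output, return (exit code, seconds taken)."""
    start = time.monotonic()
    proc = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return proc.returncode, time.monotonic() - start


def psql_argv(host, user, db, port=5432):
    """Build a psql command that connects, runs a trivial query and exits.

    Assumption: the password is supplied via ~/.pgpass or PGPASSWORD,
    since -w disables the interactive prompt.
    """
    return ["psql", "-h", host, "-p", str(port), "-U", user,
            "-d", db, "-w", "-c", "SELECT 1"]
```

Running time_command(psql_argv("127.0.0.1", "user", "db")) and then the same call with the multipath address, a few times each, gives numbers that can be compared before and after a PostgreSQL restart.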

Regards,
Mindaugas
Tom Lane
2012-01-17 15:41:12 UTC
Post by Mindaugas Žakšauskas
Post by Tom Lane
What about "psql -h localhost", ie physically local connection but
via TCP not unix socket?
This works fast. But
(where <IP> is the multipath interface)
This is slow! So it is definitely something network-related or
something in how PostgreSQL deals with the multipath interface.
Hm. AFAIR postgres doesn't know anything particular about multipath
interfaces --- it just listens where you tell it to. So I'm thinking
this is a system-level issue. It still seems like it could be DNS
lookup related though. Do you have log_hostname turned on, and if so
does turning it off make a difference?

regards, tom lane
Mindaugas Žakšauskas
2012-01-17 15:53:14 UTC
Post by Tom Lane
Hm. AFAIR postgres doesn't know anything particular about multipath
interfaces --- it just listens where you tell it to.
I was thinking the same, but PostgreSQL is the "first line to contact"
and I somehow need to obtain proof that this is indeed a
system-level issue. My simple netcat experiments seem to suggest the
opposite.
Post by Tom Lane
So I'm thinking this is a system-level issue. It still seems like it could be DNS
lookup related though. Do you have log_hostname turned on, and if so
does turning it off make a difference?
log_hostname is turned off.

Thanks for your help!

Regards,
Mindaugas
Kevin Grittner
2012-01-17 16:46:23 UTC
Post by Mindaugas Žakšauskas
The essence is that establishing a remote connection takes anywhere
from 10 to 30 seconds. Once connected, the queries are fast
The only time I've seen something similar, there was no reverse DNS
entry to go from IP address to host name. Adding that corrected the
issue. I would try that.

If that fixes it, the question would be whether PostgreSQL is doing
an unnecessary reverse DNS lookup.
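
One way to check whether a reverse entry exists and how long the lookup takes, as seen through the system resolver, is a small Python sketch like the one below. The function name time_reverse_lookup is made up for illustration; it would be run on the database server with the client's IP address, and a result of None means the resolver found no name for that address.

```python
import socket
import time


def time_reverse_lookup(ip):
    """Return (hostname or None, seconds taken) for a reverse lookup of ip."""
    start = time.monotonic()
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses; they are
        # raised when there is no reverse entry or the resolver fails.
        hostname = None
    return hostname, time.monotonic() - start
```

A lookup that takes many seconds before returning None would point at a resolver timeout rather than at PostgreSQL itself.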

-Kevin
Tom Lane
2012-01-17 17:18:37 UTC
Post by Kevin Grittner
Post by Mindaugas Žakšauskas
The essence is that establishing a remote connection takes anywhere
from 10 to 30 seconds. Once connected, the queries are fast
The only time I've seen something similar, there was no reverse DNS
entry to go from IP address to host name. Adding that corrected the
issue. I would try that.
If that fixes it, the question would be whether PostgreSQL is doing
an unnecessary reverse DNS lookup.
Having log_hostname off is supposed to prevent us from attempting a
reverse DNS lookup ... but it would be worth checking into whether one
is happening anyway. (I would think though that such activity would
be visible in strace/truss output. Perhaps you should turn log_hostname
*on* and verify that you see the lookup activity in strace that wasn't
there before.)

regards, tom lane
Kevin Grittner
2012-01-17 17:36:40 UTC
Post by Tom Lane
Post by Kevin Grittner
Post by Mindaugas Žakšauskas
The essence is that establishing a remote connection takes
anywhere from 10 to 30 seconds. Once connected, the queries are
fast
The only time I've seen something similar, there was no reverse
DNS entry to go from IP address to host name. Adding that
corrected the issue. I would try that.
If that fixes it, the question would be whether PostgreSQL is
doing an unnecessary reverse DNS lookup.
Having log_hostname off is supposed to prevent us from attempting
a reverse DNS lookup ... but it would be worth checking into
whether one is happening anyway. (I would think though that such
activity would be visible in strace/truss output. Perhaps you
should turn log_hostname *on* and verify that you see the lookup
activity in strace that wasn't there before.)
Actually, where I've seen this sort of problem, it was the client
code which was doing the unnecessary reverse DNS lookup. What
controls this in psql?

-Kevin
Tom Lane
2012-01-17 19:23:48 UTC
Post by Kevin Grittner
Actually, where I've seen this sort of problem, it was the client
code which was doing the unnecessary reverse DNS lookup. What
controls this in psql?
psql? AFAIR psql itself doesn't do any such thing.

It's possible that certain libraries such as SSL or Kerberos might
do an RDNS lookup internally, though. The OP showed he was using
md5 (password) authentication, so we can discount authentication
libraries, but I wonder whether openssl ever does DNS lookups,
and if so how to control that. Mindaugas, are you using SSL,
and if so can you turn it off and see whether things change?
(It should be safe to do so at least on the "localhost" connection,
even if you feel your network is insecure.)

regards, tom lane
Mindaugas Žakšauskas
2012-01-17 20:29:18 UTC
Post by Tom Lane
<..> Mindaugas, are you using SSL,
and if so can you turn it off and see whether things change?
(It should be safe to do so at least on the "localhost" connection,
even if you feel your network is insecure.)
No, I am not using SSL; it is either disabled or the default setting
is off anyway. This was one of the first things I checked.
Moreover, SSL would make it hard to explain why it takes
no time to establish connections immediately after a PostgreSQL restart
and why it degrades later.

To respond to previous emails - we tried doing DNS lookups against the
client host and they took no time.

Thanks for your ideas.

Regards,
Mindaugas
Craig James
2012-01-17 21:04:46 UTC
Post by Mindaugas Žakšauskas
<..> Mindaugas, are you using SSL,
and if so can you turn it off and see whether things change?
(It should be safe to do so at least on the "localhost" connection,
even if you feel your network is insecure.)
No, I am not using SSL; it is either disabled or the default setting
is off anyway. This was one of the first things I checked.
Moreover, SSL would make it hard to explain why it takes
no time to establish connections immediately after a PostgreSQL restart
and why it degrades later.
To respond to previous emails - we tried doing DNS lookups against the
client host and they took no time.
Try putting the hostnames and IP addresses in /etc/hosts ... first on the
server (for the client) and then on the client (for the server).

Craig