Discussion:
[ADMIN] Interpreting pg_stat_replication values
(too old to reply)
Alexander Fortin
2012-01-16 09:25:37 UTC
Permalink
Hi folks,

We've got to monitor replication status (and lag) of a few 9.1.2
replicas and I'm trying to use values in master's pg_stat_replication to
do so.

The documentation [1] on postgresql.org doesn't explains exactly the
internals, especially what the *_location values means. What I'm doing
right now is to convert replay_location from 64 bit hex to integer and
trace the growing trend, and looking at that I think it's quite
impossible to be a WAL filename because the growing rate is way too fast
(average 100k per second, with spikes of 8G per second).

Anyway, the whole idea is to check "distance" between the actual value
for the master (still have to figure out where is that) and the replicas
replay_location, and alert if that value is too high compared to our
average trends. If you've got better ideas to spare, don't hesitate ;)

Thanks for your time


[1]
http://www.postgresql.org/docs/9.1/static/monitoring-stats.html#MONITORING-STATS-VIEWS
--
Alexander Fortin
http://about.me/alexanderfortin/
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Sergey Konoplev
2012-01-18 08:38:16 UTC
Permalink
Hi,

On Mon, Jan 16, 2012 at 1:25 PM, Alexander Fortin
Anyway, the whole idea is to check "distance" between the actual value for
the master (still have to figure out where is that) and the replicas
replay_location, and alert if that value is too high compared to our average
trends. If you've got better ideas to spare, don't hesitate ;)
With 9.1 you can use

SELECT now() - pg_last_xact_replay_timestamp() AS time_lag;

to measure a time lag on replica, and the following solution to get a
byte lag for each replica on master:

CREATE OR REPLACE FUNCTION hex_to_int(i_hex text, OUT o_dec integer)
RETURNS integer LANGUAGE 'plpgsql' IMMUTABLE STRICT AS $$
BEGIN
EXECUTE 'SELECT x''' || i_hex || '''::integer' INTO o_dec;
RETURN;
END $$;

SELECT
client_addr,
sent_offset - (
replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS byte_lag
FROM (
SELECT
client_addr,
hex_to_int(split_part(sent_location, '/', 1)) AS sent_xlog,
hex_to_int(split_part(replay_location, '/', 1)) AS replay_xlog,
hex_to_int(split_part(sent_location, '/', 2)) AS sent_offset,
hex_to_int(split_part(replay_location, '/', 2)) AS replay_offset
FROM pg_stat_replication
) AS s;
Thanks for your time
[1]
http://www.postgresql.org/docs/9.1/static/monitoring-stats.html#MONITORING-STATS-VIEWS
--
Alexander Fortin
http://about.me/alexanderfortin/
--
http://www.postgresql.org/mailpref/pgsql-admin
--
Sergey Konoplev

Blog: http://gray-hemp.blogspot.com
LinkedIn: http://ru.linkedin.com/in/grayhemp
JID/GTalk: ***@gmail.com Skype: gray-hemp
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Alexander Fortin
2012-01-19 15:04:30 UTC
Permalink
Post by Sergey Konoplev
With 9.1 you can use
SELECT now() - pg_last_xact_replay_timestamp() AS time_lag;
Hi Sergey,
thank you very much for your suggestion. In the meanwhile we went for a
check on pg_stat_replication, but we'll add one as you suggest soon on
the replicas.

By the way, maybe you know where I could find documentation about
possible values from column "state" on view pg_stat_replication? I can
see "streaming" when everything is ok, and once I could get just a
"catchup" while a replica was synching.
--
Alexander Fortin
http://about.me/alexanderfortin/
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Sergey Konoplev
2012-01-20 08:13:05 UTC
Permalink
On Thu, Jan 19, 2012 at 7:04 PM, Alexander Fortin
Post by Alexander Fortin
thank you very much for your suggestion. In the meanwhile we went for a
check on pg_stat_replication, but we'll add one as you suggest soon on the
replicas.
You are welcome.
Post by Alexander Fortin
By the way, maybe you know where I could find documentation about possible
values from column "state" on view pg_stat_replication? I can see
"streaming" when everything is ok, and once I could get just a "catchup"
while a replica was synching.
Here it is

http://doxygen.postgresql.org/walsender__private_8h.html#a177d2f2a29adbee44a22f1e14f78d061
Post by Alexander Fortin
--
Alexander Fortin
http://about.me/alexanderfortin/
--
Sergey Konoplev

Blog: http://gray-hemp.blogspot.com
LinkedIn: http://ru.linkedin.com/in/grayhemp
JID/GTalk: ***@gmail.com Skype: gray-hemp
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Loading...