Jean-Armel Luce
2011-11-12 14:15:56 UTC
Hi all,
I am using Postgres 9.0.3 in production with Slony 2.0.6 for replication.
I am currently testing the streaming replication with 9.0.3. My database
contains 100 GBytes of data (6000 tables).
I have 1 master node and 3 slave nodes. The 1st slave node is in the same
datacenter as the master node. The 2 other slave nodes are in a 2nd distant
datacenter.
Say node 1 is the master node, node 2 is the local slave node, and nodes 3
and 4 are the slave nodes in the distant datacenter.
After installation of the streaming replication, I have checked that all
the files have been copied from the master to the slaves and that files
have the same size and modification date on each node.
When I send update requests, the streaming replication works perfectly; I
am able to replicate many thousands of updates per second.
I have updated, inserted and deleted nearly 70% of the rows in all the
tables (mainly update requests), and replication is working perfectly.
However; when I want to promote node 3 as the master, I need to rsync all
the logs and databases from node 3 to node 4, node 1 and node 2.
Even if I use rsync -a (as suggested in the wiki
http://wiki.postgresql.org/wiki/Streaming_Replication), it takes a long
time (nearly 2 hours for a distant slave).
It looks that all database files do not have the same modification date in
the master node and in the slave nodes, so the rsync copies quite all the
database from the new master to the slaves.
At first, I was suspecting vacuum process for modifying the files not
simultaneously in the master and the slaves. When I check the dates last
autoanalyze and last_autovacuum fields in pg_stat_user_tables, they are
diffent than the mtime of the files; so it looks autovacuum is not
responsible of that.
Please, could you help me to understand why it is so long to rsync the data
from the new master to the other slaves ?
Did I miss anything ?
Any idea ?
Jean-Armel
I am using Postgres 9.0.3 in production with Slony 2.0.6 for replication.
I am currently testing the streaming replication with 9.0.3. My database
contains 100 GBytes of data (6000 tables).
I have 1 master node and 3 slave nodes. The 1st slave node is in the same
datacenter as the master node. The 2 other slave nodes are in a 2nd distant
datacenter.
Say node 1 is the master node, node 2 is the local slave node, and nodes 3
and 4 are the slave nodes in the distant datacenter.
After installation of the streaming replication, I have checked that all
the files have been copied from the master to the slaves and that files
have the same size and modification date on each node.
When I send update requests, the streaming replication works perfectly; I
am able to replicate many thousands of updates per second.
I have updated, inserted and deleted nearly 70% of the rows in all the
tables (mainly update requests), and replication is working perfectly.
However; when I want to promote node 3 as the master, I need to rsync all
the logs and databases from node 3 to node 4, node 1 and node 2.
Even if I use rsync -a (as suggested in the wiki
http://wiki.postgresql.org/wiki/Streaming_Replication), it takes a long
time (nearly 2 hours for a distant slave).
It looks that all database files do not have the same modification date in
the master node and in the slave nodes, so the rsync copies quite all the
database from the new master to the slaves.
At first, I was suspecting vacuum process for modifying the files not
simultaneously in the master and the slaves. When I check the dates last
autoanalyze and last_autovacuum fields in pg_stat_user_tables, they are
diffent than the mtime of the files; so it looks autovacuum is not
responsible of that.
Please, could you help me to understand why it is so long to rsync the data
from the new master to the other slaves ?
Did I miss anything ?
Any idea ?
Jean-Armel