PG 9.1 Looking for old WAL when promoting from recovery to master
David Morton
2012-09-03 22:01:17 UTC
I'm implementing replica servers which will use a trigger file to promote
from hot standby to full read/write. I've configured streaming replication
as well as a recovery.conf which copies old WAL files from a repository if
required.

When I place the trigger file the system assumes the read/write role
without issue but insists on looking for a really old WAL file. The log
below shows restoration from the previous night's full online backup
(rsync), followed by the trigger file detection and then the attempts to
find the old WAL file.

Is this behavior normal? From what I can see, it's not writing any new WAL
files until it is satisfied with the state of this old one. If I create the
file it's expecting to see, it archives it off and then complains about the
next one in the series.
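
To put "really old" in numbers: in the log below the archiver keeps asking
for 000000010000001D00000023, while replay at promotion had reached
000000010000002E00000039. The following is only a rough sketch of that
arithmetic. It assumes the default 16 MB segment size and the pre-9.3 naming
scheme, in which each log id holds 255 segments (00 to FE); the function names
are hypothetical and not part of PostgreSQL.

```python
# Sketch: decode 24-hex-digit WAL segment names and measure the gap between two.
# Assumptions (not stated in the thread): default 16 MB segments and the
# pre-9.3 layout where each log id contains 255 segments (00..FE).
SEGMENTS_PER_LOG = 255
SEGMENT_MB = 16


def parse_wal_name(name):
    """Split a WAL file name into (timeline, log id, segment) integers."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits: %r" % name)
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def segment_index(name):
    """Position of the segment counted from the start of WAL (timeline ignored)."""
    _, log_id, seg = parse_wal_name(name)
    return log_id * SEGMENTS_PER_LOG + seg


def segments_between(older, newer):
    """Number of segments separating two WAL file names."""
    return segment_index(newer) - segment_index(older)


if __name__ == "__main__":
    old = "000000010000001D00000023"      # file the archiver keeps requesting
    current = "000000010000002E00000039"  # segment being replayed at promotion
    gap = segments_between(old, current)
    print(gap, "segments, roughly", gap * SEGMENT_MB // 1024, "GB of WAL")
```

Under those assumptions the two files are several thousand segments apart,
so the requested file is well outside anything the nightly backup or the
streaming connection would have supplied.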

2012-08-28 23:30:33 UTC LOG: restored log file
"000000010000002E00000030" from archive
2012-08-28 23:30:34 UTC LOG: restored log file
"000000010000002E00000031" from archive
2012-08-28 23:30:36 UTC LOG: restored log file
"000000010000002E00000032" from archive
2012-08-28 23:30:37 UTC LOG: restored log file
"000000010000002E00000033" from archive
2012-08-28 23:30:39 UTC LOG: restored log file
"000000010000002E00000034" from archive
2012-08-28 23:30:42 UTC LOG: restored log file
"000000010000002E00000035" from archive
2012-08-28 23:30:44 UTC LOG: restored log file
"000000010000002E00000036" from archive
2012-08-28 23:30:45 UTC LOG: restored log file
"000000010000002E00000037" from archive
cp: cannot stat `/NFS/current/wal/depot/000000010000002E00000038': No such
file or directory
2012-08-28 23:30:47 UTC LOG: streaming replication successfully
connected to primary
2012-08-28 23:42:09 UTC LOG: trigger file found:
/home/depot/data/transition_to_master.trigger
2012-08-28 23:42:09 UTC FATAL: terminating walreceiver process due to
administrator command
cp: cannot stat `/NFS/current/wal/depot/000000010000002E00000039': No such
file or directory
2012-08-28 23:42:09 UTC LOG: record with zero length at 2E/39079E00
cp: cannot stat `/NFS/current/wal/depot/000000010000002E00000039': No such
file or directory
2012-08-28 23:42:09 UTC LOG: redo done at 2E/39079DC0
2012-08-28 23:42:09 UTC LOG: last completed transaction was at log time
2012-08-28 23:42:02.226546+00
cp: cannot stat `/NFS/current/wal/depot/000000010000002E00000039': No such
file or directory
cp: cannot stat `/NFS/current/wal/depot/00000002.history': No such file or
directory
2012-08-28 23:42:09 UTC LOG: selected new timeline ID: 2
cp: cannot stat `/NFS/current/wal/depot/00000001.history': No such file or
directory
2012-08-28 23:42:10 UTC LOG: archive recovery complete
2012-08-28 23:42:10 UTC LOG: database system is ready to accept
connections
2012-08-28 23:42:10 UTC LOG: autovacuum launcher started
pg_xlog/000000010000001D00000023: No such file or directory
2012-08-28 23:42:10 UTC LOG: archive command failed with exit code 1
2012-08-28 23:42:10 UTC DETAIL: The failed archive command was:
/DB_SHARED/dbcommon/scripts/logarchive.sh pg_xlog/000000010000001D00000023
000000010000001D00000023
pg_xlog/000000010000001D00000023: No such file or directory
2012-08-28 23:42:11 UTC LOG: archive command failed with exit code 1
2012-08-28 23:42:11 UTC DETAIL: The failed archive command was:
/DB_SHARED/dbcommon/scripts/logarchive.sh pg_xlog/000000010000001D00000023
000000010000001D00000023
pg_xlog/000000010000001D00000023: No such file or directory
2012-08-28 23:42:13 UTC LOG: archive command failed with exit code 1
2012-08-28 23:42:13 UTC DETAIL: The failed archive command was:
/DB_SHARED/dbcommon/scripts/logarchive.sh pg_xlog/000000010000001D00000023
000000010000001D00000023
2012-08-28 23:42:13 UTC WARNING: transaction log file
"000000010000001D00000023" could not be archived: too many failures
pg_xlog/000000010000001D00000023: No such file or directory
2012-08-28 23:43:13 UTC LOG: archive command failed with exit code 1
2012-08-28 23:43:13 UTC DETAIL: The failed archive command was:
/DB_SHARED/dbcommon/scripts/logarchive.sh pg_xlog/000000010000001D00000023
000000010000001D00000023
pg_xlog/000000010000001D00000023: No such file or directory
2012-08-28 23:43:14 UTC LOG: archive command failed with exit code 1
2012-08-28 23:43:14 UTC DETAIL: The failed archive command was:
/DB_SHARED/dbcommon/scripts/logarchive.sh pg_xlog/000000010000001D00000023
000000010000001D00000023
pg_xlog/000000010000001D00000023: No such file or directory
2012-08-28 23:43:15 UTC LOG: archive command failed with exit code 1
2012-08-28 23:43:15 UTC DETAIL: The failed archive command was:
/DB_SHARED/dbcommon/scripts/logarchive.sh pg_xlog/000000010000001D00000023
000000010000001D00000023
2012-08-28 23:43:15 UTC WARNING: transaction log file
"000000010000001D00000023" could not be archived: too many failures
Fujii Masao
2012-09-05 14:08:15 UTC
Post by David Morton
I'm implementing replica servers which will use a trigger file to promote
from hot standby to full read/write. I've configured streaming replication
as well as a recovery.conf which copies old WAL files from a repository if
required.
When I place the trigger file the system assumes the read/write role without
issue but insists on looking for a really old WAL file. The log below shows
restoration from the previous night's full online backup (rsync), followed
by the trigger file detection and then the attempts to find the old WAL
file.
Is this behavior normal?
No. I think archive_command is failing because an archive status file for
the old WAL file still exists in the pg_xlog/archive_status directory,
though I'm not sure why that happened. It's strange, since the archive
status file should be removed when its corresponding WAL file is removed.
Anyway, if you delete that archive status file, archive_command should
then complete successfully.
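
One way to check for such leftovers is sketched below. It is an illustration
only: it relies on the fact that PostgreSQL marks files awaiting archiving
with a ".ready" suffix in pg_xlog/archive_status, and it reports any such
marker whose WAL file is missing from pg_xlog. The function name is
hypothetical, and the script only lists candidates; it deletes nothing.

```python
import os


def orphaned_status_files(pg_xlog):
    """Return .ready status files whose WAL file is no longer in pg_xlog.

    pg_xlog is the path to the cluster's pg_xlog directory. The result is a
    sorted list of paths under pg_xlog/archive_status; nothing is removed.
    """
    status_dir = os.path.join(pg_xlog, "archive_status")
    orphans = []
    for entry in sorted(os.listdir(status_dir)):
        base, ext = os.path.splitext(entry)
        if ext != ".ready":
            continue  # skip .done markers and anything else
        # The marker is named after the file it refers to, e.g.
        # 000000010000001D00000023.ready -> pg_xlog/000000010000001D00000023
        if not os.path.exists(os.path.join(pg_xlog, base)):
            orphans.append(os.path.join(status_dir, entry))
    return orphans
```

For example, calling orphaned_status_files() with the standby's pg_xlog path
(assumed here to sit under the data directory shown in the log) would be
expected to list the marker for 000000010000001D00000023 if it is indeed a
leftover. Review the list before removing anything by hand.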

Regards,
--
Fujii Masao