Discussion:
[ADMIN] starting postgres with an empty px_xlog folder
(too old to reply)
Mike Broers
2012-06-23 12:01:08 UTC
Permalink
Hello,

We shut down our postgres 8.3 server last night cleanly for some hosted
services maintenance. When we got our server back, it didnt have the
pg_xlog mount with files and now when we start the server, it complains:

2012-06-23 06:06:04 CDT [18612]: [1-1] user=,db= LOG: database system was
shut down at 2012-06-22 23:56:16 CDT
2012-06-23 06:06:04 CDT [18612]: [2-1] user=,db= LOG: could not open file
"pg_xlog/0000000100000A0100000040" (log file 2561, segment 64): No such
file or
directory
2012-06-23 06:06:04 CDT [18612]: [3-1] user=,db= LOG: invalid primary
checkpoint record
2012-06-23 06:06:04 CDT [18612]: [4-1] user=,db= LOG: could not open file
"pg_xlog/0000000100000A0100000040" (log file 2561, segment 64): No such
file or
directory
2012-06-23 06:06:04 CDT [18612]: [5-1] user=,db= LOG: invalid secondary
checkpoint record
2012-06-23 06:06:04 CDT [18612]: [6-1] user=,db= PANIC: could not locate a
valid checkpoint record
2012-06-23 06:06:04 CDT [18609]: [1-1] user=,db= LOG: startup process (PID
18612) was terminated by signal 6: Aborted
2012-06-23 06:06:04 CDT [18609]: [2-1] user=,db= LOG: aborting startup due
to startup process failure


Since we had a clean shut down is there an easy or relatively clean way to
start back up? the pg_xlog folder is there, just with no files. i wouldnt
think it would be needed for a clean restart.
Mike Broers
2012-06-23 12:49:25 UTC
Permalink
I was able to duplicate the error in a test environment and get the
database started back up wtih the pg_resetxlog $PGDATA command, but i am
concerned about the fallout and not sure if i need to use any of the option
switches. We also have archived log files up to the point of server
shutdown if that provides a better option.
Post by Mike Broers
Hello,
We shut down our postgres 8.3 server last night cleanly for some hosted
services maintenance. When we got our server back, it didnt have the
2012-06-23 06:06:04 CDT [18612]: [1-1] user=,db= LOG: database system was
shut down at 2012-06-22 23:56:16 CDT
2012-06-23 06:06:04 CDT [18612]: [2-1] user=,db= LOG: could not open file
"pg_xlog/0000000100000A0100000040" (log file 2561, segment 64): No such
file or
directory
2012-06-23 06:06:04 CDT [18612]: [3-1] user=,db= LOG: invalid primary
checkpoint record
2012-06-23 06:06:04 CDT [18612]: [4-1] user=,db= LOG: could not open file
"pg_xlog/0000000100000A0100000040" (log file 2561, segment 64): No such
file or
directory
2012-06-23 06:06:04 CDT [18612]: [5-1] user=,db= LOG: invalid secondary
checkpoint record
2012-06-23 06:06:04 CDT [18612]: [6-1] user=,db= PANIC: could not locate
a valid checkpoint record
2012-06-23 06:06:04 CDT [18609]: [1-1] user=,db= LOG: startup process
(PID 18612) was terminated by signal 6: Aborted
2012-06-23 06:06:04 CDT [18609]: [2-1] user=,db= LOG: aborting startup
due to startup process failure
Since we had a clean shut down is there an easy or relatively clean way to
start back up? the pg_xlog folder is there, just with no files. i wouldnt
think it would be needed for a clean restart.
Raghavendra
2012-06-23 13:36:34 UTC
Permalink
Post by Mike Broers
2012-06-23 06:06:04 CDT [18612]: [2-1] user=,db= LOG: could not open file
"pg_xlog/0000000100000A0100000040" (log file 2561, segment 64): No such
file or
directory
Don't do pg_resetlogs. If you have archives, please find for the file which
has pointed in logs copy it to $PGDATA/pg_xlog location and start the
server. Sometime, server will throw the sequence of files like 41,42....,
not existing, you need to copy all of them to pg_xlog location and start it.

---
Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/
Post by Mike Broers
I was able to duplicate the error in a test environment and get the
database started back up wtih the pg_resetxlog $PGDATA command, but i am
concerned about the fallout and not sure if i need to use any of the option
switches. We also have archived log files up to the point of server
shutdown if that provides a better option.
Post by Mike Broers
Hello,
We shut down our postgres 8.3 server last night cleanly for some hosted
services maintenance. When we got our server back, it didnt have the
2012-06-23 06:06:04 CDT [18612]: [1-1] user=,db= LOG: database system
was shut down at 2012-06-22 23:56:16 CDT
2012-06-23 06:06:04 CDT [18612]: [3-1] user=,db= LOG: invalid primary
checkpoint record
2012-06-23 06:06:04 CDT [18612]: [4-1] user=,db= LOG: could not open
file "pg_xlog/0000000100000A0100000040" (log file 2561, segment 64): No
such file or
directory
2012-06-23 06:06:04 CDT [18612]: [5-1] user=,db= LOG: invalid secondary
checkpoint record
2012-06-23 06:06:04 CDT [18612]: [6-1] user=,db= PANIC: could not locate
a valid checkpoint record
2012-06-23 06:06:04 CDT [18609]: [1-1] user=,db= LOG: startup process
(PID 18612) was terminated by signal 6: Aborted
2012-06-23 06:06:04 CDT [18609]: [2-1] user=,db= LOG: aborting startup
due to startup process failure
Since we had a clean shut down is there an easy or relatively clean way
to start back up? the pg_xlog folder is there, just with no files. i
wouldnt think it would be needed for a clean restart.
Ray Stell
2012-06-23 14:45:24 UTC
Permalink
Post by Mike Broers
Hello,
We shut down our postgres 8.3 server last night cleanly for some hosted
services maintenance. When we got our server back, it didnt have the
pg_xlog mount with files
what happened to make the files not available? Seems like you may want
to know the answer to this question for future reference.
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Kevin Grittner
2012-06-23 19:45:30 UTC
Permalink
Post by Mike Broers
Post by Mike Broers
We shut down our postgres 8.3 server last night cleanly for some
hosted services maintenance. When we got our server back, it didnt
have the pg_xlog mount with files and now when we start the
server, it complains
Since we had a clean shut down is there an easy or relatively
clean way to start back up? the pg_xlog folder is there, just with
no files. i wouldnt think it would be needed for a clean restart.
It is.
Post by Mike Broers
I was able to duplicate the error in a test environment and get the
database started back up wtih the pg_resetxlog $PGDATA command, but
i am concerned about the fallout and not sure if i need to use any
of the option switches. We also have archived log files up to the
point of server shutdown if that provides a better option.
I would make a copy of the directory tree of the database cluster
while the server is stopped before attempting any recovery.

http://wiki.postgresql.org/wiki/Corruption

The archived WAL files, may provide a better option that
pg_resetxlog.

If it were my data, I would be inclined to use pg_dump to create a
fresh version of the database after recovery, although if you really
have a full set of archived WAL files and the server starts up
without any indications of problems in the server log, you might be
OK without doing that.

As previously stated, make sure you understand how this happened, so
you can make sure it doesn't happen again. The contents of the
pg_xlog directory are an integral part of your database cluster.

-Kevin
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Craig Ringer
2012-06-24 07:15:40 UTC
Permalink
Post by Kevin Grittner
As previously stated, make sure you understand how this happened, so
you can make sure it doesn't happen again. The contents of the
pg_xlog directory are an integral part of your database cluster.
People not backing up pg_xlog, deleting its contents, etc happens often
enough that I wonder if it should have a prominent 00_README or
00_WARNING_DONT_DELETE file created by initdb - or simply be renamed to
something scarier like "base_txrecords".

The annoyance of a readme is that it'd need translation.

People are used to logs being disposable. Anyone who's been responsible
for a database should ideally know better than to assume that
*transation* logs are disposable, but everyone has to learn sometime,
and not everybody does so by reading TFM (unfortunately).

--
Craig Ringer
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Stuart Bishop
2012-06-24 08:15:14 UTC
Permalink
Post by Craig Ringer
Post by Kevin Grittner
As previously stated, make sure you understand how this happened, so
you can make sure it doesn't happen again.  The contents of the
pg_xlog directory are an integral part of your database cluster.
People not backing up pg_xlog, deleting its contents, etc happens often
enough that I wonder if it should have a prominent 00_README or
00_WARNING_DONT_DELETE file created by initdb - or simply be renamed to
something scarier like "base_txrecords".
This doesn't help when your backup skips them because the files exist
on another disk partition. Out of sight, out of mind.

If pg_basebackup defaulted to --xlog and its use was encouraged these
mistakes might be less common.
--
Stuart Bishop <***@stuartbishop.net>
http://www.stuartbishop.net/
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Mike Broers
2012-06-25 14:45:25 UTC
Permalink
Ultimately the hosting service restored the files that they had not brought
over during their maintenance migration and we started up ok. So that was
a relief.

We had archived log files but it did not appear that the archive
destination was caught up with the xlog the cluster was complaining about.


Given that the database server was shut down cleanly, and all other data
besides pg_xlog was available as expected (not corrupted), what would have
been the problem with pg_resetxlogs?
Post by Craig Ringer
Post by Kevin Grittner
As previously stated, make sure you understand how this happened, so
you can make sure it doesn't happen again. The contents of the
pg_xlog directory are an integral part of your database cluster.
People not backing up pg_xlog, deleting its contents, etc happens often
enough that I wonder if it should have a prominent 00_README or
00_WARNING_DONT_DELETE file created by initdb - or simply be renamed to
something scarier like "base_txrecords".
The annoyance of a readme is that it'd need translation.
People are used to logs being disposable. Anyone who's been responsible
for a database should ideally know better than to assume that *transation*
logs are disposable, but everyone has to learn sometime, and not everybody
does so by reading TFM (unfortunately).
--
Craig Ringer
Kevin Grittner
2012-06-25 15:19:28 UTC
Permalink
Post by Mike Broers
Ultimately the hosting service restored the files that they had
not brought over during their maintenance migration and we started
up ok. So that was a relief.
+1
Post by Mike Broers
We had archived log files but it did not appear that the archive
destination was caught up with the xlog the cluster was
complaining about.
Given that the database server was shut down cleanly, and all
other data besides pg_xlog was available as expected (not
corrupted), what would have been the problem with pg_resetxlogs?
Frankly, the odds would have been pretty good that you would have
come up without lost data or a corrupted database; but it's a matter
of the degree of confidence in that. Startup and shutdown code, by
its nature, is not exercised as heavily as most PostgreSQL code.
Startup after using a data recovery utility is even less heavily
exercised. Less frequently executed code is more likely to have
subtle bugs which only show up in rare corner cases. I like to
minimize my risk.

-Kevin
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Scott Marlowe
2012-06-25 15:44:31 UTC
Permalink
On Mon, Jun 25, 2012 at 9:19 AM, Kevin Grittner
Post by Kevin Grittner
Post by Mike Broers
Ultimately the hosting service restored the files that they had
not brought over during their maintenance migration and we started
up ok.  So that was a relief.
+1
Post by Mike Broers
We had archived log files but it did not appear that the archive
destination was caught up with the xlog the cluster was
complaining about.
Given that the database server was shut down cleanly, and all
other data besides pg_xlog was available as expected (not
corrupted), what would have been the problem with pg_resetxlogs?
Frankly, the odds would have been pretty good that you would have
come up without lost data or a corrupted database; but it's a matter
of the degree of confidence in that.  Startup and shutdown code, by
its nature, is not exercised as heavily as most PostgreSQL code.
Startup after using a data recovery utility is even less heavily
exercised.  Less frequently executed code is more likely to have
subtle bugs which only show up in rare corner cases.  I like to
minimize my risk.
-Kevin
Agreed 100%. Also, this brings up the question of how your server
setup is documented by the hosting providor. If you've got remote
drives, virtual drives or whatever, if needs to be documented there in
such a way they can move you without losing things. Or you'll have to
remind them each time you have a separate mount point that needs to
get transferred, but that's error prone.
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Loading...