Discussion:
[ADMIN] Backing up a replication set every 30 mins
Khusro Jaleel
2012-02-15 12:42:14 UTC
Hi,

I have set up replication on PostgreSQL 9.1 in a simple scenario using a
master and a slave. I now need a reliable (and preferably fast) way
to do backups of the database.

I tried using 'archive_command' on the master and created a script that
simply bzips the files and copies them to a directory, but after one day
that directory was so full that even a simple 'rm' failed with 'Argument
list too long'.
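
Roughly, the archiving script is of this shape (the destination directory
below is just a placeholder; archive_command passes the WAL path and file
name as %p and %f):

===============
#!/bin/sh
# called from postgresql.conf as: archive_command = '/path/to/archive_wal.sh %p %f'
set -e
WAL_PATH="$1"                      # %p: path of the WAL segment, relative to the data dir
WAL_NAME="$2"                      # %f: file name of the WAL segment
ARCHIVE_DIR=/backup/wal_archive    # placeholder destination

# never overwrite a segment that was already archived
test ! -f "$ARCHIVE_DIR/$WAL_NAME.bz2" || exit 1
bzip2 -c "$WAL_PATH" > "$ARCHIVE_DIR/$WAL_NAME.bz2"
===============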

I can see that even though I have specified 30 minutes in the
postgresql.conf, a new archive file is written every *minute* resulting
in way too many files in the archive directory. This happens even when
*nobody* is using the DB, or with just very light use. However I have
noticed that sometimes this *does* work correctly and an archive is only
written out every 30 mins.

Is there a way to force this to just write to the archive directory
*only* every 30 mins as I only really need backups to be current to the
last 30 mins or so?

===============
checkpoint_segments = 3 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 30min # range 30s-1h

archive_timeout = 30 # force a logfile segment switch after this
wal_keep_segments = 14375
===============

If this is NOT the right way to backup the database, then can you please
suggest an alternative method? I could write a script that calls
'pg_start_backup' and 'pg_stop_backup' with an rsync in between to a
backup server and run it every 30 mins. I am thinking that this would
work, even if people are busy using the DB (reads/writes)?
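
Something roughly along these lines (the data directory and backup host are
placeholders, not my actual setup):

===============
#!/bin/sh
# sketch of a base-backup job to run every 30 mins
# (a real script should make sure pg_stop_backup always runs, e.g. via trap)
set -e
psql -U postgres -c "SELECT pg_start_backup('every_30_mins');"
rsync -a --exclude='pg_xlog/*' /var/lib/postgresql/9.1/main/ backup-host:/backups/pgdata/
psql -U postgres -c "SELECT pg_stop_backup();"
===============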

I want to avoid using pg_dump as I think that would require me to pause
writes to the DB until the backup is finished?

Any help appreciated, thanks,
Khusro
Vladimir Rusinov
2012-02-15 12:58:41 UTC
Post by Khusro Jaleel
Hi,
I have setup replication on Postgresql 9.1 in a simple scenario using a
master and a slave. I now need a reliable way (and preferably fast way) to
do backups of the database.
I tried using 'archive_command' on the master and created a script that
simply bzips files and copies them to a directory, but after 1 day that
directory was so huge that even a simple 'rm' complained that the 'argument
list is too long'.
I can see that even though I have specified 30 minutes in the
postgresql.conf, a new archive file is written every *minute* resulting in
way too many files in the archive directory. This happens even when
*nobody* is using the DB, or with just very light use. However I have
noticed that sometimes this *does* work correctly and an archive is only
written out every 30 mins.
Is there a way to force this to just write to the archive directory *only*
every 30 mins as I only really need backups to be current to the last 30
mins or so?
No, there's no way to do this. If you have a good number of
transactions per minute (and/or the transactions are quite large) you will
have a lot of data written to the transaction log, so WAL files will rotate
quite often.
Post by Khusro Jaleel
I want to avoid using pg_dump as I think that would require I pause writes
to the DB until the backup is finished?
pg_dump won't block writes, thanks to MVCC. It may increase bloat and it
will block DDL operations (ALTER TABLE, etc.), but if your database is
relatively small but has a high load and you need frequent backups, this may
be the way to go.
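
For example, something as simple as this from cron every 30 minutes
(database name and target path are placeholders):

  pg_dump -Fc -U postgres -f /backups/yourdb_$(date +%Y%m%d_%H%M).dump yourdb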
--
Vladimir Rusinov
http://greenmice.info/
Khusro Jaleel
2012-02-15 14:05:26 UTC
Post by Vladimir Rusinov
pg_dump won't block writes, thanks to MVCC. It may increase bloat and it
will block DDL operations (ALTER TABLE, etc.), but if your database is
relatively small but has a high load and you need frequent backups, this
may be the way to go.
--
Vladimir Rusinov
http://greenmice.info/
Thanks Vladimir. Would a simple script with 'pg_start_backup' and
'pg_stop_backup' and an rsync or tar job in between work equally well?
I thought that was the better way to do it, rather than pg_dump?
Kevin Grittner
2012-02-15 14:48:17 UTC
Post by Khusro Jaleel
Post by Vladimir Rusinov
pg_dump won't block writes, thanks to MVCC. It may increase bloat
and it will block DDL operations (ALTER TABLE, etc.), but if your
database is relatively small but has a high load and you need
frequent backups, this may be the way to go.
Thanks Vladimir. Would a simple script with 'pg_start_backup' and
'pg_stop_backup' and an rsync or tar job in between work equally
well? I thought that was the better way to do it, rather than
pg_dump?
The PITR style backup you describe doesn't cause bloat or block DDL,
and if you archive the WAL files you can restore to any point in
time following the pg_stop_backup. pg_dump just gives you a
snapshot as of the start of the dump, so if you use that you would
need to start a complete dump every 30 minutes. With PITR backups
and WAL archiving you could set your archive_timeout to force timely
archiving (or use streaming replication if you are on 9.0 or later)
and effectively dump incremental database *activity* to stay
up-to-date.
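
As a rough sketch (the archive directory is just a placeholder), the
relevant postgresql.conf settings would look something like this:

===============
wal_level = archive          # or hot_standby if you also stream to a standby
archive_mode = on
archive_command = 'cp %p /backup/wal_archive/%f'
archive_timeout = 30min      # note the units; a bare number means seconds
===============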

Now, if 30 minutes of activity is more than the size of the
database, pg_dump could, as Vladimir says, still be a good
alternative.

-Kevin
Khusro Jaleel
2012-02-15 15:17:54 UTC
Post by Kevin Grittner
The PITR style backup you describe doesn't cause bloat or block DDL,
and if you archive the WAL files you can restore to any point in
time following the pg_stop_backup. pg_dump just gives you a
snapshot as of the start of the dump, so if you use that you would
need to start a complete dump every 30 minutes.
Sounds like my pg_start/rsync/pg_stop script solution every 30 mins
might be better, then, as long as the jobs don't overlap :-)

Post by Kevin Grittner
With PITR backups and WAL archiving you could set your archive_timeout to
force timely archiving (or use streaming replication if you are on 9.0 or
later) and effectively dump incremental database *activity* to stay
up-to-date.
Well, I am already using streaming replication to a slave, and I also
have archive_timeout set to 30 minutes, but it seems that archives are
written more often, probably every minute or so. I'm not sure why that
is: is it because of the replication, or because the Java app using the
DB is changing something slightly every minute or so? Nobody is actually
using this DB; I just brought it up, so there is no load, just two
front-end Java app servers connected, doing nothing (I hope, but maybe
they are).
Post by Kevin Grittner
Now, if 30 minutes of activity is more than the size of the
database, pg_dump could, as Vladimir says, still be a good
alternative.
I'm not sure I understand what you said there. I think you are saying
that if the DB doubles or more in size in 30 minutes due to the
activity, then pg_dump is still a good alternative?
Kevin Grittner
2012-02-15 15:39:43 UTC
Post by Khusro Jaleel
Sounds like my pg_start/rsync/pg_stop script solution every 30
mins might be better then, as long as the jobs don't overlap :-)
That sounds like it's probably overkill. Once you have your base
backup, you can just accumulate WAL files. We do a base backup once
per week and keep the last two base backups plus all WAL files from
the start of the first one. We can restore to any particular point
in time after that earlier base backup. I've heard of people
happily going months between base backups, and just counting on WAL
file replay, although I'm slightly too paranoid to want to go that
far.
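
As a sketch, the schedule can be as simple as this (script name and times
are placeholders), with WAL archiving running continuously in between:

===============
# crontab sketch: weekly base backup, Sunday at 02:00
0 2 * * 0  /usr/local/bin/base_backup.sh
# no separate 30-minute job: archive_command ships each WAL segment as it
# fills, and archive_timeout caps how stale the newest archived segment gets
===============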
Post by Khusro Jaleel
Post by Kevin Grittner
Now, if 30 minutes of activity is more than the size of the
database, pg_dump could, as Vladimir says, still be a good
alternative.
I'm not sure I understand what you said there. I think you are
saying that if the DB doubles or more in size in 30 minutes due to
the activity, then pg_dump is still a good alternative?
Not exactly. I was saying that if you have a very unusual situation
where the database is very small but has very high volumes of
updates (or inserts and deletes) such that it stays very small while
generating a lot of WAL, it is within the realm of possibility that
a pg_dump every 30 minutes could be your best option. I haven't
seen such a database yet, but I was conceding the possibility that
such could exist.

-Kevin
Khusro Jaleel
2012-02-15 16:05:32 UTC
Post by Kevin Grittner
Post by Khusro Jaleel
Sounds like my pg_start/rsync/pg_stop script solution every 30
mins might be better then, as long as the jobs don't overlap :-)
That sounds like it's probably overkill. Once you have your base
backup, you can just accumulate WAL files. We do a base backup once
per week and keep the last two base backups plus all WAL files from
the start of the first one. We can restore to any particular point
in time after that earlier base backup. I've heard of people
happily going months between base backups, and just counting on WAL
file replay, although I'm slightly too paranoid to want to go that
far.
That's exactly what I was trying to accomplish; however, I tried to do a
base backup every day and keep archives during the day. This worked fine
in testing; however, when I set it up and attached the Java front-ends to
the DB, there were *so* many archive files written to disk that the
"rotate" job that runs every morning to do a new base backup failed. It
failed because there were thousands upon thousands of archive files in
the archive dir and it couldn't delete them. Why this happened I am not
sure; I thought setting archive_timeout = 30 would only create 1 archive
file every 30 minutes, but I was wrong. The Java application itself was
pretty much idle the whole time, though I am not sure whether the ORM it
uses was writing something to the DB every now and then, causing the
archives to be *flushed* to disk much earlier than every 30 mins.
Post by Kevin Grittner
Not exactly. I was saying that if you have a very unusual situation
where the database is very small but has very high volumes of
updates (or inserts and deletes) such that it stays very small while
generating a lot of WAL, it is within the realm of possibility that
a pg_dump every 30 minutes could be your best option. I haven't
seen such a database yet, but I was conceding the possibility that
such could exist.
Yes, this is a possibility, thanks for clarifying it. The database won't
be very big even after some months, I think, so I might do it this way.
However, I would prefer to get the PITR solution working right first. If
that can be forced to flush archives *only* every 30 mins I would be very
pleased. But is it possible that, because of the constant replication to
the slave, this can never be accomplished on the master?
Ian Lea
2012-02-15 16:19:26 UTC
...
That's exactly what I was trying to accomplish, however I tried to do a base
backup every day and have archives during the day. This worked fine in
testing, however when I set it up and attached the Java front-ends to the
DB, there were *so* many archive files written to disk that the "rotate" job
that runs every morning to do a new base backup failed. It failed because
there were thousands upon thousands of archive files in the archive dir and
it couldn't delete them. Why this happened I am not sure, I thought setting
archive_timeout = 30 would only create 1 archive file every 30 minutes, but
I was wrong.
If you've got archive_timeout = 30, that means 30 seconds, not 30 minutes.


--
Ian.
Kevin Grittner
2012-02-15 16:21:58 UTC
Post by Khusro Jaleel
I thought setting archive_timeout = 30 would only create 1
archive file every 30 minutes, but I was wrong.
The default unit of measure is seconds, so the above would create at
least two WAL files per minute. Try setting to '30min'.
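
That is, in postgresql.conf (same style as the checkpoint_timeout line):

  archive_timeout = 30min    # a bare 30 is read as 30 seconds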

-Kevin
Khusro Jaleel
2012-02-15 17:09:03 UTC
Post by Kevin Grittner
Post by Khusro Jaleel
I thought setting archive_timeout = 30 would only create 1
archive file every 30 minutes, but I was wrong.
The default unit of measure is seconds, so the above would create at
least two WAL files per minute. Try setting to '30min'.
-Kevin
For crying out loud! I did specify "30min" on other machines but put in
"30" on this one; I guess it was a typo! I hope it was this simple and
it works now! :-)

Thanks everyone for your help.