Discussion:
[ADMIN] PostgreSQL oom_adj postmaster process to -17
Radovan Jablonovsky
2012-07-24 19:18:41 UTC
Hello,

We are running PostgreSQL version 9.1.1 with 32GB of RAM and 32GB of swap,
and during high load we can reach the (swap + RAM) memory limit. In that
case the OOM killer kills PostgreSQL process(es). (Out of Memory: Killed
process 12345 (postgres)). As an admin I would like to exclude the
PostgreSQL system processes from being chosen by the OOM killer. Based on
the PostgreSQL documentation this can be done by setting
vm.overcommit_memory=2, vm.overcommit_ratio=(probably between 50 and 90),
and the process-specific oom_adj value for the postmaster process to -17,
thereby guaranteeing it will not be targeted by the OOM killer. PostgreSQL
should be built with -DLINUX_OOM_ADJ=0 added to CPPFLAGS so that child
processes run with oom_adj equal to 0.

1) Does this setting mean that the other system processes (logger process,
writer process, stats collector process, wal receiver process, etc.), which
are children of the postmaster process, will also run with oom_adj set to 0?
2) Should the proper PostgreSQL compilation configuration be: ./configure
... CPPFLAGS="-DLINUX_OOM_ADJ=0" ...
3) Does anybody have experience with this solution on CentOS
kernel 2.6.18-308.el5 x86_64?
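For reference, the steps described above could look roughly like the
following sketch (untested assumptions: the data directory
/var/lib/pgsql/data, and the overcommit ratio value; note oom_adj applies
to older kernels such as 2.6.18, while newer kernels use
/proc/<pid>/oom_score_adj instead):

```shell
# Sketch only: run as root; adapt paths and values to your system.
sysctl -w vm.overcommit_memory=2      # disable memory overcommit
sysctl -w vm.overcommit_ratio=80      # somewhere in the 50-90 range

# Exempt only the postmaster from the OOM killer; children built with
# -DLINUX_OOM_ADJ=0 reset themselves back to 0 at fork time.
PGPID=$(head -1 /var/lib/pgsql/data/postmaster.pid)
echo -17 > /proc/$PGPID/oom_adj
```

The oom_adj line would typically go into the init script that starts the
postmaster, so the setting survives restarts.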

Sincerely,

*Radovan Jablonovsky*
Kevin Grittner
2012-08-01 17:48:45 UTC
Post by Radovan Jablonovsky
We are running PostgreSQL version 9.1.1
You should apply the latest bug fixes by updating to 9.1.4.

http://www.postgresql.org/support/versioning/
Post by Radovan Jablonovsky
with 32GB of RAM, 32GB of SWAP and during high load we could reach
(swap + RAM) memory limit.
If you're even *starting* to swap you're doing something wrong, much
less exhausting swap space equal to actual RAM. What is your
configuration?

http://wiki.postgresql.org/wiki/Server_Configuration

While it's probably a good idea to configure the OOM killer to
behave more sanely, we tend to ignore it in favor of ensuring that
it never comes into play. We run about 200 databases 24/7 and I
think I've seen it kick in about twice -- when we ran queries that
leaked memory on each row in a big query.

-Kevin
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Radovan Jablonovsky
2012-08-03 18:08:45 UTC
Thank you for your response.

Database config:
shared_buffers = 8GB
temp_buffers = 32MB
work_mem = 64MB
maintenance_work_mem = 512MB
effective_cache_size = 16GB
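A quick back-of-the-envelope calculation shows why this configuration can
exhaust 32GB of RAM. The sketch below assumes the maximum of 600
connections mentioned later in the thread; and since work_mem is allocated
per sort/hash operation rather than per connection, a single complex query
can use several multiples of it, so this is only a lower bound on the
worst case:

```python
# Rough worst-case memory estimate for the configuration above.
GB, MB = 1024**3, 1024**2

shared_buffers = 8 * GB
work_mem = 64 * MB
max_connections = 600      # assumption: figure given later in the thread

worst_case = shared_buffers + max_connections * work_mem
print(worst_case / GB)     # 45.5, already well above the 32GB of RAM
```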

Under usual load there is not much pressure on memory, but it is possible
for all clients to start running heavy reports. These are valid requests
and could consume all memory. In this borderline, unlikely but possible
scenario it would be useful to let the OOM killer kill client
processes/connections but keep the PostgreSQL system processes (postmaster,
writer, stat, log, streaming, ...) out of the OOM killer's reach.
Post by Kevin Grittner
Post by Radovan Jablonovsky
We are running PostgreSQL version 9.1.1
You should apply the latest bug fixes by updating to 9.1.4.
http://www.postgresql.org/support/versioning/
Post by Radovan Jablonovsky
with 32GB of RAM, 32GB of SWAP and during high load we could reach
(swap + RAM) memory limit.
If you're even *starting* to swap you're doing something wrong, much
less exhausting swap space equal to actual RAM. What is your
configuration?
http://wiki.postgresql.org/wiki/Server_Configuration
While it's probably a good idea to configure the OOM killer to
behave more sanely, we tend to ignore it in favor of ensuring that
it never comes into play. We run about 200 databases 24/7 and I
think I've seen it kick in about twice -- when we ran queries that
leaked memory on each row in a big query.
-Kevin
Radovan
Tom Lane
2012-08-03 18:32:52 UTC
Post by Radovan Jablonovsky
Under usual load there is not much pressure on memory, but it is possible
for all clients to start running heavy reports. These are valid requests
and could consume all memory. In this borderline, unlikely but possible
scenario it would be useful to let the OOM killer kill client
processes/connections but keep the PostgreSQL system processes (postmaster,
writer, stat, log, streaming, ...) out of the OOM killer's reach.
The code already supports excluding the postmaster itself from OOM kills
while letting its children be subject to them. Being selective about
which children are subject is pointless, though: if any child is kill -9'd,
we have to zap the rest and restart, because there is no way to be sure
that the victim left shared memory in a consistent state.

regards, tom lane
Kevin Grittner
2012-08-03 19:03:19 UTC
Post by Radovan Jablonovsky
Under usual load there is not much pressure on memory, but it is
possible for all clients to start running heavy reports. These are
valid requests and could consume all memory.
Your clients will get their results back faster if you can arrange
some way to queue these sorts of requests when they get beyond some
reasonable limit. You might be able to do that using a connection
pool, or you might want to create some sort of job queue which
releases a limited number of such jobs at a time; but I guarantee
that every single person who submits a job to such a queue, including
the last person, will get their results sooner with such queuing
than by turning loose a "thundering herd" of requests that puts the
system into swapping. I guarantee it.
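The kind of queue described here can be sketched with a counting semaphore
that caps how many heavy reports run at once. Everything below (the limit
of 4, the run_report stand-in) is a hypothetical illustration, not a
specific tool recommendation:

```python
import threading

MAX_CONCURRENT_REPORTS = 4            # assumption: tune to what RAM allows
report_slots = threading.Semaphore(MAX_CONCURRENT_REPORTS)

def run_report(job_id):
    """Hypothetical stand-in for executing one heavy reporting query."""
    return f"report {job_id} done"

def submit_report(job_id, results):
    # Each job waits for a free slot instead of hitting the DB all at once.
    with report_slots:
        results.append(run_report(job_id))

results = []
threads = [threading.Thread(target=submit_report, args=(i, results))
           for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))   # all 20 jobs complete, at most 4 running at a time
```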

-Kevin
Scott Marlowe
2012-08-03 20:05:25 UTC
On Fri, Aug 3, 2012 at 12:08 PM, Radovan Jablonovsky
Post by Radovan Jablonovsky
Thank you for your response.
shared_buffers = 8GB
temp_buffers = 32MB
work_mem = 64MB
maintenance_work_mem = 512MB
effective_cache_size = 16GB
Under usual load there is not much pressure on memory, but it is possible
for all clients to start running heavy reports. These are valid requests
and could consume all memory. In this borderline, unlikely but possible
scenario it would be useful to let the OOM killer kill client
processes/connections but keep the PostgreSQL system processes (postmaster,
writer, stat, log, streaming, ...) out of the OOM killer's reach.
Your only realistic solution is either to limit the incoming
connections via a connection pooler like pgbouncer or to lower your
work_mem to something smaller. What's your current max_connections
setting?
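Limiting connections with pgbouncer might look like the following
pgbouncer.ini fragment (a sketch; the database name and pool sizes are
placeholders to tune against available RAM):

```ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb   ; placeholder database

[pgbouncer]
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction     ; hand server connections out per transaction
max_client_conn = 600       ; clients may still open many connections...
default_pool_size = 20      ; ...but only this many reach PostgreSQL at once
```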
Radovan Jablonovsky
2012-08-08 20:04:10 UTC
Currently there is a maximum of 600 connections.
Post by Scott Marlowe
On Fri, Aug 3, 2012 at 12:08 PM, Radovan Jablonovsky
Post by Radovan Jablonovsky
Thank you for your response.
shared_buffers = 8GB
temp_buffers = 32MB
work_mem = 64MB
maintenance_work_mem = 512MB
effective_cache_size = 16GB
Under usual load there is not much pressure on memory, but it is possible
for all clients to start running heavy reports. These are valid requests
and could consume all memory. In this borderline, unlikely but possible
scenario it would be useful to let the OOM killer kill client
processes/connections but keep the PostgreSQL system processes (postmaster,
writer, stat, log, streaming, ...) out of the OOM killer's reach.
Your only realistic solution is either to limit the incoming
connections via a connection pooler like pgbouncer or to lower your
work_mem to something smaller. What's your current max_connections
setting?
--
*Radovan Jablonovsky* | SaaS DBA | Replicon | www.replicon.com
Kevin Grittner
2012-08-08 20:24:08 UTC
Post by Radovan Jablonovsky
PostgreSQL version 9.1.1 with 32GB of RAM
shared_buffers = 8GB
temp_buffers = 32MB
work_mem = 64MB
maintenance_work_mem = 512MB
Currently there are maximum 600 connections.
Please read:

http://wiki.postgresql.org/wiki/Number_Of_Database_Connections

-Kevin
Craig Ringer
2012-08-09 01:33:55 UTC
Post by Kevin Grittner
http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
Can we please please PLEASE link to that as a comment above max_connections?

Last time this came up nobody was happy with the wording of a comment, so
nothing got done. It's a real usability wart, causing real-world
performance and reliability problems: people unwittingly raise
max_connections to absurd levels because they get no warnings, hints, or
guidance of any sort.

--
Craig Ringer
Kevin Grittner
2012-08-09 14:09:15 UTC
Post by Craig Ringer
Post by Kevin Grittner
http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
Can we please please PLEASE link to that as a comment above
max_connections?
Last time this came up nobody was happy with the wording of a comment,
so nothing got done. It's a real usability wart, causing
real-world performance and reliability problems: people
unwittingly raise max_connections to absurd levels because they
get no warnings, hints, or guidance of any sort.
I see that we currently have five links to wiki.postgresql.org in
release notes and four more in the rest of the docs. Are people OK
with adding this link to the docs on max_connections? (Feel free to
improve it before answering if you have qualms about the specifics
on that page.)

We do seem to get an awful lot of posts (between here and
StackOverflow) from people who assume they need one database
connection per active user, and then are surprised that performance
is horrible.

If we get consensus on linking to this I'll put together a patch to
make a brief comment in the docs with a link to the Wiki.

-Kevin