Discussion:
[ADMIN] Very long "<IDLE> in transaction" query
Gnanakumar
2012-05-03 10:18:35 UTC
Hi,

Recently, in our production server, we found a "single query" being held up
in "<IDLE> in transaction" for more than 19 hours, using the following query:
select date_trunc('second', current_timestamp - query_start) as runtime,
datname as database_name, current_query from pg_stat_activity where
current_query != '<IDLE>' order by 1 desc

but we're clueless as to the root cause of this issue and are still hunting.
As we know, the query output doesn't show the actual query/statement for an
"<IDLE> in transaction" session.

We then ran the 3rd query from the PostgreSQL wiki's Lock Monitoring page
(http://wiki.postgresql.org/wiki/Lock_Monitoring); a rough reconstruction of it
is shown after the two observations below. From the query's result output I
could infer only the following, but I'm still not able to find the real root
cause:
1) Two tables are involved (table1 and table2).
2) Mostly table1's indexes appear in the output.
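
For reference, reconstructed here from the column headers in the output below
(the exact text on the wiki may differ slightly), the query is along these lines:

-- one row per lock held by each backend, joined to its pg_stat_activity entry
select pg_stat_activity.datname,
       pg_class.relname,
       pg_locks.transactionid,
       pg_locks.mode,
       pg_locks.granted,
       pg_stat_activity.usename,
       substr(pg_stat_activity.current_query, 1, 30),
       pg_stat_activity.query_start,
       age(now(), pg_stat_activity.query_start) as "age",
       pg_stat_activity.procpid
from pg_stat_activity,
     pg_locks left outer join pg_class on (pg_locks.relation = pg_class.oid)
where pg_locks.pid = pg_stat_activity.procpid
order by query_start;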

Pasted below is the result output, filtered to only the "<IDLE> in transaction"
rows. For security reasons, I've masked/renamed the table and index names in the
"relname" column. Although every index shows up as "table1_xxxxx_indx", these are
all different indexes, not the same index repeated. Yes, we do have more than 30
indexes on table1.

Can somebody help me figure out what is going wrong and causing these "<IDLE> in
transaction" sessions?

 datname | relname            | transactionid | mode            | granted | usename | substr                | query_start                   | age             | procpid
---------+--------------------+---------------+-----------------+---------+---------+-----------------------+-------------------------------+-----------------+---------
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table2             |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_pk          |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_idx   |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx_indx  |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx__indx |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1             |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb |                    |     668748028 | ExclusiveLock   | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx__indx |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740
 prodndb | table1_xxxxx__indx |               | AccessShareLock | t       | dbuser  | <IDLE> in transaction | 2012-05-01 14:56:53.617912-04 | 19:20:59.644261 |   14740

However, no rows were returned by the 2nd query on the PostgreSQL wiki's Lock
Monitoring page.

We're running PostgreSQL v8.2.22 and pgpool v3.1.1 (only the connection pooling
feature is used).

Regards,
Gnanam
Jan Lentfer
2012-05-03 11:15:37 UTC
Post by Gnanakumar
Hi,
Recently, in our Production server, we found a "single query" being held up
[...]

Usually these are application problems, not database ones, e.g. a piece
of code not catching an error correctly, or breaking out of a loop, or
something like that, and leaving the transaction open (not issuing COMMIT
or ROLLBACK). I've seen this quite often in production, and it was always
traceable back to an application error.

To find out what the session last did before it started idling, you
would need to turn on statement logging. Then at least you might be able
to track down which part of your application caused this.
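
A minimal sketch of the settings involved (the log_line_prefix format shown is
just one reasonable choice, not something prescribed):

# in postgresql.conf; both settings take effect on a configuration reload
log_statement = 'all'           # log every statement the backends run
log_line_prefix = '%t [%p] '    # timestamp and backend PID, to match against procpid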


Jan
--
professional: http://www.oscar-consult.de
private: http://neslonek.homeunix.org/drupal/
Albe Laurenz
2012-05-03 14:06:05 UTC
Post by Gnanakumar
Recently, in our Production server, we found a "single query" being held up
select date_trunc('second', current_timestamp - query_start) as runtime,
datname as database_name, current_query from pg_stat_activity where
current_query != '<IDLE>' order by 1 desc
but we're clueless as to the root cause of this issue and are still hunting.
As we know, the query output doesn't show the actual query/statement.
You won't be able to find the cause in PostgreSQL.

The cause is a database session that started a transaction, did some
work and never closed the transaction.

PostgreSQL can help you find out who the offending client is:

SELECT application_name, client_addr, client_hostname, client_port
FROM pg_stat_activity
WHERE procpid = 14740;

(Replace 14740 with the process ID of the "idle in transaction" backend.)

Look on the client machine and find the process that holds TCP port
"client_port" open (on Linux you can use "lsof" for that).

Then you have found the culprit!

Yours,
Laurenz Albe
Gnanakumar
2012-05-04 09:52:47 UTC
Post by Albe Laurenz
SELECT application_name, client_addr, client_hostname, client_port
FROM pg_stat_activity
WHERE procpid = 14740;
(Replace 14740 with the process ID of the "idle in transaction" backend.)
Look on the client machine and find the process that holds TCP port
"client_port" open (on Linux you can use "lsof" for that).
Ours is a web-based application and all calls are made "only" from where the
web server is running. No external calls are allowed in my case. Hence,
the "only" client machine in my case is the web server.
Albe Laurenz
2012-05-04 11:08:18 UTC
Post by Gnanakumar
Post by Albe Laurenz
SELECT application_name, client_addr, client_hostname, client_port
FROM pg_stat_activity
WHERE procpid = 14740;
(Replace 14740 with the process ID of the "idle in transaction" backend.)
Look on the client machine and find the process that holds TCP port
"client_port" open (on Linux you can use "lsof" for that).
Post by Gnanakumar
Ours is a web-based application and all calls are made "only" from where the
web server is running. No external calls are allowed in my case. Hence,
the "only" client machine in my case is the web server.
Then you need to identify the web server process/thread that has
client_port open and fix the problem there. If you get that process
to close the transaction or terminate, your problem is solved.
You can also terminate a database session from the server with
the pg_terminate_backend function.
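
For example (a sketch only; the PID is the example one from earlier in this
thread, and the function has to exist in your server version):

-- terminate the backend that is sitting idle in a transaction
SELECT pg_terminate_backend(14740);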

The interesting thing to know would be where in your web application
a transaction got left open, so that it does not happen again.

Yours,
Laurenz Albe
Gnanakumar
2012-05-04 11:32:27 UTC
Post by Albe Laurenz
Then you need to identify the web server process/thread that has
client_port open and fix the problem there. If you get that process
to close the transaction or terminate, your problem is solved.
You can also terminate a database session from the server with
the pg_terminate_backend function.
Thanks for letting me know about the existence of the
"pg_terminate_backend(pid int)" function. Unfortunately, it is available only
from v8.4 onwards. We're running v8.2.22.
Post by Albe Laurenz
The interesting thing to know would be where in your web application
a transaction got left open, so that it does not happen again.
Sure, we too are hunting for that.
k***@rice.edu
2012-05-04 13:38:42 UTC
Post by Gnanakumar
Post by Albe Laurenz
SELECT application_name, client_addr, client_hostname, client_port
FROM pg_stat_activity
WHERE procpid = 14740;
(Replace 14740 with the process ID of the "idle in transaction" backend.)
Look on the client machine and find the process that holds TCP port
"client_port" open (on Linux you can use "lsof" for that).
Ours is a web-based application and all calls are made "only" from where the
web server is running. No external calls are allowed in my case. Hence,
the "only" client machine in my case is the web server.
You may also want to consider setting a statement_timeout to prevent this
until you can find the problem with the application.

Regards,
Ken
Kevin Grittner
2012-05-04 13:57:08 UTC
Post by k***@rice.edu
You may also want to consider setting a statement_timeout to
prevent this until you can find the problem with the application.
How would a statement timeout help close a transaction on an idle
connection? It's idle because no statements are being run.

-Kevin
k***@rice.edu
2012-05-04 15:05:53 UTC
Post by Kevin Grittner
Post by k***@rice.edu
You may also want to consider setting a statement_timeout to
prevent this until you can find the problem with the application.
How would a statement timeout help close a transaction on an idle
connection? It's idle because no statements are being run.
-Kevin
Yes, you are correct. I misunderstood. Too bad we do not have a
transaction_timeout.

Regards,
Ken
Thomas Markus
2012-05-08 06:24:41 UTC
Hi Gnanam,

I assume you're not ending your transactions with COMMIT or ROLLBACK. When
using a connection pooler, your connection is reused for the next request. So
if one request does something that initiates an implicit BEGIN (like an
UPDATE/INSERT) without a COMMIT or ROLLBACK, that connection remains in
IDLE IN TRANSACTION mode. Please ensure that you invoke a COMMIT after
each successful request, or a ROLLBACK after a failed one, or control this
inside your application. I'm not familiar with pgpool, but check whether you
can configure your pooler for this behaviour.
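
Roughly, the failure mode looks like this (a sketch only; the statement and
column names are made up, not taken from your application):

-- connection handed out by the pooler, autocommit off
UPDATE table1 SET some_column = 'x' WHERE id = 1;  -- driver issues an implicit BEGIN first
-- if the request finishes here without one of the following, the pooled
-- connection is left in "<IDLE> in transaction" until it is reused
COMMIT;       -- after a successful request
-- ROLLBACK;  -- after a failed one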

Thomas
Post by Gnanakumar
Hi,
Recently, in our Production server, we found a "single query" being held up
select date_trunc('second', current_timestamp - query_start) as runtime,
datname as database_name, current_query from pg_stat_activity where
current_query != '<IDLE>' order by 1 desc
but we're clueless as to the root cause of this issue and are still hunting.
As we know, the query output doesn't show the actual query/statement.
<snip>