Discussion:
[ADMIN] Question about ALTER TABLE DROP CONSTRAINT on 9.1 -- psql crashes
(too old to reply)
Tom Lane
2012-08-10 00:19:44 UTC
Permalink
We have a small database (few hundred megs of data, lass than half that in indexes) that suffers from index bloat. Currently we handle this with an hourly REINDEX command. This works but causes some small issues, so I have been expploring other methods.
usher=# BEGIN;
BEGIN
gusher=# ALTER TABLE ourcodes DROP CONSTRAINT IF EXISTS ourcodes_pkey CASCADE;
ALTER TABLE
gusher=# ALTER TABLE ourcodes ADD CONSTRAINT ourcodes_pkey PRIMARY KEY (id);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "ourcodes_pkey" for table "ourcodes"
ALTER TABLE
gusher=# COMMIT;
FATAL: terminating connection due to administrator command
WTF? That should certainly not happen. Have you perhaps got a script
that runs around sending SIGTERM to backends that it thinks are blocking
something? Does anything show up in the postmaster log when this
happens?

regards, tom lane
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Tom Lane
2012-08-10 02:06:57 UTC
Permalink
Every minute on the minute there is a cronjob that kills IDLE IN CONNECTION procs older than 3 seconds ... possibly that process killed the process I was running, but to me the user it seemed as if the COMMIT came before the disconnect.
That sure sounds like the source of your problem. It might be sane if
it killed only processes that *had been idle* for at least three
seconds, but I'm not sure there is any easy way to determine that ...

regards, tom lane
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Craig Ringer
2012-08-10 02:24:40 UTC
Permalink
Post by Tom Lane
Every minute on the minute there is a cronjob that kills IDLE IN CONNECTION procs older than 3 seconds ... possibly that process killed the process I was running, but to me the user it seemed as if the COMMIT came before the disconnect.
That sure sounds like the source of your problem. It might be sane if
it killed only processes that *had been idle* for at least three
seconds, but I'm not sure there is any easy way to determine that ...
wouldn't:

select * from pg_stat_activity
where current_query = '<IDLE> in transaction'
AND query_start < current_timestamp - INTERVAL '3 seconds';

do it?

--
Craig Ringer
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Tom Lane
2012-08-10 04:20:48 UTC
Permalink
Post by Craig Ringer
Post by Tom Lane
That sure sounds like the source of your problem. It might be sane if
it killed only processes that *had been idle* for at least three
seconds, but I'm not sure there is any easy way to determine that ...
select * from pg_stat_activity
where current_query = '<IDLE> in transaction'
AND query_start < current_timestamp - INTERVAL '3 seconds';
do it?
No, that would find sessions that were idle and whose last command
started at least 3 seconds ago. But it might have completed only
microseconds ago. The symptoms Greg is describing are consistent
with this kill script running during a short interval between his
index-build command and his COMMIT.

As of 9.2 there's a "state_change" timestamp column in pg_stat_activity
that would provide a safer check, but it's not in existing releases ...

regards, tom lane
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Craig Ringer
2012-08-10 05:08:24 UTC
Permalink
Post by Tom Lane
Post by Craig Ringer
Post by Tom Lane
That sure sounds like the source of your problem. It might be sane if
it killed only processes that *had been idle* for at least three
seconds, but I'm not sure there is any easy way to determine that ...
select * from pg_stat_activity
where current_query = '<IDLE> in transaction'
AND query_start < current_timestamp - INTERVAL '3 seconds';
do it?
No, that would find sessions that were idle and whose last command
started at least 3 seconds ago. But it might have completed only
microseconds ago. The symptoms Greg is describing are consistent
with this kill script running during a short interval between his
index-build command and his COMMIT.
Eh. For some reason I thought query_start was updated whenever
current_query was set, so it'd be updated when the session went idle.
Nonesensical in retrospect.

Does this approach look sane? Note idlers and then reap them if they
haven't started new queries next pass:

CREATE UNLOGGED TABLE reap (
procpid integer,
query_start timestamptz,
added timestamptz
);

-- Run every 20s or whatever to note idlers and when we noticed them
insert into reap (procpid,query_start,added)
select procpid, query_start, current_timestamp
from pg_stat_activity where current_query = '<IDLE> in transaction';

-- Clean out old entries and kill idlers
WITH r AS (
DELETE FROM reap r2
WHERE r2.added + INTERVAL '10 seconds' < current_timestamp
RETURNING *
)
SELECT r.procpid, pg_terminate_backend(r.procpid)
FROM r INNER JOIN pg_stat_activity s
ON (r.procpid = s.procpid AND r.query_start = s.query_start )
GROUP BY r.procpid;

ie "where we've noted more than 1 minute ago that a pid was idle, and
query_start hasn't changed so no new query has been run, kill the backend".

Of course, there's race where the backend could be starting a new query
even as you kill it for idleness, but that's true in any
idle-in-transaction control scheme (other than having a true transaction
timeout feature *within* the backend).

--
Craig Ringer
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Loading...