Discussion:
[ADMIN] constraint triggers
Maria L. Wilson
2011-09-28 12:54:35 UTC
Hi all....

I would like to start a dialogue and hear general feedback about the use
of constraint triggers in postgres (8.4.5).

Our overall issue is that using general triggers is causing slow inserts
(from locking issues) in our database. Here are some details:

A little background (jboss/j2ee/hibernate/linux).....
We have 3 basic triggers on a particular database table - one each for
inserts, updates, and deletes - and together they maintain a "granule
count" that is used in reporting. This field (gracount) is stored in
another table called dataset. The insert trigger/function looks like
this:

----------------------
CREATE OR REPLACE FUNCTION increment_dataset_granule_count()
  RETURNS trigger AS
$BODY$
BEGIN
  -- Skip granules flagged with visible_id = 5; otherwise bump the
  -- cached count on the matching dataset row.
  IF NEW.visible_id != 5 THEN
    UPDATE dataset
       SET gracount = gracount + 1
     WHERE dataset.inv_id = NEW.inv_id;
  END IF;
  RETURN NULL;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
ALTER FUNCTION increment_dataset_granule_count() OWNER TO jboss;

-- The trigger is created after the function it calls:
CREATE TRIGGER increment_dataset_granule_count
  AFTER INSERT ON inventory
  FOR EACH ROW
  EXECUTE PROCEDURE increment_dataset_granule_count();
-----------------------

What we found was that when these triggers fired we saw locking issues
that dramatically slowed inserts into the inventory table (the table the
triggers are on). You could see the inserts sitting and waiting by
looking at the pg_stat_activity view.

Within our app, the trigger fired within the same hibernate transaction
that a stateless session bean was using to persist/merge the granule
(inventory). Subsequently, in the same transaction, an EJB MBean was
merging the dataset changes, creating a kind of deadlock on the dataset
table.

Our first attempt at solving this has been to convert these triggers
into constraint triggers, which allow the DEFERRABLE INITIALLY DEFERRED
flags. We are finding that this forces the trigger function to run at
the end of the triggering transaction, just before commit. We believe
this will fix our locking problem and hopefully speed up our inserts
again.
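A sketch of that conversion, reusing the trigger and function names
above (the DROP/CREATE pair is an assumption about how you'd swap it
in; check the 8.4 CREATE CONSTRAINT TRIGGER docs for the exact
options):

```sql
-- Drop the plain trigger and recreate it as a deferrable constraint
-- trigger.  With DEFERRABLE INITIALLY DEFERRED the per-row firings are
-- queued and run at the end of the transaction, just before COMMIT,
-- instead of immediately after each insert.
DROP TRIGGER increment_dataset_granule_count ON inventory;

CREATE CONSTRAINT TRIGGER increment_dataset_granule_count
  AFTER INSERT ON inventory
  DEFERRABLE INITIALLY DEFERRED
  FOR EACH ROW
  EXECUTE PROCEDURE increment_dataset_granule_count();
```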

Any comments or past experiences would certainly be helpful!

thanks,

Maria Wilson
NASA/Langley Research Center
Hampton, Virginia 23681
Gilberto Castillo Martínez
2011-09-28 14:26:29 UTC
Post by Maria L. Wilson
CREATE OR REPLACE FUNCTION increment_dataset_granule_count()
RETURN NULL;
Replace RETURN NULL with RETURN NEW.
--
Saludos,
Gilberto Castillo
Edificio Beijing. Miramar Trade Center. Etecsa.
Miramar, La Habana.Cuba.
Kevin Grittner
2011-09-28 14:43:38 UTC
Post by Gilberto Castillo Martínez
Post by Maria L. Wilson
CREATE OR REPLACE FUNCTION increment_dataset_granule_count()
RETURN NULL;
Replace RETURN NULL with RETURN NEW.
That won't matter for an AFTER trigger:

http://www.postgresql.org/docs/9.0/interactive/trigger-definition.html

says:

| The return value is ignored for row-level triggers fired after an
| operation, and so they can return NULL.

-Kevin
--
Sent via pgsql-admin mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Craig Ringer
2011-09-29 02:39:55 UTC
Post by Maria L. Wilson
UPDATE dataset
SET gracount = gracount + 1
WHERE dataset.inv_id = NEW.inv_id;
That'll serialize all transactions that touch the same inv_id, so only
one may run at once. The second and subsequent ones will block waiting
for an update lock on the `dataset' tuple for `inv_id'.
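A minimal sketch of the contention in two concurrent psql sessions
(the inv_id value and column list are hypothetical):

```sql
-- Session 1:
BEGIN;
INSERT INTO inventory (inv_id, visible_id /* , ... */)
     VALUES (42, 1 /* , ... */);
-- The AFTER trigger's UPDATE now holds a row lock on the dataset
-- tuple for inv_id = 42, held until COMMIT or ROLLBACK.

-- Session 2, at the same time:
BEGIN;
INSERT INTO inventory (inv_id, visible_id /* , ... */)
     VALUES (42, 2 /* , ... */);
-- Blocks here: its trigger's UPDATE waits on session 1's row lock.
-- This is what shows up as waiting inserts in pg_stat_activity.
```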

When you think about it, that's necessary to prevent a wrong result
when transactions A and then B run this statement, transaction B
commits, and transaction A rolls back. What's the correct answer then?

To fix this you'll need to change how you maintain your `dataset' table.
Exactly how depends on your requirements.

You can trade read performance off against write performance by
INSERTing new rows instead of UPDATEing them, so you do a:

SELECT count(inv_id) FROM dataset WHERE dataset.inv_id = ?

instead of a

SELECT gracount FROM dataset WHERE dataset.inv_id = ?

to retrieve your count.
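In trigger form that might look like the following sketch. The
function name is hypothetical, and it assumes `dataset' is
restructured to hold one row per visible granule rather than a
counter:

```sql
-- Append-only: each granule adds its own row, so concurrent inserts
-- never fight over one shared counter tuple.
CREATE OR REPLACE FUNCTION record_dataset_granule()
  RETURNS trigger AS
$BODY$
BEGIN
  IF NEW.visible_id != 5 THEN
    INSERT INTO dataset (inv_id) VALUES (NEW.inv_id);
  END IF;
  RETURN NULL;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE;

-- Readers pay the aggregation cost instead:
--   SELECT count(inv_id) FROM dataset WHERE dataset.inv_id = ?;
```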

You can have both fast reads and fast writes if you accept potentially
out-of-date `gracount' data, maintaining `gracount' as a materialized
view that you refresh using LISTEN/NOTIFY *after* a transaction commits.
It's possible for it to be a little out of date, but writers no longer
interfere with each other and readers no longer have to re-do the
counting/aggregation work.
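A sketch of that shape, with hypothetical channel names. Note that on
8.4 NOTIFY carries no payload (pg_notify with a payload arrives in
9.0), so the listener simply recounts; it also assumes gracount is
meant to equal the number of visible inventory rows per inv_id:

```sql
-- In the trigger, replace the UPDATE with a bare notification, which
-- is only delivered once the writing transaction commits:
--   NOTIFY granule_count_stale;

-- A separate refresher session subscribes:
LISTEN granule_count_stale;

-- On each notification it recounts, outside the writers' transactions:
UPDATE dataset d
   SET gracount = (SELECT count(*)
                     FROM inventory i
                    WHERE i.inv_id = d.inv_id
                      AND i.visible_id != 5);
```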

You can live with serializing writes like you currently do in exchange
for the greater read performance of maintaining counts. This may not be
so bad once you understand what's happening and can keep transactions
that trigger this problem short, preventing them from blocking others
while they do unrelated work.

In the end, this looks like a simple problem but it isn't when you
consider the possibility of transactions rolling back.
Post by Maria L. Wilson
Our first try to solve this problem has been to convert these triggers
into a constraint trigger which allows for DEFERRABLE INITIALLY DEFERRED
flags. This, we are finding, is forcing the trigger function to run
after the triggering transaction is completed. We believe this will fix
our locking problem and hopefully speed up our inserts again.
That should help, but it's a form of trading timeliness off against
performance. Queries within the same transaction won't see the updated
`dataset' values, so if you're relying on them for correct operation
later in the transaction you could have issues.

--
Craig Ringer
Maria L. Wilson
2011-09-29 16:24:13 UTC
thanks Craig - we are doing some testing with some of your ideas
now.... Hopefully we can get this solved so that using triggers
doesn't cause so much contention. I understand that foreign keys can
cause similar locking issues. Do you have any thoughts on that?

thanks again, Maria Wilson
Post by Craig Ringer
That'll serialize all transactions that touch the same inv_id, so only
one may run at once.