Discussion:
diskspace
(too old to reply)
Rosser Schwarz
2013-02-05 19:36:45 UTC
Permalink
Additionally, postgres has a number of "metadata" columns (e.g., xmin,
xmax, cmin, cmax, &c). Those can add up, particularly when their net size
is greater than the user data size of a row.

rls
I'm trying to migrate a database from MySQL to postgres and I'm
struggling with the amount of
diskspace the resulting db takes.
I may be missing a setting somewhere but I can't see one anywhere
obvious (apologies if I'm being
stupid...)
Even postgres' idea of the size of the columns don't match up to its
own reported size of the data.
CREATE TABLE result (
st_id integer NOT NULL,
log smallint NOT NULL,
"time" integer NOT NULL,
token character(4) NOT NULL,
data character varying(500) DEFAULT NULL::character varying
);
# SELECT pg_size_pretty(sum(pg_column_size(data) +
pg_column_size(st_id) + pg_column_size(log) +
pg_column_size(token) + pg_column_size(time))) FROM result;
pg_size_pretty
----------------
178 MB
(1 row)
# SELECT pg_size_pretty(pg_relation_size('result'));
pg_size_pretty
----------------
613 MB
(1 row)
I'd naively expected these two figures to be similar.
I've run vacuum analyze and it made no difference (not a major surprise
because all I've done so far
is create the database and sequentially insert the data into the
tables).
I expected a little overhead from what I'd read before the migration
but that's a fairly huge
difference.
As I said, sorry if I've missed the obvious "use loads of extra space"
setting but I'd appreciate any
suggestion as to what that setting might be called :)
I don't think that pg_column_size() is a good tool to
measure table size.
I'd suggest that you use pg_table_size for the table itself
and pg_indexes_size for the size of ist indexes.
That should come close to the amount of disk space taken.
Agreed, don't forget, you have indexes, free space, vacuum-able stuff,
etc... all laying in your datafiles. Your measurements are telling you
what you have purely in a raw form.
--Scott Mead
http://www.openscg.com
Yours,
Laurenz Albe
--
http://www.postgresql.org/mailpref/pgsql-admin
--
:wq
Geoff Winkless
2013-02-05 20:33:34 UTC
Permalink
Thanks for all the responses....
Post by Rosser Schwarz
Agreed, don't forget, you have indexes, free space, vacuum-able stuff,
etc... all laying in your datafiles. Your measurements are telling you
what you have purely in a raw form.
Indexes are stored separately. There should be no free space (why would
there be? FILLFACTOR is 100!) and I've vacuumed already.

Additionally, postgres has a number of "metadata" columns (e.g., xmin,
Post by Rosser Schwarz
xmax, cmin, cmax, &c). Those can add up, particularly when their net size
is greater than the user data size of a row.
That's a fair point.

611MB (actual disk size) - 138MB (summed data size) = 483MB
11M rows (ish).

That's nearly 48 bytes extra per row.

http://www.postgresql.org/docs/9.2/static/storage-page-layout.html gives
detail...

Let's say around 249MB (23 bytes per row, according to that page) for the
columns you mention, so that leaves 234MB unexplained.

I can see 44 bytes per page header (given 5 columns, so 20 bytes
ItemIdData, with 24 bytes PageHeaderData). Given page size of 8kb that
would imply page headers of about 3.3MB over a 611MB table.

Even with some bytes for alignment I'm Still not seeing where 230MB (more
than the space taken by the data itself) of my disk is gone.

Thanks!

Geoff

Loading...