Received: from localhost (unknown [200.46.204.184]) by postgresql.org (Postfix) with ESMTP id A66B72E335D for ; Fri, 14 Mar 2008 12:24:12 -0300 (ADT) Received: from postgresql.org ([200.46.204.71]) by localhost (mx1.hub.org [200.46.204.184]) (amavisd-maia, port 10024) with ESMTP id 89566-01-9 for ; Fri, 14 Mar 2008 12:23:59 -0300 (ADT) X-Greylist: from auto-whitelisted by SQLgrey-1.7.5 Received: from mail.transas.com (mail.transas.com [82.140.75.150]) by postgresql.org (Postfix) with ESMTP id 7D1B32E3163 for ; Fri, 14 Mar 2008 12:23:33 -0300 (ADT) Received: from sol.transas.com (sol.transas.com [10.10.0.20]) by mail.transas.com (Postfix) with ESMTP id 50C4E500714; Fri, 14 Mar 2008 18:23:22 +0300 (MSK) Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C885E7.56CA5C26" X-MimeOLE: Produced By Microsoft Exchange V6.5 Subject: Re: pg_total_relation_size() and CHECKPOINT Date: Fri, 14 Mar 2008 18:23:22 +0300 Message-ID: <528853D3C5ED2C4AA8990B504BA7FB850106DF13@sol.transas.com> In-Reply-To: <439.1205439630@sss.pgh.pa.us> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [DOCS] pg_total_relation_size() and CHECKPOINT Thread-Index: AciFR8wzKZsRgQfoRO+rmBDARpbSCAAmDfdg From: "Zubkovsky, Sergey" To: "Tom Lane" Cc: X-Transas-MailScanner: Found to be clean X-Transas-MailScanner-From: sergey.zubkovsky@transas.com X-Transas-MailScanner-To: pgsql-docs@postgresql.org, tgl@sss.pgh.pa.us X-Virus-Scanned: Maia Mailguard 1.0.1 X-Archive-Number: 200803/42 X-Sequence-Number: 4854 This is a multi-part message in MIME format. ------_=_NextPart_001_01C885E7.56CA5C26 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hi, =20 Here is my example. We are creating 2 tables: =20 create table t1 ( a int, b varchar( 30 ) ); create table t1_arh ( c text ); =20 and filling 't1': =20 insert into t1 select generate_series(1, 100000 ), generate_series(1, 100000 ); =20 The "arch_table_sp" user-function will be used for extracting data from 't1' and archiving it to 't1_arh'. =20 CREATE OR REPLACE FUNCTION "arch_table_sp" ( tblName name, arcTblName name ) RETURNS void AS $BODY$=20 DECLARE fn text; =20 chunk CONSTANT bigint :=3D 512*1024; off bigint :=3D 0; rdBytes bigint; buf text; =20 BEGIN SELECT setting INTO STRICT fn FROM pg_settings WHERE name =3D 'data_directory'; fn :=3D fn || '/tbldata.txt'; =20 PERFORM pg_file_unlink( fn ); =20 EXECUTE 'COPY ( SELECT * FROM ' || quote_ident( tblName ) || ' ) TO ' || quote_literal( fn ); =20 EXECUTE ' CREATE OR REPLACE FUNCTION "__InsertChunk__sp" ( data text ) RETURNS void AS $_$ INSERT INTO ' || quote_ident( arcTblName ) || ' ( c ) VALUES ( $1 ); $_$ LANGUAGE sql;'; =20 LOOP buf :=3D pg_file_read( fn, off, chunk ); =20 rdBytes :=3D length( buf ); =20 IF ( rdBytes > 0 ) THEN PERFORM "__InsertChunk__sp"( buf ); off :=3D off + rdBytes; END IF; =20 EXIT WHEN ( rdBytes <> chunk ); END LOOP; =20 PERFORM pg_file_unlink( fn ); END; $BODY$ LANGUAGE plpgsql; =20 Now we are executing the following statements in one transaction: =20 select "arch_table_sp"( 't1', 't1_arh' ); select pg_total_relation_size( 't1_arh' ); =20 The result is 417792 (in the general case it may be another value, for example, I received 303104, 573440 and etc). If we are executing these statements in separate transactions with a couple of seconds between them than we have received another value: =20 truncate table t1_arh; =20 select "arch_table_sp"( 't1', 't1_arh' ); =20 select pg_total_relation_size( 't1_arh' ); =20 The result is 688128! With explicit CHECKPOINT we will have one more value: =20 truncate table t1_arh; =20 select "arch_table_sp"( 't1', 't1_arh' ); CHECKPOINT; select pg_total_relation_size( 't1_arh' ); =20 The result is 696320! =20 It would be interesting why we have such results... =20 It's obviously that CHECKPOINT is not a good decision. Can you suggest some other approach instead of explicit CHECKPOINT? =20 Sorry for my English. I hope this example is quite clear. =20 =20 Thanks in advance, Zubkovsky Sergey =20 =20 -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us]=20 Sent: Thursday, March 13, 2008 11:21 PM To: Zubkovsky, Sergey Cc: pgsql-docs@postgresql.org Subject: Re: [DOCS] pg_total_relation_size() and CHECKPOINT =20 "Zubkovsky, Sergey" writes: > I've detected that a result value of pg_total_relation_size() for an > actively updated table might be significantly differ from a result that > is returned after explicit CHECKPOINT command execution. =20 Uh, can you show a specific example of what you mean? =20 > I understand the reasons of such behavior: cache buffers must be flushed > in order to be sure that pg_total_relation_size() result will be like we > expect. =20 I wouldn't think so. The space for a page is allocated immediately when needed --- its *contents* might not be up to date, but that shouldn't affect pg_total_relation_size. =20 regards, tom lane ------_=_NextPart_001_01C885E7.56CA5C26 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

Hi,

 

Here is my = example.

We are creating 2 = tables:

 

create table t1 ( a int, b = varchar( 30 ) );

create table t1_arh ( c text = );

 

and filling = 't1':

 

insert into t1 select = generate_series(1, 100000 ), generate_series(1, 100000 );

 

The = "arch_table_sp" user-function will be used for extracting data from ‘t1’ and archiving it = to ‘t1_arh’.

 

CREATE OR REPLACE FUNCTION "arch_table_sp" ( tblName name, arcTblName name = )

RETURNS = void

AS $BODY$ =

DECLARE

    fn = text;

 

    chunk = CONSTANT bigint :=3D 512*1024;

    off bigint = :=3D 0;

    rdBytes = bigint;

    buf = text;

    =

BEGIN

    SELECT = setting INTO STRICT fn FROM pg_settings WHERE name =3D = 'data_directory';

    fn :=3D fn || '/tbldata.txt';

    =

    PERFORM = pg_file_unlink( fn );

    =

    EXECUTE 'COPY = ( SELECT * FROM ' || quote_ident( tblName ) || ' ) TO ' || quote_literal( fn = );

 

    EXECUTE = '

CREATE OR REPLACE FUNCTION "__InsertChunk__sp" ( data text )

  RETURNS void = AS

$_$

    INSERT INTO ' = || quote_ident( arcTblName ) || ' ( c ) VALUES ( $1 = );

$_$

  LANGUAGE = sql;';

 

    LOOP

        buf :=3D pg_file_read( fn, off, chunk );

 

        rdBytes :=3D length( buf );

 

        IF ( rdBytes > 0 ) THEN

       &nbs= p;    PERFORM "__InsertChunk__sp"( buf = );

       &nbs= p;    off :=3D off + rdBytes;

        END IF;

 

        EXIT WHEN ( rdBytes <> chunk );

    END = LOOP;

 

    PERFORM = pg_file_unlink( fn );

END;

$BODY$ LANGUAGE = plpgsql;

 

Now we are executing the = following statements in one transaction:

 

select = "arch_table_sp"( 't1', 't1_arh' );

select pg_total_relation_size( = 't1_arh' );

 

The result is 417792 (in = the general case it may be another value, for example, I received 303104, = 573440 and etc).

If we are executing these statements in separate transactions with a couple of seconds between = them than we have received another value:

 

truncate table = t1_arh;

 

select = "arch_table_sp"( 't1', 't1_arh' );

 

select pg_total_relation_size( = 't1_arh' );

 

The result is = 688128!

With explicit CHECKPOINT = we will have one more value:

 

truncate table = t1_arh;

 

select = "arch_table_sp"( 't1', 't1_arh' );

CHECKPOINT;

select pg_total_relation_size( = 't1_arh' );

 

The result is = 696320!

 

It would be interesting = why we have such results...

 

It’s obviously that = CHECKPOINT is not a good decision.

Can you suggest some = other approach instead of explicit CHECKPOINT?

 

Sorry for my = English.

I hope this example is = quite clear.

 

 

Thanks in = advance,

Zubkovsky = Sergey

 

 

-----Original Message-----
From: Tom = Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Thursday, March 13, 2008 11:21 PM
To: Zubkovsky, Sergey
Cc: pgsql-docs@postgresql.org
Subject: Re: [DOCS] pg_total_relation_size() and = CHECKPOINT

 

"Zubkovsky, = Sergey" <Sergey.Zubkovsky@transas.com> = writes:

> I've detected that a result value of = pg_total_relation_size() for an

> actively updated table might be significantly differ from a = result that

> is returned after explicit CHECKPOINT command = execution.

 

Uh, can you show a specific example of what you = mean?

 

> I understand the reasons of such behavior: cache buffers = must be flushed

> in order to be sure that pg_total_relation_size() result = will be like we

> expect.

 

I wouldn't think so.  The space for a page is allocated immediately when

needed --- its *contents* might not be up to date, but that = shouldn't

affect pg_total_relation_size.

 

           = ;       regards, tom lane

------_=_NextPart_001_01C885E7.56CA5C26--