Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.80) (envelope-from ) id 1ZKUZG-0007jE-Mz for pgsql-docs@arkaria.postgresql.org; Wed, 29 Jul 2015 16:50:58 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84) (envelope-from ) id 1ZKUZG-0007wm-9j for pgsql-docs@arkaria.postgresql.org; Wed, 29 Jul 2015 16:50:58 +0000 Received: from makus.postgresql.org ([2001:4800:1501:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84) (envelope-from ) id 1ZKUYv-0007H8-4q for pgsql-docs@postgresql.org; Wed, 29 Jul 2015 16:50:37 +0000 Received: from smtp-auth.no-ip.com ([8.23.224.61] helo=out.smtp-auth.no-ip.com) by makus.postgresql.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84) (envelope-from ) id 1ZKUYq-0008P1-WB for pgsql-docs@postgresql.org; Wed, 29 Jul 2015 16:50:35 +0000 X-No-IP: alvh.no-ip.org@noip-smtp X-Report-Spam-To: abuse@no-ip.com Received: from alvin.alvh.no-ip.org (unknown [181.226.142.114]) (Authenticated sender: alvh.no-ip.org@noip-smtp) by smtp-auth.no-ip.com (Postfix) with ESMTPA id 3C61D400E12; Wed, 29 Jul 2015 09:50:19 -0700 (PDT) Received: by alvin.alvh.no-ip.org (Postfix, from userid 1000) id D5CDA333; Wed, 29 Jul 2015 13:50:15 -0300 (CLT) Date: Wed, 29 Jul 2015 13:50:15 -0300 From: Alvaro Herrera To: Craig Ringer Cc: pgsql-docs@postgresql.org, Robert Haas Subject: Re: bgworker / SPI docs patches Message-ID: <20150729165015.GC2437@postgresql.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-Pg-Spam-Score: 0.4 (/) List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-docs Precedence: bulk Sender: pgsql-docs-owner@postgresql.org I wrote this a very long time ago but apparently it was eaten by a spam filter before it reached the lists. Resending. Craig Ringer wrote: > From c73d1303cb3474ccc799eeda252eeef1fc3d9026 Mon Sep 17 00:00:00 2001 > From: Craig Ringer > Date: Mon, 16 Feb 2015 18:17:48 +1300 > Subject: [PATCH 1/4] Add xreflabel to most chapter entries So I think this is pretty neat as a general idea, but there was opposition in that printed materials would no longer have a useful pointer. If we could tweak the output so that it says something like "Chapter 33, Herding Elephants" instead of just "Chapter 33" (current) or just "Herding Elephants" (your proposal), I think the whole patch would be great. Now I don't really know how to do this; at the very least we will need to edit stylesheet.dsl and stylesheet.xsl, I think. So dbl1en.dsl defines en-xref-strings for chapters like (list (normalize "chapter") (if %chapter-autolabel% "&Chapter; %n" "the &chapter; called %t")) if we could change that to, say, "&Chapter; %n (“%t”)" which I think would render as Chapter 33 (“Herding Elephants”) we'd be done in the DSSSL side. No idea how to actually do it, though ... Peter, any thoughts? In XSL, there's xref-number which is presumably what we use today, but there's also xref-number-and-title. I think it's just a matter of changing xrefstyle. > From 6e2103b211aee44780c6106e307ea4a561d11fde Mon Sep 17 00:00:00 2001 > From: Craig Ringer > Date: Mon, 16 Feb 2015 18:17:59 +1300 > Subject: [PATCH 2/4] Document how to build with the website style +1 for this. > diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml > index 6eada92..6437ab1 100644 > --- a/doc/src/sgml/xfunc.sgml > +++ b/doc/src/sgml/xfunc.sgml > @@ -3307,6 +3307,15 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray > > Shared Memory and LWLocks > > + > + Shared memory in extensions > + I would do this as shared memoryin extensions instead. > + > + LWLocks (extensions) > + Lightweight locks (extensions) > + I think this is two entries, lightweight locks in extensions lwlocks lightweight locks > From 04e1cbe49268e22886e2b30fe412fef0cb9dcc65 Mon Sep 17 00:00:00 2001 > From: Craig Ringer > Date: Mon, 16 Feb 2015 18:37:19 +1300 > Subject: [PATCH 4/4] Document SPI use from bgworkers in greater detail This is the meat of the submission, isn't it. > diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml > index ef28f72..d0e7dc0 100644 > --- a/doc/src/sgml/bgworker.sgml > +++ b/doc/src/sgml/bgworker.sgml > @@ -34,18 +34,21 @@ > PostgreSQL is started by including the module name in > shared_preload_libraries. A module wishing to run a background > worker can register it by calling > + RegisterBackgroundWorker > RegisterBackgroundWorker(BackgroundWorker *worker) > from its _PG_init(). Background workers can also be started > after the system is up and running by calling the function > + RegisterDynamicBackgroundWorker > RegisterDynamicBackgroundWorker(BackgroundWorker > *worker, BackgroundWorkerHandle **handle). Unlike > RegisterBackgroundWorker, which can only be called from within > the postmaster, RegisterDynamicBackgroundWorker must be > - called from a regular backend. > + called from a regular backend (which might be another background worker). > I don't think we typically attach tags within paragraphs; most commonly they are outside or other block items. I imagine this works fine, but it's probably best to continue tradition unless there's a specific reason to do otherwise. > > bgw_flags is a bitwise-or'd bit mask indicating the > - capabilities that the module wants. Possible values are > - BGWORKER_SHMEM_ACCESS (requesting shared memory access) > - and BGWORKER_BACKEND_DATABASE_CONNECTION (requesting the > - ability to establish a database connection, through which it can later run > - transactions and queries). A background worker using > - BGWORKER_BACKEND_DATABASE_CONNECTION to connect to > - a database must also attach shared memory using > - BGWORKER_SHMEM_ACCESS, or worker start-up will fail. > + capabilities that the module wants. Possible values are: > + Using a rather than prose seems much better to me. > > - bgw_main is a pointer to the function to run when > - the process is started. This function must take a single argument of type > - Datum and return void. > - bgw_main_arg will be passed to it as its only > - argument. Note that the global variable MyBgworkerEntry > - points to a copy of the BackgroundWorker structure > - passed at registration time. bgw_main may be > - NULL; in that case, bgw_library_name and > - bgw_function_name will be used to determine > - the entry point. This is useful for background workers launched after > - postmaster startup, where the postmaster does not have the requisite > - library loaded. > + bgw_main specifies the function to run when the > + background worker is started. If non-NULL, > + bgw_main will take precedence over > + bgw_library_name and > + bgw_function_name. > + > + > + Use of this field is deprecated. It should be set to > + NULL then bgw_library_name > + and bgw_function_name should be used instead. > + > + > + Specifying the background worker entry point using a function pointer will > + not work correctly on Windows (or with EXEC_BACKEND > + defined). It may also malfunction when used to point to a function in a > + dynamically loaded shared library used by a dynamic background worker. > + > + > Interesting. Given the drawbacks of the old mechanism, I am okay with this change. In fact, why don't we remove bgw_main completely? > + > bgw_notify_pid is the PID of a PostgreSQL > backend process to which the postmaster should send SIGUSR1 > - when the process is started or exits. It should be 0 for workers registered > - at postmaster startup time, or when the backend registering the worker does > + when the process is started or exits, causing the notified process's latch > + to be set. It should be 0 for workers registered > + at postmaster startup time, when the backend registering the worker does > not wish to wait for the worker to start up. Otherwise, it should be > initialized to MyProcPid. > I'm not sure about documenting that SIGUSR1 will set a process' latch. That can be changed in that process' signal handler. Maybe this is fine if we add the word "typically" somewhere. > diff --git a/doc/src/sgml/example-worker-spi.sgml b/doc/src/sgml/example-worker-spi.sgml > new file mode 100644 > index 0000000..63d8f3d > --- /dev/null > +++ b/doc/src/sgml/example-worker-spi.sgml > @@ -0,0 +1,26 @@ > + > + > + > + worker-spi example > + > + > + Background worker with SPI > + SPI in background worker > + I don't think this module really needs a doc page, but okay. In any case, I think these aren't good; one of them should be split in and the other one should probably be a . > @@ -4092,6 +4097,143 @@ INSERT INTO a SELECT * FROM a; > > > > + > + Using the SPI from background workers > + > + > + Using the SPI from background workers > + requires some additional steps that are not required from SPI-using > + procedures invoked via SQL from normal user backends. When using the SPI > + from an SQL-callable function it is safe to assume that a transaction is > + already open, there is already a snapshot, and that PostgreSQL's statistics > + have been updated. None of these things may be assumed for use of the SPI > + from a bgworker. > + > + > + > + To safely use the SPI from a bgworker you must ensure that: > + > + A transaction is open (in progress) > + The SPI is connected > + There is an active snapshot or read_only is set to false in SPI calls > + > + You should also update the statement start and end timestamps for use in > + pg_stat_activity using pgstat_report_activity. > + > + > + > + Attempting to use the SPI without an open transaction or using the SPI > + in read_only mode without setting a snapshot will > + generally not cause obvious failures or, in a > + --enable-cassert build, assertions. Instead it will > + usually appear to work for simple cases but may produce incorrect > + results or be unstable. Users of background workers should generally > + prefix SPI use with > + > +Assert(IsTransactionState()); > + > + if the function does not its self establish a transaction. > + +1 to the general idea of adding this section. "its self"? > + > + Typical SPI use from a background worker involves setting up a transaction, > + setting the statement start timestamp, connecting to the SPI, updating > + pg_stat_activity, establishing a snapshot, running the query/queries, > + then performing matching cleanup actions. For example: > + Can't we just refer to the source code of worker_spi? That's the whole point of that module. Maybe we can add lots more comment to that, to explain the necessity of each line there. > + > + In more complex cases the function may need to check if a transaction is > + already open with IsTransactionState() and only perform > + transaction management if there is no current transaction. If the SPI may > + already connected by the caller the function should use > + SPI_push_conditional before SPI_connect and > + SPI_pop_conditional after SPI_finish. Snapshot > + state may also be conditionally handled. The resulting procedure will > + look more like: If we need a more complex example in worker_spi, let's add one. > diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml > index 6437ab1..39e6a15 100644 > --- a/doc/src/sgml/xfunc.sgml > +++ b/doc/src/sgml/xfunc.sgml > @@ -3357,6 +3357,60 @@ if (!ptr) > } > > > + > + > + Extensions registered to run at startup > + using should instead install a > + shmem_startup_hook from _PG_init. This will be > + invoked once shared memory is ready but before user backends or background > + workers get launched. The hook function invoked should use a pattern like > + that shown for shared memory initialization above then call the next hook > + function, e.g. declare: > + > +static shmem_startup_hook_type prev_shmem_startup_hook = NULL; > + > + then from _PG_init, after any > + RequestAddinShmemSpace and RequestAddinLWLocks > + install the hook: > + > + prev_shmem_startup_hook = shmem_startup_hook; > + shmem_startup_hook = my_worker_shmem_startup; > + > + where my_worker_shmem_startup is a void function > + with void arguments that calls the next hook (if any): > + > + if (prev_shmem_startup_hook != NULL) > + prev_shmem_startup_hook(); > + > + then calls ShmemInitStruct per the example shown above. An > + example of this usage can be found in the initialisation of the + linkend="pgstatstatements"> extension. > + No problem with this. > + > + Shared memory is zeroed on postmaster restart > + > + If the postmaster restarts - usually due to the crash of a backend > + process like a user backend or a background worker - then it zeroes > + shared memory on restart then re-invokes any shared memory setup hooks > + that have been installed by extensions. Extensions using shared memory > + must be prepared for this. > + > + > + Background workers are not unregistered if the postmaster restarts > + due to the crash of a background worker or user backend. This can lead > + to background workers accessing shared memory that has been reinitialized. > + Care must be taken to ensure that initialisation of shared memory produces > + stable results when re-run or workers must detect the postmaster restart > + using a generation counter and exit. In particular if parameters are being > + passed to a bgworker via shared memory the background worker > + bgw_main_arg should always be an index into an > + array allocated in shared memory, rather than a pointer, and repeated > + initialisations of the shared memory segment must produce the same range > + of valid indexes in the same order. > + > + > + > Hm. Aren't all bgworkers also restarted if postmaster power-cycles everything due to a crash in any process? ISTM that they should be, and if not, that's a bug. Please use — for long dashes, but I think the whole block should go away. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-docs