public inbox for [email protected]  
help / color / mirror / Atom feed
Improve warnings around CREATE INDEX CONCURRENTLY
10+ messages / 4 participants
[nested] [flat]

* Improve warnings around CREATE INDEX CONCURRENTLY
@ 2011-05-24 07:56 Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Greg Smith @ 2011-05-24 07:56 UTC (permalink / raw)
  To: pgsql-docs

When running CREATE INDEX CONCURRENTLY, the DefineIndex() code in 
src/backend/commands/indexcmds.c does a few things that one would expect 
from the documentation.  And then at the end it executes code described 
like this:

"The index is now valid in the sense that it contains all currently 
interesting tuples.  But since it might not contain tuples deleted just 
before the reference snap was taken, we have to wait out any 
transactions that might have older snapshots.  Obtain a list of VXIDs of 
such transactions, and wait for them individually."

It's possible to end up with a long series in pg_locks waiting for 
virtualxid entries at this point, for as long as some set of giant 
queries takes to execute, and you'll only see them one at a time.  The 
documentation warns:

"PostgreSQL must perform two scans of the table, and in addition it must 
wait for all existing transactions that could potentially use the index 
to terminate."

That's correct, but easy to read the wrong way.  I always assumed that 
this meant it was going to wait behind anything that had a shared lock 
or such on the table, things that had already accessed it.  This is the 
case with some earlier parts of this same code path.  But when it comes 
to the end here, the scope is actually broader than that.  And since 
there's a session-level lock on the table the whole time this wait loop 
is executing, that makes considerable secondary havoc possible here.  
You can end up waiting an unbounded amount of time for some long-running 
transaction, one not even expected to enter into the rebuild work, to 
finish, which leaves you no idea what's happening unless you know just 
what to look for.  (Watching such havoc actually happen is what prompted 
this investigation)

What makes it worse is that the wait shows up as a virtualxid one, which 
doesn't pop up on many common samples of things to look for in 
pg_locks.  It would be reasonable but also incorrect for admins to 
assume a table one would be visible if running into the case alluded to 
in the docs.  The serial way locks are obtained is unexpected too.

Attached patch expands the warnings around this command to reflect both 
issues:

-The waiting time is not necessarily limited to just things that involve 
the table
-The locks it obtains while running this phase of the rebuild are unusual

-- 
Greg Smith   2ndQuadrant US    [email protected]   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us




Attachments:

  [text/x-patch] concurrent-index-wait.patch (1.3K, 2-concurrent-index-wait.patch)
  download | inline diff:
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 43b6499..cb3334b 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -381,7 +381,17 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ <replaceable class="parameter">name</
    <para>
     In a concurrent index build, the index is actually entered into the
     system catalogs in one transaction, then the two table scans occur in a
-    second and third transaction.
+    second and third transaction.  All active transactions at the time the
+    second table scan starts, not just ones that already involve the table,
+    have the potential to block the concurrent index creation for however
+    long it takes for them to finish.  When checking for transactions that
+    could still use the original index, concurrent index creation advances
+    through potentially interfering older transactions one at a time,
+    obtaining locks on their virtual transaction identifiers to wait for
+    them to complete.
+   </para>
+
+   <para>
     If a problem arises while scanning the table, such as a
     uniqueness violation in a unique index, the <command>CREATE INDEX</>
     command will fail but leave behind an <quote>invalid</> index. This index


^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
@ 2011-05-24 20:48 ` Alvaro Herrera <[email protected]>
  2011-05-24 21:06   ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Alvaro Herrera @ 2011-05-24 20:48 UTC (permalink / raw)
  To: Greg Smith <[email protected]>; +Cc: pgsql-docs

Excerpts from Greg Smith's message of mar may 24 03:56:59 -0400 2011:

> What makes it worse is that the wait shows up as a virtualxid one, which 
> doesn't pop up on many common samples of things to look for in 
> pg_locks.  It would be reasonable but also incorrect for admins to 
> assume a table one would be visible if running into the case alluded to 
> in the docs.  The serial way locks are obtained is unexpected too.

Incidentally, this is one of the things that Jim Nasby wanted exposed
somehow so that these problems are more easily diagnosed.  I dropped the
ball on that one, but I'll be picking it up again at some point.

-- 
Álvaro Herrera <[email protected]>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
@ 2011-05-24 21:06   ` Greg Smith <[email protected]>
  2011-05-25 17:57     ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Greg Smith @ 2011-05-24 21:06 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: pgsql-docs

On 05/24/2011 04:48 PM, Alvaro Herrera wrote:
> Excerpts from Greg Smith's message of mar may 24 03:56:59 -0400 2011:
>
>    
>> What makes it worse is that the wait shows up as a virtualxid one, which
>> doesn't pop up on many common samples of things to look for in
>> pg_locks.  It would be reasonable but also incorrect for admins to
>> assume a table one would be visible if running into the case alluded to
>> in the docs.  The serial way locks are obtained is unexpected too.
>>      
> Incidentally, this is one of the things that Jim Nasby wanted exposed
> somehow so that these problems are more easily diagnosed.  I dropped the
> ball on that one, but I'll be picking it up again at some point.
>    

I don't remember seeing anything about that before, but then again I 
wasn't really looking for it until now.  Did you have a basic idea what 
you wanted to do there?  I have enough of this work queued up now that I 
may end up poking at the code myself here soon.

The first thing I was considering was dropping some DEBUG2 level logging 
about the various phases of the reindex into there.  Given that 
concurrent index creation has all these single instance requirements and 
long runtimes, I find myself putting them into scripts that do all the 
work as background processes.  If I could set client_min_messages=debug2 
when starting the script, and see more info about the progress as it 
happens appear in the script's output log, that is something I think 
many admins would choose to do.

Not the ideal UI for exposing this info, certainly.  But a really easy 
one to add, and realistically I think it would be enough to resolve most 
of the transparency complaints here.  The biggest problem is not even 
documenting where people should be looking toward suspiciously, which I 
think the doc patch I submitted helps with.

-- 
Greg Smith   2ndQuadrant US    [email protected]   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-24 21:06   ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
@ 2011-05-25 17:57     ` Alvaro Herrera <[email protected]>
  2011-05-25 18:24       ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Simon Riggs <[email protected]>
  2011-05-25 21:04       ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  0 siblings, 2 replies; 10+ messages in thread

From: Alvaro Herrera @ 2011-05-25 17:57 UTC (permalink / raw)
  To: Greg Smith <[email protected]>; +Cc: pgsql-docs

Excerpts from Greg Smith's message of mar may 24 17:06:07 -0400 2011:
> On 05/24/2011 04:48 PM, Alvaro Herrera wrote:

> > Incidentally, this is one of the things that Jim Nasby wanted exposed
> > somehow so that these problems are more easily diagnosed.  I dropped the
> > ball on that one, but I'll be picking it up again at some point.
> 
> I don't remember seeing anything about that before, but then again I 
> wasn't really looking for it until now.  Did you have a basic idea what 
> you wanted to do there?  I have enough of this work queued up now that I 
> may end up poking at the code myself here soon.

Please see this thread:
http://archives.postgresql.org/message-id/[email protected]

> The first thing I was considering was dropping some DEBUG2 level logging 
> about the various phases of the reindex into there.

That would take this system more or less to where autovacuum was in 8.1
(well, not quite that bad because you can change it locally, but still).
It seems to me we can do something a bit better than that.

-- 
Álvaro Herrera <[email protected]>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-24 21:06   ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-25 17:57     ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
@ 2011-05-25 18:24       ` Simon Riggs <[email protected]>
  1 sibling, 0 replies; 10+ messages in thread

From: Simon Riggs @ 2011-05-25 18:24 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Greg Smith <[email protected]>; pgsql-docs

On Wed, May 25, 2011 at 6:57 PM, Alvaro Herrera
<[email protected]> wrote:
> Excerpts from Greg Smith's message of mar may 24 17:06:07 -0400 2011:
>> On 05/24/2011 04:48 PM, Alvaro Herrera wrote:
>
>> > Incidentally, this is one of the things that Jim Nasby wanted exposed
>> > somehow so that these problems are more easily diagnosed.  I dropped the
>> > ball on that one, but I'll be picking it up again at some point.
>>
>> I don't remember seeing anything about that before, but then again I
>> wasn't really looking for it until now.  Did you have a basic idea what
>> you wanted to do there?  I have enough of this work queued up now that I
>> may end up poking at the code myself here soon.
>
> Please see this thread:
> http://archives.postgresql.org/message-id/[email protected]
>
>> The first thing I was considering was dropping some DEBUG2 level logging
>> about the various phases of the reindex into there.
>
> That would take this system more or less to where autovacuum was in 8.1
> (well, not quite that bad because you can change it locally, but still).
> It seems to me we can do something a bit better than that.

It would be fairly straightforward to add some text to the end of the
ps message for long running DDL commands to say "phase 2" etc..
Or pg_stat_activity query text.

That way we would have some visibility into the progress of long
running DDL without too much effort.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-24 21:06   ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-25 17:57     ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
@ 2011-05-25 21:04       ` Greg Smith <[email protected]>
  2011-05-26 15:52         ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  1 sibling, 1 reply; 10+ messages in thread

From: Greg Smith @ 2011-05-25 21:04 UTC (permalink / raw)
  To: ; +Cc: pgsql-docs

On 05/25/2011 01:57 PM, Alvaro Herrera wrote:
> Please see this thread:
> http://archives.postgresql.org/message-id/[email protected]
>    

Ah, now I see--I was running a CommitFest that week, so of course I 
missed this message.  Those are all good ideas.

>> The first thing I was considering was dropping some DEBUG2 level logging
>> about the various phases of the reindex into there.
>>      
> That would take this system more or less to where autovacuum was in 8.1
> (well, not quite that bad because you can change it locally, but still).
> It seems to me we can do something a bit better than that.
>    

Sure.  Simon's command string idea might work better, and doing some 
extra lock decoration as you suggested in the above thread would be 
another level of improvement.  We should pick up redesign later on the 
main list.  You can at least count me in as someone who wants to see 
this improved now.

Back to the doc patch I submitted...is that a useful step toward making 
this issue visible enough to users for now to help?

-- 
Greg Smith   2ndQuadrant US    [email protected]   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-24 21:06   ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-25 17:57     ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-25 21:04       ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
@ 2011-05-26 15:52         ` Alvaro Herrera <[email protected]>
  2011-05-27 04:48           ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-06-13 16:11           ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Robert Haas <[email protected]>
  0 siblings, 2 replies; 10+ messages in thread

From: Alvaro Herrera @ 2011-05-26 15:52 UTC (permalink / raw)
  To: Greg Smith <[email protected]>; +Cc: pgsql-docs

Excerpts from Greg Smith's message of mié may 25 17:04:03 -0400 2011:

> Sure.  Simon's command string idea might work better, and doing some 
> extra lock decoration as you suggested in the above thread would be 
> another level of improvement.  We should pick up redesign later on the 
> main list.  You can at least count me in as someone who wants to see 
> this improved now.

Great

> Back to the doc patch I submitted...is that a useful step toward making 
> this issue visible enough to users for now to help?

Sure, why not?  I thought I could choose my bikeshed color while I was
here, how about

+    second and third transaction.  All active transactions at the time the
+    second table scan starts, not just ones that already involve the table,
+    have the potential to block the concurrent index creation until they
+    finish.  When checking for transactions that
+    could still use the original index, concurrent index creation advances
+    through potentially interfering older transactions one at a time,
+    obtaining shared locks on their virtual transaction identifiers to wait for
+    them to complete.


-- 
Álvaro Herrera <[email protected]>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-24 21:06   ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-25 17:57     ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-25 21:04       ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-26 15:52         ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
@ 2011-05-27 04:48           ` Greg Smith <[email protected]>
  1 sibling, 0 replies; 10+ messages in thread

From: Greg Smith @ 2011-05-27 04:48 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: pgsql-docs

On 05/26/2011 11:52 AM, Alvaro Herrera wrote:
> Sure, why not?  I thought I could choose my bikeshed color while I was
> here, how about
>    

That text is fine too.  I don't care exactly what color the bike shed 
is, I just want to get the bike out of the rain.

-- 
Greg Smith   2ndQuadrant US    [email protected]   Baltimore, MD





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-24 21:06   ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-25 17:57     ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-25 21:04       ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-26 15:52         ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
@ 2011-06-13 16:11           ` Robert Haas <[email protected]>
  2011-06-13 21:26             ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  1 sibling, 1 reply; 10+ messages in thread

From: Robert Haas @ 2011-06-13 16:11 UTC (permalink / raw)
  To: Alvaro Herrera <[email protected]>; +Cc: Greg Smith <[email protected]>; pgsql-docs

On Thu, May 26, 2011 at 11:52 AM, Alvaro Herrera
<[email protected]> wrote:
> Excerpts from Greg Smith's message of mié may 25 17:04:03 -0400 2011:
>
>> Sure.  Simon's command string idea might work better, and doing some
>> extra lock decoration as you suggested in the above thread would be
>> another level of improvement.  We should pick up redesign later on the
>> main list.  You can at least count me in as someone who wants to see
>> this improved now.
>
> Great
>
>> Back to the doc patch I submitted...is that a useful step toward making
>> this issue visible enough to users for now to help?
>
> Sure, why not?  I thought I could choose my bikeshed color while I was
> here, how about
>
> +    second and third transaction.  All active transactions at the time the
> +    second table scan starts, not just ones that already involve the table,
> +    have the potential to block the concurrent index creation until they
> +    finish.  When checking for transactions that
> +    could still use the original index, concurrent index creation advances
> +    through potentially interfering older transactions one at a time,
> +    obtaining shared locks on their virtual transaction identifiers to wait for
> +    them to complete.

Alvaro, did you commit this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Improve warnings around CREATE INDEX CONCURRENTLY
  2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-24 20:48 ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-24 21:06   ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-25 17:57     ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-05-25 21:04       ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
  2011-05-26 15:52         ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Alvaro Herrera <[email protected]>
  2011-06-13 16:11           ` Re: Improve warnings around CREATE INDEX CONCURRENTLY Robert Haas <[email protected]>
@ 2011-06-13 21:26             ` Alvaro Herrera <[email protected]>
  0 siblings, 0 replies; 10+ messages in thread

From: Alvaro Herrera @ 2011-06-13 21:26 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Greg Smith <[email protected]>; pgsql-docs

Excerpts from Robert Haas's message of lun jun 13 12:11:48 -0400 2011:
> On Thu, May 26, 2011 at 11:52 AM, Alvaro Herrera
> <[email protected]> wrote:

> >> Back to the doc patch I submitted...is that a useful step toward making
> >> this issue visible enough to users for now to help?

> Alvaro, did you commit this?

I just did (9.0 and beyond), sorry for sitting on it for so long.

-- 
Álvaro Herrera <[email protected]>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support




^ permalink  raw  reply  [nested|flat] 10+ messages in thread


end of thread, other threads:[~2011-06-13 21:26 UTC | newest]

Thread overview: 10+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2011-05-24 07:56 Improve warnings around CREATE INDEX CONCURRENTLY Greg Smith <[email protected]>
2011-05-24 20:48 ` Alvaro Herrera <[email protected]>
2011-05-24 21:06   ` Greg Smith <[email protected]>
2011-05-25 17:57     ` Alvaro Herrera <[email protected]>
2011-05-25 18:24       ` Simon Riggs <[email protected]>
2011-05-25 21:04       ` Greg Smith <[email protected]>
2011-05-26 15:52         ` Alvaro Herrera <[email protected]>
2011-05-27 04:48           ` Greg Smith <[email protected]>
2011-06-13 16:11           ` Robert Haas <[email protected]>
2011-06-13 21:26             ` Alvaro Herrera <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox