pgjdbc/pgjdbc GitHub issues and pull requests (mirror)  
help / color / mirror / Atom feed
From: ringerc (@ringerc) <[email protected]>
To: pgjdbc/pgjdbc <[email protected]>
Subject: [pgjdbc/pgjdbc] issue #194: PgJDBC can experience client/server deadlocks during batch execution
Date: Thu, 02 Oct 2014 13:03:53 +0000
Message-ID: <[email protected]> (raw)

PgJDBC can encounter client/server deadlocks during batch execution, where the server is waiting for the client and the client is waiting for the server. Neither can progress and one must be terminated.

The client cannot continue until the server consumes some input from the server's receive buffer (the client's send buffer).

The server cannot continue until the client consumes some input from the client's receive buffer, (the server's send buffer).

Each is blocked trying to send to the other. Neither can receive until the other sends.

PgJDBC tries to prevent this case from arising with some heuristics in its batch facilities where it attempts to limit the number of queries that may be queued; see `org.postgresql.core.v3.QueryExecutorImpl`, and the comments around `MAX_BUFFERED_QUERIES`. The coarse heuristic of assuming 250 bytes per server reply and a 64kb server send buffer can be defeated by large numbers of asynchronous messages like `NOTIFY`s or non-fatal `RAISE` messages. It was introduced in commit c1a939f6, with a followup commit 985c0472 restricting batching to queries that don't return generated keys.

The main reason that deadlocks are rare is that the 64k buffer size is now unrealistically small; on my Linux system, default buffers are 200kb for both send and receive, giving us 400kb of buffer space to work with.

I've produced a very artificial test case showing that a deadlock can still occur; see [`TestDeadlock.java`](https://github.com/ringerc/pgjdbc-batchtest/blob/master/src/com/secondquadrant/jdbcbatchtest/TestDea...) in https://github.com/ringerc/pgjdbc-batchtest . While that test is very heavy handed in producing the deadlock, real world cases can and do arise. 

The client's stack looks like:

```
SocketOutputStream.socketWrite(byte[], int, int) line: 113  
SocketOutputStream.write(byte[], int, int) line: 159    
BufferedOutputStream.write(byte[], int, int) line: 122  
BufferedOutputStream(FilterOutputStream).write(byte[]) line: 97 
PGStream.Send(byte[]) line: 224 
SimpleParameterList.writeV3Value(int, PGStream) line: 344   
QueryExecutorImpl.sendBind(SimpleQuery, SimpleParameterList, Portal, boolean) line: 1387    
QueryExecutorImpl.sendOneQuery(SimpleQuery, SimpleParameterList, int, int, int) line: 1622  
QueryExecutorImpl.sendQuery(V3Query, V3ParameterList, int, int, int, QueryExecutorImpl$ErrorTrackingResultHandler) line: 1137   
QueryExecutorImpl.execute(Query[], ParameterList[], ResultHandler, int, int, int) line: 396 
Jdbc4PreparedStatement(AbstractJdbc2Statement).executeBatch() line: 2897
```

The server's stack looks something like:

```
#0  0x0000003bc68f5dad in __libc_send (fd=10, buf=0x20eee90, n=8192, flags=-1, flags@entry=0) at ../sysdeps/unix/sysv/linux/x86_64/send.c:27
#1  0x00000000005a6c47 in secure_write (port=<optimized out>, ptr=ptr@entry=0x20eee90, len=len@entry=8192) at be-secure.c:396
#2  0x00000000005b008b in internal_flush () at pqcomm.c:1252
#3  0x00000000005b01dd in internal_putbytes (
    s=0x2b452a1 "abcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefgh"..., s@entry=0x2ae52c0 "SWARNING", len=655444) at pqcomm.c:1198
#4  0x00000000005b11f0 in pq_putmessage (msgtype=78 'N', s=0x2ae52c0 "SWARNING", len=<optimized out>) at pqcomm.c:1395
#5  0x00000000005b1e64 in pq_endmessage (buf=buf@entry=0x7fff12cf9300) at pqformat.c:346
#6  0x0000000000721ff7 in send_message_to_frontend (edata=<optimized out>) at elog.c:2887
#7  EmitErrorReport () at elog.c:1429
#8  0x000000000072190d in errfinish (dummy=<optimized out>) at elog.c:488
#9  0x00007fc526f1f47b in exec_stmt_raise (estate=estate@entry=0x7fff12cf9780, stmt=stmt@entry=0x21eecb0) at pl_exec.c:3027
#10 0x00007fc526f20e55 in exec_stmt (stmt=0x21eecb0, estate=0x7fff12cf9780) at pl_exec.c:1442
#11 exec_stmts (estate=0x7fff12cf9780, stmts=<optimized out>) at pl_exec.c:1345
#12 0x00007fc526f22d1b in exec_stmt_block (estate=estate@entry=0x7fff12cf9780, block=0x21ef758) at pl_exec.c:1283
#13 0x00007fc526f237c3 in plpgsql_exec_trigger (func=func@entry=0x21e9310, trigdata=0x7fff12cf9f30) at pl_exec.c:689
#14 0x00007fc526f18b3a in plpgsql_call_handler (fcinfo=0x7fff12cf9ac0) at pl_handler.c:120
#15 0x00000000005672c7 in ExecCallTriggerFunc (trigdata=trigdata@entry=0x7fff12cf9f30, tgindx=tgindx@entry=0, finfo=finfo@entry=0x21eb240, instr=instr@entry=0x0, 
    per_tuple_context=per_tuple_context@entry=0x21e90f0) at trigger.c:1871
#16 0x000000000056857a in AfterTriggerExecute (trigdesc=0x21eb320, trigdesc=0x21eb320, per_tuple_context=0x21e90f0, instr=0x0, finfo=0x21eb240, rel=0x7fc527133858, 
    event=0x21e7100) at trigger.c:3454
#17 afterTriggerInvokeEvents (events=events@entry=0x21c1420, firing_id=4, estate=estate@entry=0x21eb100, delete_ok=delete_ok@entry=1 '\001') at trigger.c:3633
#18 0x000000000056c8be in AfterTriggerEndQuery (estate=estate@entry=0x21eb100) at trigger.c:3823
#19 0x000000000057f3dc in standard_ExecutorFinish (queryDesc=0x21d0998) at execMain.c:379
#20 0x0000000000658fe8 in ProcessQuery (plan=<optimized out>, sourceText=0x21d08a0 "INSERT INTO deadlock_demo1(id, largetext) VALUES ($1,$2)", params=0x21d0908, 
    dest=0xb1a160 <donothingDR>, completionTag=0x7fff12cfa420 "INSERT 0 1") at pquery.c:225
#21 0x000000000065918c in PortalRunMulti (portal=portal@entry=0x21b1060, isTopLevel=isTopLevel@entry=1 '\001', dest=0xb1a160 <donothingDR>, dest@entry=0x217f1a0, 
    altdest=0xb1a160 <donothingDR>, altdest@entry=0x217f1a0, completionTag=completionTag@entry=0x7fff12cfa420 "INSERT 0 1") at pquery.c:1279
#22 0x0000000000659c7c in PortalRun (portal=0x21b1060, count=1, isTopLevel=<optimized out>, dest=0x217f1a0, altdest=0x217f1a0, completionTag=0x7fff12cfa420 "INSERT 0 1")
    at pquery.c:816
#23 0x00000000006573bc in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:1931
#24 0x000000000045d9ad in BackendRun (port=0x20ed950) at postmaster.c:4011
#25 BackendStartup (port=0x20ed950) at postmaster.c:3685
#26 ServerLoop () at postmaster.c:1586
#27 0x00000000006161f3 in PostmasterMain (argc=argc@entry=5, argv=argv@entry=0x20c72f0) at postmaster.c:1253
#28 0x000000000045e558 in main (argc=5, argv=0x20c72f0) at main.c:226
```


view thread (36+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: github://pgjdbc/pgjdbc
  Cc: [email protected], [email protected]
  Subject: Re: [pgjdbc/pgjdbc] issue #194: PgJDBC can experience client/server deadlocks during batch execution
  In-Reply-To: <<[email protected]>>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox