Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sRbCG-00Gry8-Pn for pgsql-general@arkaria.postgresql.org; Wed, 10 Jul 2024 17:33:40 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sRbCE-007gNS-Af for pgsql-general@arkaria.postgresql.org; Wed, 10 Jul 2024 17:33:38 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sRbCD-007gMZ-Su for pgsql-general@lists.postgresql.org; Wed, 10 Jul 2024 17:33:37 +0000 Received: from mail-ed1-x52b.google.com ([2a00:1450:4864:20::52b]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sRbC9-001QV4-Iz for pgsql-general@lists.postgresql.org; Wed, 10 Jul 2024 17:33:35 +0000 Received: by mail-ed1-x52b.google.com with SMTP id 4fb4d7f45d1cf-57cc310bca3so966a12.0 for ; Wed, 10 Jul 2024 10:33:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gautherot.net; s=google; t=1720632812; x=1721237612; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=nMpiLRDfpY+n3e/PPQz4PUX+Y329uSeKogKVqB+bBWg=; b=FKLM5MU84k5tJZBQS2XRAoEOemOP5D4NAS/7NR+/3ReZS0VfVp4LELg+dDGptApGaw T1Av4jX+lS1SI7yJ1V+H1BV7ALIlrYbnjydwST4vROcqEhsIw1RRL2WRDlY5oASO3KzZ tRu7To3bjrf/AWWFQUKHl9MN02zwpezWUHtESO9qJYvUrz/0gWtanzCPQxyZANYHkWI9 /SzZBzr9TZJaZfXfaURrWLn1Pni6B/itZqDV50ZdzrA63lYX6MD7Wd7NsOdgvUx3GFMm AH+cB5SlCpjsJ4+7fsPtVgec0OReDXa/8MoETD4k5h61Bet23zz9TfR9kTZcKg4ciQnE 5hgA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720632812; x=1721237612; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=nMpiLRDfpY+n3e/PPQz4PUX+Y329uSeKogKVqB+bBWg=; b=gzO77VV+NwnTmMn3n+ydGKEBx24V1/Rv6Uo+w8j3t6DJLM7ph4oWH1TeEjPyySWNsq yAiWAAdNtJ8c6RGqAs/CaBPTH6qvzULEydewDn4LdNtHthSgaKhiO6kGucNKEGq5u/6S 1H+JFavjYYSCWrcu3SY5V1l2RzahL4I5Ydw3AB1PwBziO2E53WrnuSdovYnZqxFmhwGL Smm7lJWbsIrBQlX6y6WufBCjNOf3v5BMnPbgo6XUQVEkwoqydsZLWKWvqciCi7s9r6h/ 9eAU9E1rrXlH7QN1bA4GCM+ypR+1IVYL4OPm7LOIPwY9UGV4n4n6Cyl5tqmxW5P9k9nA hrNA== X-Gm-Message-State: AOJu0YzI5AKr+lKqzlYiTihWf0jUjRShT9vbTRsws+ec9nl/P4g+bIkb xv2yqp0uPjSygdacqytky5O5XMKUNjMCeFNjf6MzgqAT1IGRl/HNfGfJKV10j+OO/7vGWCirVZq XrtW2ftK2WDwDUepnvny+VnX71mq49lMk7OkZ X-Google-Smtp-Source: AGHT+IGvnhMvwnH+KIBvJnBK3YSAWqEwwsoPrEm3WuLjZqNzDkhgyo/OBMXv/QRLQcdh4Gv+yvvmwq+bKBGeRYJzp14= X-Received: by 2002:a05:6402:3cb:b0:578:6959:9d36 with SMTP id 4fb4d7f45d1cf-5982576a820mr33862a12.0.1720632811960; Wed, 10 Jul 2024 10:33:31 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Olivier Gautherot Date: Wed, 10 Jul 2024 19:33:19 +0200 Message-ID: Subject: Re: can stored procedures with computational sql queries improve API performance? To: Krishnakant Mane Cc: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="0000000000008bdbbf061ce80b24" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000008bdbbf061ce80b24 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Krishnkant, On Wed, Jul 10, 2024 at 2:58=E2=80=AFAM Krishnakant Mane wrote: > Hello. > > I have a straight forward question, but I am just trying to analyze the > specifics. > > So I have a set of queries depending on each other in a sequence to > compute some results for generating financial report. > > It involves summing up some amounts from tuns or of rows and also on > certain conditions it categorizes the amounts into types (aka Debit > Balance, Credit balance etc). > > There are at least 6 queries in this sequence and apart from 4 input > parameters. these queries never change. > > So will I get any performance benefit by having them in a stored > procedure rather than sending the queries from my Python based API? > > Regards. > Functions and procedures have a significant potential to improve performance but there are a few things to watch. Return of experience after having written a few hundreds for a project. The list below is by no mean not exhaustive. 1) Warning: Compatibility with other database engines If your model needs to run on other technologies (Oracle, MySQL, MS-SQL, etc.), the increase in maintenance efforts may become prohibitive 2) Warning: deadlocks Make sure to design the procedures in such a way that you don't mutually depend on other parallel invocations 3) WARNING: Don't overload the server Try to remain conservative in terms of computation in your server. Try to stick to selects, updates, deletes, joins, and simple arithmetics and strings manipulations. You can do a lot more but it may quickly affect the overall performance. The reasons why I would recommend to use them: A) Reduction of round trips Even though it may be a few ms at a time, it can add up and become significant. This is where you gain performance. B) Implicit transactions A function will not free locks until it returns. This means that if your queries depend on cells, or modify cells, the behavior will be coherent, reducing the risk of race conditions. If other invocations depend on the same data, the locks will take care of sequencing execution to maintain integrity. In other words, you can safely invoke functions in parallel and let the engine do the scheduling when necessary. C) API evolution As long as you have only 1 or 2 applications running against the database, it may not be an issue. If you have more and your model needs to evolve, you may get to a situation where updating them all at the same time can become a challenge, especially if you depend on external providers. By using procedures and functions, you can abstract the model and maintain a standard interface to the application. Note: * I DON'T RECOMMEND IT * but in some cases it can be handy to have the same function name with different sets of parameters (for instance to present a wrapper with default parameters, and other instances with a finer, more complete control). It can happen if you don't clean up timely older versions of the API when you upgrade your model - and it can become a nightmare. Last recommendation: activate the logs and review regularly the performance of your functions. You may identify occurrences that run very fast and others not so. It can help you identify potential conflicts or model optimizations. Hope it helps -- Olivier Gautherot --0000000000008bdbbf061ce80b24 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Krishnkant,

On Wed, Jul 10,= 2024 at 2:58=E2=80=AFAM Krishnakant Mane <kkproghub@gmail.com> wrote:
Hello.

I have a straight forward question, but I am just trying to analyze the specifics.

So I have a set of queries depending on each other in a sequence to
compute some results for generating financial report.

It involves summing up some amounts from tuns or of rows and also on
certain conditions it categorizes the amounts into types (aka Debit
Balance, Credit balance etc).

There are at least 6 queries in this sequence and apart from 4 input
parameters. these queries never change.

So will I get any performance benefit by having them in a stored
procedure rather than sending the queries from my Python based API?

Regards.

Functions and procedures have = a significant potential to improve performance but there are a few things t= o watch. Return of experience after having written a few hundreds for a pro= ject. The list below is by no mean not exhaustive.

1) Warning: Compatibility with other database engines
If your mo= del needs to run on other technologies (Oracle, MySQL, MS-SQL, etc.), the i= ncrease in maintenance efforts may become prohibitive

<= div>2) Warning: deadlocks
Make sure to design the procedures in s= uch a way that you don't mutually depend on other parallel invocations<= /div>

3) WARNING: Don't overload the server
Try to remain conservative in terms of computation in your server. Try to= stick to selects, updates, deletes, joins, and simple arithmetics and stri= ngs manipulations. You can do a lot more but it may quickly affect the over= all performance.

The reasons why I would recommend= to use them:

A) Reduction of round trips
Even though it may be a few ms at a time, it can add up and become signif= icant. This is where you gain performance.

B) Impl= icit transactions
A function will not free locks until it returns= . This means that if your queries depend on cells, or modify cells, the beh= avior will be coherent, reducing the risk of race conditions. If other invo= cations=C2=A0depend on the same data, the locks will take care of sequencin= g execution to maintain integrity. In other words, you can safely invoke fu= nctions in parallel and let the engine do the scheduling when necessary.

C) API evolution
As long as you have only = 1 or 2 applications running against the database, it may not be an issue. I= f you have more and your model needs to evolve, you may get to a situation = where updating them all at the same time can become a challenge, especially= if you depend on external providers. By using procedures and functions, yo= u can abstract the model and maintain a standard interface to the applicati= on.

Note: * I DON'T RECOMMEND IT * but in some= cases it can be handy to have the same function name with different sets o= f parameters (for instance to present a wrapper with default parameters, an= d other instances with a finer, more complete control). It can happen if yo= u don't clean up timely older versions of the API when you upgrade your= model - and it can become a nightmare.

Last recom= mendation: activate the logs and review regularly the performance of your f= unctions. You may identify occurrences=C2=A0that run very fast and others n= ot so. It can help you identify potential conflicts or model optimizations.=

Hope it helps
--
Olivier Gautherot

=C2=A0
--0000000000008bdbbf061ce80b24--