MIME-Version: 1.0
References: <CAOYmi+kThkM9Z87u=R_Wi7fCor2i+UZKAyq0UCyprzCwTQvqgA@mail.gmail.com>
 <20240610200411.byj6sv2vpgol6wcf@awork3.anarazel.de>
In-Reply-To: <20240610200411.byj6sv2vpgol6wcf@awork3.anarazel.de>
From: Jelte Fennema-Nio <postgres@jeltef.nl>
Date: Mon, 10 Jun 2024 23:40:39 +0200
Message-ID: <CAGECzQQbcWXE8H27mFaV70fUqGwAp_8WCOOocH_C6ifzQ-WLSw@mail.gmail.com>
Subject: Re: RFC: adding pytest as a supported test framework
To: Andres Freund <andres@anarazel.de>
Cc: Jacob Champion <jacob.champion@enterprisedb.com>, 
	PostgreSQL Hackers <pgsql-hackers@postgresql.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://www.postgresql.org/message-id/CAGECzQQbcWXE8H27mFaV70fUqGwAp_8WCOOocH_C6ifzQ-WLSw%40mail.gmail.com>
Precedence: bulk

On Mon, 10 Jun 2024 at 20:46, Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> For the v18 cycle, I would like to try to get pytest [1] in as a
> supported test driver, in addition to the current offerings.

Huge +1 from me (but I'm definitely biased here)

> Thoughts? Suggestions?

I think the most important thing is that we make it easy for people to
use this thing, and use it in a "correct" way. I have met very few
people that actually like writing tests, so I think it's very
important to make the barrier to do so as low as possible.

For the PgBouncer repo I created my own pytest-based test suite more
than 1.5 years ago now. I tried to make it as easy as possible to write
tests there, and it has worked out quite well imho. I don't think it
makes sense to copy all things I did there verbatim, because some of
it is quite specific to testing PgBouncer. But I do think there are
quite a few things that could probably be copied (or at least inspire
what you do). Some examples:

1. helpers to easily run shell commands, most importantly setting
check=True by default[1]
2. helper to get a free tcp port[2]
3. helper to check if the log contains a specific string[3]
4. automatically show PG logs on test failure[4]
5. helpers to easily run sql commands (psycopg interface isn't very
user friendly imho for the common case)[5]
6. startup/teardown cleanup logic[6]

[1]: https://github.com/pgbouncer/pgbouncer/blob/3f791020fb017c570fcd2db390600a353f1cba0c/test/utils.py#L83-L131
[2]: https://github.com/pgbouncer/pgbouncer/blob/3f791020fb017c570fcd2db390600a353f1cba0c/test/utils.py#L210-L233
[3]: https://github.com/pgbouncer/pgbouncer/blob/3f791020fb017c570fcd2db390600a353f1cba0c/test/utils.py#L1125-L1143
[4]: https://github.com/pgbouncer/pgbouncer/blob/3f791020fb017c570fcd2db390600a353f1cba0c/test/utils.py#L1075-L1103
[5]: https://github.com/pgbouncer/pgbouncer/blob/3f791020fb017c570fcd2db390600a353f1cba0c/test/utils.py#L326-L338
[6]: https://github.com/pgbouncer/pgbouncer/blob/3f791020fb017c570fcd2db390600a353f1cba0c/test/utils.py#L546-L642
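
To make the first three items in the list above a bit more concrete,
here is a minimal sketch of what such helpers could look like. This is
not the PgBouncer code from the links above; the names run,
get_free_tcp_port and log_contains are only illustrative, and the
details are assumptions.

```python
import socket
import subprocess


def run(command, *args, check=True, shell=None, **kwargs):
    """Run a command and fail loudly by default.

    Thin wrapper around subprocess.run that sets check=True unless the
    caller opts out, so a non-zero exit status raises
    subprocess.CalledProcessError instead of being silently ignored.
    A plain string is run through the shell, a list is run directly.
    """
    if shell is None:
        shell = isinstance(command, str)
    return subprocess.run(command, *args, check=check, shell=shell, **kwargs)


def get_free_tcp_port():
    """Ask the OS for a TCP port that is unused right now.

    Binding to port 0 lets the kernel pick a free port. The port is
    released again when the socket is closed, so another process could
    still grab it before the test server binds to it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def log_contains(log_path, needle):
    """Return True if any line of the log file contains `needle`."""
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        return any(needle in line for line in log_file)
```

The race mentioned in get_free_tcp_port is the reason a real test suite
probably wants some form of locking around port allocation when tests
run in parallel.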


On Mon, 10 Jun 2024 at 22:04, Andres Freund <andres@anarazel.de> wrote:
> > Problem 1 (rerun failing tests): One architectural roadblock to this
> > in our Test::More suite is that tests depend on setup that's done by
> > previous tests. pytest allows you to declare each test's setup
> > requirements via pytest fixtures, letting the test runner build up the
> > world exactly as it needs to be for a single isolated test. These
> > fixtures may be given a "scope" so that multiple tests may share the
> > same setup for performance or other reasons.
>
> OTOH, that's quite likely to increase overall test times very
> significantly. Yes, sometimes that can be avoided with careful use of various
> features, but often that's hard, and IME is rarely done rigiorously.

You definitely want to cache things like initdb and "pg_ctl start".
But that's fairly easy to do with some startup/teardown logic. For
PgBouncer I create a dedicated schema for each test that needs to
create objects and automatically drop that schema at the end of the
test[6] (including any temporary objects outside of schemas like
users/replication slots). You can even choose not to clean up certain
large schemas if they are shared across multiple tests.

[6]: https://github.com/pgbouncer/pgbouncer/blob/3f791020fb017c570fcd2db390600a353f1cba0c/test/utils.py#L546-L642
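
As a rough sketch of the schema-per-test idea (again not the actual
PgBouncer code; temporary_schema and the run_sql callable are
hypothetical names, and run_sql is assumed to execute one SQL statement
against the shared, already running server):

```python
import contextlib
import itertools

# Counter used to give every schema created in this process a unique name.
_schema_counter = itertools.count(1)


@contextlib.contextmanager
def temporary_schema(run_sql, prefix="test_schema"):
    """Create a schema for one test and drop it afterwards.

    `run_sql` is any callable that executes a single SQL statement. The
    schema is dropped even when the body of the `with` block raises, so
    a failing test does not leak objects into the next one.
    """
    name = f"{prefix}_{next(_schema_counter)}"
    run_sql(f'CREATE SCHEMA "{name}"')
    try:
        yield name
    finally:
        run_sql(f'DROP SCHEMA "{name}" CASCADE')
```

In pytest this would typically be wrapped in a function-scoped fixture
that yields the schema name, while the expensive initdb and "pg_ctl
start" steps would live in a fixture declared with
@pytest.fixture(scope="session") so they only run once per test run.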

> > Problem 2 (seeing what failed): pytest does this via assertion
> > introspection and very detailed failure reporting. If you haven't seen
> > this before, take a look at the pytest homepage [1]; there's an
> > example of a full log.
>
> That's not really different than what the perl tap test stuff allows. We
> indeed are bad at utilizing it, but I'm not sure that switching languages will
> change that.

It's not about allowing, it's about doing the thing that you want by
default. The following code

assert a == b

will show you the actual values of both a and b when the test fails,
instead of saying something like "false is not true". Of course you can
provide a message here too, like with Perl's ok function, but even
when you don't, the output is helpful.
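
For illustration, a complete test file in this style could look roughly
like the following. parse_port is a made-up helper that only exists to
give the assertions something to check; it is not from any existing
test suite.

```python
def parse_port(address):
    """Hypothetical helper: extract the port from a 'host:port' string."""
    _host, _sep, port = address.rpartition(":")
    return int(port)


def test_parse_port():
    a = parse_port("localhost:5432")
    b = 5432
    # No message needed: when run under pytest, a failure here reports
    # the values of both sides of the comparison.
    assert a == b


def test_parse_port_with_message():
    a = parse_port("[::1]:6432")
    b = 6432
    # An explicit message is still possible, comparable to the
    # description argument of ok() in Test::More.
    assert a == b, "port should be taken from the part after the last colon"
```

pytest collects the test_ functions automatically; no registration or
test plan is needed.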

> I think part of the problem is that the information about what precisely
> failed is often much harder to collect when testing multiple servers
> interacting than when doing localized unit tests.
>
> I think we ought to invest a bunch in improving that, I'd hope that a lot of
> that work would be largely independent of the language the tests are written
> in.

Well, as you already noted, no one who started doing dev stuff in the
last 10 years knows Perl or wants to learn it. So a large part of the
community tries to touch the current perl test suite as little as
possible. I personally haven't tried to improve anything about our
perl testing framework, even though I'm normally very much into
improving developer tooling.


> > Python's standard library has lots of power by itself, with very good
> > documentation. And virtualenvs and better package tooling have made it
> > much easier, IMO, to avoid the XKCD dependency tangle [4] of the
> > 2010s.
>
> Ugh, I think this is actually python's weakest area. There's about a dozen
> package managers and "python distributions", that are at best half compatible,
> and the documentation situation around this is *awful*.

I definitely agree this is Python's weakest area. But since venv is
part of the Python standard library, the situation is much better than
it used to be. I have the following short blurb in PgBouncer's test
README[7] and it has worked for all contributors so far:

# create a virtual environment (only needed once)
python3 -m venv env

# activate the environment. You will need to activate this environment in
# your shell every time you want to run the tests. (so it's needed once per
# shell).
source env/bin/activate

# Install the dependencies (only needed once, or whenever extra dependencies
# get added to requirements.txt)
pip install -r requirements.txt


[7]: https://github.com/pgbouncer/pgbouncer/blob/master/test/README.md

> I think somewhere between 1 and 4 a *substantial* amount of work would be
> required to provide a bunch of the infrastructure that Cluster.pm etc
> provide. Otherwise we'll end up with a lot of copy pasted code between tests.

Totally agreed that we should have a fairly decent base to build on.
I think we should port at least a few tests to show that the base
covers the most basic functionality.