From: Jacob Champion
Date: Tue, 11 Jun 2024 07:28:23 -0700
Subject: Re: RFC: adding pytest as a supported test framework
To: Andres Freund
Cc: PostgreSQL Hackers
In-Reply-To: <20240610200411.byj6sv2vpgol6wcf@awork3.anarazel.de>

On Mon, Jun 10, 2024 at 1:04 PM Andres Freund wrote:
> Just for context for the rest of the email: I think we desperately need to move
> off perl for tests. The infrastructure around our testing is basically
> unmaintained and just about nobody that started doing dev stuff in the last 10
> years learned perl.

Okay.
Personally, I'm going to try to stay out of discussions around
subtracting Perl and focus on adding Python, for a bunch of different
reasons:

- Tests aren't cheap, but in my experience, the maintenance-cost math
for tests is a lot different than the math for implementations.
- I don't personally care for Perl, but having tests in any form is
usually better than not having them.
- Trying to convince people to get rid of X while adding Y is a good
way to make sure Y never happens.

> On 2024-06-10 11:46:00 -0700, Jacob Champion wrote:
> > 4. It'd be great to split apart client-side tests from server-side
> > tests. Driving Postgres via psql all the time is fine for acceptance
> > testing, but it becomes a big problem when you need to test how
> > clients talk to servers with incompatible feature sets, or how a peer
> > behaves when talking to something buggy.
>
> That seems orthogonal to using pytest vs something else?

Yes, I think that's fair. It's going to be hard not to talk about
"things that pytest+Python don't give us directly but are much easier
to build" in all of this (and I tried to call that out in the next
section, maybe belatedly). I think I'm going to have to convince both
a group of people who want to ask "why pytest in particular?" and a
group of people who ask "why isn't what we have good enough?"

> > == Why pytest? ==
> >
> > From the small and biased sample at the unconference session, it looks
> > like a number of people have independently settled on pytest in their
> > own projects. In my opinion, pytest occupies a nice space where it
> > solves some of the above problems for us, and it gives us plenty of
> > tools to solve the other problems without too much pain.
>
> We might be able to alleviate that by simply abstracting it away, but I found
> pytest's testrunner pretty painful.
> Oodles of options that are not very well
> documented and that often don't work because they are very specific to some
> situations, without that being explained.

Hm. There are a bunch of them, but I've never needed to go through the
oodles of options. Anything in particular that caused problems?

> > Problem 1 (rerun failing tests): One architectural roadblock to this
> > in our Test::More suite is that tests depend on setup that's done by
> > previous tests. pytest allows you to declare each test's setup
> > requirements via pytest fixtures, letting the test runner build up the
> > world exactly as it needs to be for a single isolated test. These
> > fixtures may be given a "scope" so that multiple tests may share the
> > same setup for performance or other reasons.
>
> OTOH, that's quite likely to increase overall test times very
> significantly. Yes, sometimes that can be avoided with careful use of various
> features, but often that's hard, and IME is rarely done rigorously.

Well, scopes are pretty front and center when you start building
pytest fixtures, and the complicated longer setups will hopefully
converge correctly early on and be reused everywhere else. I imagine
no one wants to build cluster setup from scratch.

On a slight tangent, is this not a problem today? I mean... part of my
personal long-term goal is increasing test hygiene, which is going
to take some shifts in practice. As long as review keeps the quality
of the tests fairly high, I see the inevitable "our tests take too
long" problem as a good one. That's true no matter what framework we
use, unless the framework is so bad that no one uses it and the
runtime is trivial. If we're worried that people will immediately
start exploding the runtime and no one will notice during review,
maybe we can have some infrastructure flag how much a patch increased
it?

> > Problem 2 (seeing what failed): pytest does this via assertion
> > introspection and very detailed failure reporting.
> > If you haven't seen
> > this before, take a look at the pytest homepage [1]; there's an
> > example of a full log.
>
> That's not really different than what the perl tap test stuff allows. We
> indeed are bad at utilizing it, but I'm not sure that switching languages will
> change that.

Jelte already touched on this, but I wanted to hammer on the point: If
no one, not even the developers who chose and like Perl, is using
Test::More in a way that's maintainable, I would prefer to use a
framework that does maintainable things by default so that you have to
try really hard to screw it up. It is possible to screw up
`assert actual == expected`, but it takes more work than doing it the
right way.

> I think part of the problem is that the information about what precisely
> failed is often much harder to collect when testing multiple servers
> interacting than when doing localized unit tests.
>
> I think we ought to invest a bunch in improving that, I'd hope that a lot of
> that work would be largely independent of the language the tests are written
> in.

We do a lot more acceptance testing than internal testing, which came
up as a major complaint from me and others during the unconference.
One of the reasons people avoid writing internal tests in Perl is
because it's very painful to find a rhythm with Test::More. From
experience test-driving the OAuth work, I'm *very* happy with the
development cycle that pytest gave me.

Other languages _could_ do that, sure. It's a simple matter of programming...

> Ugh, I think this is actually python's weakest area. There's about a dozen
> package managers and "python distributions", that are at best half compatible,
> and the documentation situation around this is *awful*.

So... don't support the half-compatible stuff? I thought this
conversation was still going on with Windows Perl (ActiveState?
Strawberry?) but everyone just seems to pick what works for them and
move on to better things to do.

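As a rough illustration of relying only on what ships with CPython, the
standard-library venv module can build an isolated environment for test
dependencies. This is a minimal sketch, not a proposed build integration;
the helper names create_test_env and env_python are hypothetical:

```python
import sys
import venv
from pathlib import Path


def create_test_env(env_dir, with_pip=True):
    """Create an isolated virtual environment using only the standard library.

    Hypothetical helper for illustration; the name and layout are assumptions.
    """
    env_dir = Path(env_dir)
    # venv.create() is the library equivalent of `python -m venv <dir>`.
    # with_pip=True bootstraps pip into the environment via ensurepip.
    venv.create(env_dir, with_pip=with_pip, clear=True)
    return env_dir


def env_python(env_dir):
    """Return the path of the interpreter inside the environment."""
    env_dir = Path(env_dir)
    # Windows places executables under Scripts\, other platforms under bin/.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Third-party packages would then be installed by running that interpreter
with `-m pip install`.
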
Modern CPython includes pip and venv. Done. If someone comes to us
with some horrible Anaconda setup wanting to know why their duct tape
doesn't work, can't we just tell them no?

> > When it comes to third-party packages, which I think we're
> > probably going to want in moderation, we would still need to discuss
> > supply chain safety. Python is not as mature here as, say, Go.
>
> What external dependencies are you imagining?

The OAuth pytest suite makes extensive use of
- psycopg, to easily drive libpq;
- construct, for on-the-wire packet representations and manipulation; and
- pyca/cryptography, for easy generation of certificates and manual
crypto testing.

I'd imagine each would need considerable discussion, if there is
interest in doing the same things that I do with them.

> I think somewhere between 1 and 4 a *substantial* amount of work would be
> required to provide a bunch of the infrastructure that Cluster.pm etc
> provide. Otherwise we'll end up with a lot of copy pasted code between tests.

Possibly, yes. I think it depends on what you want to test first, and
there's a green-field aspect of hope/anxiety/ennui, too. Are you
trying to port the acceptance-test framework that we already have, or
are you trying to build a framework that can handle the things we
can't currently test? Will it be easier to refactor duplication into
shared fixtures when the language doesn't encourage an infinite number
of ways to do things? Or will we have to keep on top of it to avoid
pain?

--Jacob
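
For readers who haven't used pytest, the fixture scoping discussed under
Problem 1 and the plain-assert style discussed under Problem 2 look roughly
like the following. This is a minimal sketch under stated assumptions:
FakeCluster, cluster, and connection are hypothetical stand-ins invented for
illustration, not an existing PostgreSQL test API, and a real fixture would
initialize and start a server rather than flip a flag.

```python
import pytest


class FakeCluster:
    """Hypothetical stand-in for an expensive-to-create server instance."""

    def __init__(self):
        self.running = False
        self.open_connections = 0

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


@pytest.fixture(scope="session")
def cluster():
    # Session scope: built once and shared by every test that requests it.
    c = FakeCluster()
    c.start()
    yield c
    # Teardown runs once the session no longer needs the fixture.
    c.stop()


@pytest.fixture
def connection(cluster):
    # Default (function) scope: each test gets its own "connection", so a
    # single test can be rerun in isolation without relying on earlier tests.
    cluster.open_connections += 1
    yield {"cluster": cluster, "id": cluster.open_connections}
    cluster.open_connections -= 1


def test_cluster_is_running(connection):
    assert connection["cluster"].running


def test_greeting_matches():
    actual = "hello, " + "world"
    expected = "hello, world"
    # On failure, pytest's assertion rewriting reports both values.
    assert actual == expected
```

Running `pytest -k test_cluster_is_running` builds only the fixtures that
test requests. On the runtime question, pytest's `--durations=N` option
reports the slowest setup and test phases, which could be one starting point
for flagging regressions.
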